About me

I am an Applied Scientist at Amazon Web Services. I currently work on AWS Neuron, developing optimizations that make state-of-the-art LLMs execute efficiently on AWS Trainium chips.

I received my Ph.D. from New York University, where I was advised by Professor Jinyang Li. My research focused on improving the execution performance of deep learning serving systems, particularly for large language models.

Before joining NYU, I received my B.S. in Computer Science from Nanjing University, where I was advised by Professor Wenzhong Li and worked on improving the performance of wireless networks.

Here is my resume.

Education

  • Ph.D. in Computer Science
    New York University, New York, NY, United States

  • B.S. in Computer Science
    Nanjing University, Nanjing, Jiangsu, China

Selected Publications

  • SAS: Sparse Attention Synthesizer for Efficient Language Model Inference
    Yuan Zhou, Shaojie Xiang, Lingfan Yu, Zhenyu Song, Charith Mendis, Yida Wang
    EuroSys’26 (To Appear)
  • Stateful Large Language Model Serving with Pensieve [paper]
    Lingfan Yu, Jinkun Lin, Jinyang Li
    EuroSys’25
    Rotterdam, Netherlands, March 2025
  • Scalable Graph Neural Networks for Heterogeneous Graphs [paper]
    Lingfan Yu, Jiajun Shen, Jinyang Li, Adam Lerer
    arXiv preprint arXiv:2011.09679
  • Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs [paper]
    Minjie Wang, Lingfan Yu, Da Zheng, Quan Gan, Yu Gai, Zihao Ye, Mufei Li, Jinjing Zhou, Qi Huang, Chao Ma, Ziyue Huang, Qipeng Guo, Hao Zhang, Haibin Lin, Junbo Zhao, Jinyang Li, Alexander Smola and Zheng Zhang
    ICLR’19 Workshop on Representation Learning on Graphs and Manifolds
    New Orleans, USA, May 2019
  • Low Latency RNN Inference with Cellular Batching [paper]
    Pin Gao, Lingfan Yu (co-first author), Yongwei Wu, Jinyang Li
    EuroSys’18
    Porto, Portugal, April 2018
  • The Efficient Server Audit Problem, Deduplicated Re-execution, and the Web [paper]
    Cheng Tan, Lingfan Yu, Joshua B. Leners, Michael Walfish
    SOSP’17 (Best Paper Award)
    Shanghai, China, October 2017

Work Experience

  • Amazon Web Services, Inc.
    2024.9 - Present
    Applied Scientist

  • Facebook, Inc. Facebook AI Research (FAIR)
    2020.5 - 2020.9
    Research Intern

  • Amazon Web Services, Inc.
    2019.6 - 2019.8
    Applied Scientist Intern