Tianyu Guo 郭天宇
E-mail: guoty9[at]mail2.sysu.edu.cn
About
I’m a third-year Ph.D. student in Computer Science and Technology at Sun Yat-Sen University, co-advised by Assoc. Prof. Xianwei Zhang and Prof. Nong Xiao. I completed my bachelor’s degree at Xidian University. My research interests lie in GPU architecture, MLSys, and AI Infra. I’m also passionate about the open source community (check out my projects/PRs). You can also have a look at my RESUME for more details.
Publications
[arXiv] [Github]
gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling
Tianyu Guo, Xianwei Zhang, Jiangsu Du, Zhiguang Chen, Nong Xiao, Yutong Lu
[Euro-Par’25] [Github]
EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse
Tianyu Guo, Hande Dong, Yichong Leng, Feng Liu, Cheater Lin, Nong Xiao and Xianwei Zhang, The 31st International European Conference on Parallel and Distributed Processing, Dresden, Germany, August 2025.
[ASP-DAC’25] [DOI]
Mpache: Interaction Aware Multi-level Cache Bypassing on GPUs
Mengyue Xi, Tianyu Guo, Xuanteng Huang, Zejia Lin, Xianwei Zhang, The 30th Asia and South Pacific Design Automation Conference, Tokyo Odaiba Miraikan, Japan, January 2025.
[DAC’24] [DOI] [Slide]
SMILE: LLC-based Shared Memory Expansion to Improve GPU Thread Level Parallelism
Tianyu Guo, Xuanteng Huang, Kan Wu, Xianwei Zhang and Nong Xiao, The 61st ACM/IEEE Design Automation Conference, San Francisco, CA, United States, June 2024.
Experience
- Research intern at Tencent
- Participated in the 1st ACTIC and won the 2nd prize in the A3 track
- [Preliminary]/[Final] Presentations and [Technical Report]
- Operator implementation and performance optimization with vector instruction set
- Teaching Assistant for “SYSU-DCS3013: Computer Architecture”
- Released [SYSU-ARCH LAB], which focuses on using and extending simulators
Projects
Presentations & HW & Dissertation
Weekly Paper Sharing SC23 “Frontier: Exploring Exascale”
Weekly Paper Sharing MLSYS23 “AUTOSCRATCH: ML-OPTIMIZED CACHE MANAGEMENT FOR INFERENCE-ORIENTED GPUS”
Weekly Paper Sharing HPCA23 “DIMM-Link: Enabling Efficient Inter-DIMM Communication for Near-Memory Processing”
AI final Homework “A Convolutional Neural Network Framework support on CPU and GPU”
Bachelor’s dissertation “General Computing optimization for GPU based on Cache management”