Tianyu Guo's homepage
Posted 2026-06-13 | Updated 2026-06-13 | Technology | 22 minutes read (about 3241 words)

gLLM Introduces Encoder Disaggregation: Another Step Up in Multimodal Inference Throughput

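gLLM, an open-source distributed LLM inference system, adds an "encoder disaggregation" capability that deploys the vision encoder separately from the language model, significantly improving the throughput and latency of multimodal serving under the same GPU budget.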

Tianyu Guo

GPU architecture, MLSys, AI Infra

Guangzhou, China

Posts

17

Categories

3

Tags

19

Links

  • Google Scholar: scholar.google.com
  • DBLP: dblp.org
  • ORCID: orcid.org
  • GitHub: github.com
  • Zhihu (知乎): www.zhihu.com
  • LeetCode: leetcode.cn
  • CSDN: blog.csdn.net

Recents

2026-06-13

Individual Thinking in the Age of AI

essay

2026-06-13

gLLM Introduces Encoder Disaggregation: Another Step Up in Multimodal Inference Throughput

Technology

2024-04-22

Some thoughts on LLMs

Technology

2023-12-14

ebooks and tutorials

Technology

2023-12-04

Miscellaneous tips

Technology

Archives

  • 2026 (2)
  • 2024 (1)
  • 2023 (14)

Categories

  • Technology (14)
  • essay (2)
  • log (1)

Tags

AI (1)
GEMM (1)
GPU (1)
Hexo (1)
IR (1)
Icarus (1)
Inference (1)
LLM (2)
LLM Serving (1)
NVIDIA (1)
VLM (1)
cutlass (7)
ebook (1)
gLLM (1)
github (1)
tips (1)
tutorial (1)
一生一芯 (One Student One Chip) (1)
hail (1)
© 2026 Tianyu Guo
