publications

2026

ICLR

TileLang: Bridge Programmability and Performance in Modern Neural Kernels

Lei Wang, Yu Cheng, Yining Shi, and 9 more authors

The Fourteenth International Conference on Learning Representations (ICLR 2026), 2026

Oral Presentation

ICLR

Sparse Attention Adaptation for Long Reasoning

Yizhao Gao, Shuming Guo, Shijie Cao, and 12 more authors

The Fourteenth International Conference on Learning Representations (ICLR 2026), 2026

PPoPP

MetaAttention: A Unified and Performant Attention Framework across Hardware Backends

Feiyang Chen, Yu Cheng, Lei Wang, and 8 more authors

In Proceedings of the 31st ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2026

arXiv

MiMo-V2-Flash Technical Report

Bangjun Xiao, Bingquan Xia, Bo Yang, and 122 more authors

arXiv preprint arXiv:2601.02780, 2026

2025

OSDI

PipeThreader: Software-Defined Pipelining for Efficient DNN Execution

Yu Cheng, Lei Wang, Yining Shi, and 9 more authors

The 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’25), 2025

arXiv

HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing

Yizhao Gao, Jianyu Wei, Qihao Zhang, and 11 more authors

arXiv preprint arXiv:2602.03560, 2025

EuroSys

NeuStream: Bridging Deep Learning Serving and Stream Processing

Haochen Yuan, Yuanqing Wang, Wenhao Xie, and 5 more authors

The 20th European Conference on Computer Systems (EuroSys ’25), 2025

2024

SOSP

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor

Yiqi Liu, Yuqi Xue, Yu Cheng, and 4 more authors

30th ACM Symposium on Operating Systems Principles (SOSP 2024), 2024

2023

SIGMOD

GoldMiner: Elastic Scaling of Training Data Pre-Processing Pipelines for Deep Learning

Hanyu Zhao, Zhi Yang, Yu Cheng, and 8 more authors

Proceedings of the ACM on Management of Data, 2023

2022

ICDE

Zoomer: Boosting retrieval on web-scale graphs by regions of interest

Yuezihan Jiang, Yu Cheng, Hanyu Zhao, and 6 more authors

In 2022 IEEE 38th International Conference on Data Engineering (ICDE), 2022