Tianyi Zhang

PhD Candidate in Computer Science

Rice University

Also known as Tony  |  Native name: 张天翼

Email: [first_initial][last_initial]21 @ rice.edu

About Me

I am a PhD candidate in Computer Science at Rice University, advised by Prof. Anshumali Shrivastava. I work on making Large Language Models (LLMs) and foundation models more efficient, accurate, and accessible. My research has been published at top-tier conferences such as NeurIPS, ICML, ICLR, and EMNLP. My open-source contributions have been adopted by a growing community of users.

I pioneered a lossless LLM compression technique that reduces model size by 30% while preserving bit-for-bit identical outputs and enabling efficient GPU inference. This work reached #1 on Hacker News, and my models on Hugging Face receive thousands of monthly downloads.

Before Rice, I earned my undergraduate degree with distinction in Computer Science from the University of Waterloo.

Selected Publications

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float

Tianyi Zhang, Yang Sui, Shaochen Zhong, Vipin Chaudhary, Xia Hu, Anshumali Shrivastava

Preprint

Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation

Tianyi Zhang*, Junda Su*, Aditya Desai, Oscar Wu, Zhaozhuo Xu, Anshumali Shrivastava

ICML 2025

LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid

Tianyi Zhang, Anshumali Shrivastava

ICLR 2025

KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization

Tianyi Zhang, Jonah Yi, Zhaozhuo Xu, Anshumali Shrivastava

NeurIPS 2024

NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention

Tianyi Zhang, Jonah Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava

NeurIPS 2024

Education

Ph.D. in Computer Science (2021 - Expected 2025)

Rice University

Advisor: Prof. Anshumali Shrivastava

B.S. in Computer Science (2016 - 2021)

University of Waterloo