Projects

Ongoing

National Key R&D Program of China (Youth Scientist Program): Foundations of Machine Learning for Understanding Large Models (2025YFA1018600)

This project focuses on the theoretical foundations of large-scale machine learning models. It is led by Jun Shu at Xi’an Jiaotong University and jointly conducted with Hongxin Wei and Zeng Li at SUSTech, and Zenan Ling at HUST.

NSFC-12571561 (General Program): Compression of Large-scale Transformer-based Models with Random Matrix Methods: From Scaling Law to Dynamic Adaptive Model Compression

Previous

NSFC-62206101 (Youth Program): Fundamental Limits of Pruning Deep Neural Network Models via Random Matrix Methods

This project (2023.01–2025.12), for which I serve as PI, investigates the fundamental theoretical limits of pruning and quantization in deep neural networks. Its main objective is to develop a quantitative theoretical framework, grounded in random matrix theory, high-dimensional statistics, and optimization theory, that rigorously characterizes the trade-off between model performance and computational complexity in modern deep neural architectures.

Guangdong Key Lab of Mathematical Foundations for Artificial Intelligence Open Fund OFA00003: Generalization Theory for Transformer-based Models via Random Matrix Methods

This project (2024–2026), for which I serve as PI (with Prof. Jeff Yao as co-PI), aims to advance the theoretical understanding of generalization in Transformer-based models. Leveraging tools from random matrix theory and high-dimensional probability, it seeks to develop a principled framework that characterizes the memorization behavior, generalization behavior, and scaling laws of modern attention-based architectures.

CCF-Hikvision Open Fund 20210008: Random Matrix Theory and Information Bottleneck for Neural Network Compression

This project is jointly led by Prof. Kai Wan and myself (as PI), and investigates efficient compression schemes for large-scale neural network models with strong theoretical guarantees.