Research
Conference 2025, Duke University, August, 2025
We provide the first rigorous theoretical analysis of why Transformers underperform on time series forecasting in-context. Under AR(p) data, we prove that linear self-attention cannot beat classical linear predictors in expected MSE, show asymptotic recovery of the optimal linear predictor as context grows, and demonstrate exponential collapse under Chain-of-Thought inference.
Tags: Time Series Forecasting In-Context Learning Transformers Theory
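For context on the linear baselines in this analysis, below is a minimal NumPy sketch (not the paper's code) that fits the classical order-p linear predictor by ordinary least squares on a simulated AR(2) context window; the model order, noise level, and series length are illustrative.

```python
import numpy as np

def fit_ar_ols(x, p):
    """Fit the order-p linear predictor by ordinary least squares:
    x[t] ~ a[0]*x[t-1] + ... + a[p-1]*x[t-p]."""
    # Lagged design matrix: row for time t holds (x[t-1], ..., x[t-p]).
    X = np.column_stack([x[p - k - 1 : len(x) - k - 1] for k in range(p)])
    y = x[p:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

def predict_next(x, a):
    """One-step-ahead forecast from the most recent len(a) observations."""
    p = len(a)
    return a @ x[-1 : -p - 1 : -1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulate a stationary AR(2) context window (coefficients are illustrative).
    a_true, x = np.array([0.6, -0.2]), np.zeros(500)
    for t in range(2, 500):
        x[t] = a_true[0] * x[t - 1] + a_true[1] * x[t - 2] + 0.1 * rng.standard_normal()
    a_hat = fit_ar_ols(x, p=2)
    print("estimated coefficients:", a_hat, "forecast:", predict_next(x, a_hat))
```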
Conference 2025, Duke University, August, 2025
A training-free acceleration method for generative models, combining a bounded second-order finite-difference extrapolator with a parity-aware reuse schedule. ZEUS achieves near-linear speedups up to 3.64× while maintaining high fidelity, working seamlessly across modalities, models, and schedulers.
Tags: Generative Model Diffusion Model Acceleration
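To make the extrapolation idea concrete, here is a toy sketch of second-order finite-difference extrapolation with a bounded step and a parity-based reuse schedule; the function names, clamping rule, and update loop are hypothetical stand-ins, not the ZEUS implementation.

```python
import numpy as np

def extrapolate_step(f_prev2, f_prev1, f_curr, max_ratio=2.0):
    """Second-order (quadratic) finite-difference extrapolation of the next
    model output from three cached ones, with the extrapolated increment
    clamped relative to the last observed increment (a hypothetical bound)."""
    f_next = 3.0 * f_curr - 3.0 * f_prev1 + f_prev2
    step = f_next - f_curr
    last = np.linalg.norm(f_curr - f_prev1) + 1e-12
    scale = min(1.0, max_ratio * last / (np.linalg.norm(step) + 1e-12))
    return f_curr + scale * step

def sample_with_reuse(model, x, num_steps, reuse_parity=1):
    """Toy sampling loop: on steps whose parity matches `reuse_parity`, skip the
    network call and use an extrapolated output instead of evaluating `model`."""
    cache = []
    for t in range(num_steps):
        if t % 2 == reuse_parity and len(cache) >= 3:
            f = extrapolate_step(*cache[-3:])   # reuse cached evaluations
        else:
            f = model(x, t)                     # real denoiser call (placeholder API)
        cache.append(f)
        x = x - f / num_steps                   # placeholder Euler-style update
    return x
```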
AAAI 2026 (Submitted), Duke University, August, 2025
An end-to-end rank-aware streaming inference framework for SVD-compressed large language models. FlashSVD fuses low-rank projection kernels into attention and feed-forward pipelines, avoiding full-size activation buffers and reducing peak activation memory by up to 70.2% without accuracy loss.
Tags: Large Language Model Model Compression Memory Efficiency
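As a rough illustration of the low-rank idea (not FlashSVD's fused kernels, whose savings come from fusing these projections into the attention and feed-forward pipelines), the sketch below applies a factored weight W ≈ U V in row chunks so that neither the dense weight nor an oversized intermediate buffer is materialized; the shapes, rank, and chunk size are arbitrary.

```python
import torch

def lowrank_linear_streamed(x, U, V, chunk=1024):
    """Apply a linear layer stored in factored form W ~ U @ V
    (U: [d_in, r], V: [r, d_out]) without materializing the dense
    d_in x d_out weight.  Streaming over row chunks keeps the low-rank
    intermediate buffer at chunk x r instead of n x r."""
    outs = []
    for start in range(0, x.shape[0], chunk):
        xb = x[start : start + chunk]   # [chunk, d_in]
        zb = xb @ U                     # [chunk, r]  -- small low-rank activation
        outs.append(zb @ V)             # [chunk, d_out]
    return torch.cat(outs, dim=0)

if __name__ == "__main__":
    n, d, r = 8192, 4096, 64
    x = torch.randn(n, d)
    U = torch.randn(d, r) / d ** 0.5    # SVD-style factors (illustrative)
    V = torch.randn(r, d) / r ** 0.5
    print(lowrank_linear_streamed(x, U, V).shape)   # torch.Size([8192, 4096])
```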
NeurIPS 2025 (Submitted), Duke University, May, 2025
A theoretically grounded optimization method that enhances classical coordinate descent by unrolling updates across blocks and applying Taylor-based curvature correction. ECCD achieves up to 13× speedup and maintains sub-10⁻⁵ relative error across logistic and Poisson GLMs, outperforming glmnet, biglasso, and ncvreg on high-dimensional benchmarks.
Tags: Generalized Linear Model Block Coordinate Descent Stable Acceleration
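For intuition, the following sketch shows plain cyclic coordinate descent for L1-penalized logistic regression built on a second-order Taylor (IRLS-style) approximation of the loss, in the spirit of glmnet; it omits ECCD's block unrolling and curvature correction and is only a toy reference point.

```python
import numpy as np

def soft_threshold(u, lam):
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def logistic_cd_lasso(X, y, lam, n_passes=50):
    """Cyclic coordinate descent for L1-penalized logistic regression.

    Each outer pass forms a second-order Taylor (quadratic) approximation of
    the logistic loss at the current iterate -- IRLS weights w and a working
    response z -- then cycles through coordinates with soft-threshold updates.
    Single-coordinate toy only; not the blocked/corrected ECCD algorithm."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_passes):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        w = np.maximum(p * (1.0 - p), 1e-5)   # curvature (Hessian diagonal surrogate)
        z = eta + (y - p) / w                 # working response
        r = z - X @ beta                      # working residual
        for j in range(d):
            r += X[:, j] * beta[j]            # drop coordinate j's contribution
            num = np.sum(w * X[:, j] * r)
            den = np.sum(w * X[:, j] ** 2)
            beta[j] = soft_threshold(num, n * lam) / den
            r -= X[:, j] * beta[j]            # add it back with the new value
    return beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    beta_true = np.zeros(50)
    beta_true[:5] = 1.0
    y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)
    print(np.round(logistic_cd_lasso(X, y, lam=0.02)[:8], 3))
```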
ICML 2025, Duke University, January, 2025
A training-free diffusion acceleration framework that jointly exploits step-wise and token-wise sparsity via a unified stability criterion. SADA achieves ≥ 1.8× speedup while maintaining LPIPS ≤ 0.10 and FID ≤ 4.5, significantly outperforming prior methods on SD-2, SDXL, Flux, and ControlNet.
Tags: Generative Model Numerical Method Stable Acceleration
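A toy illustration of stability-gated reuse (the criteria and thresholds below are hypothetical, not SADA's unified criterion): a step is skipped, or a subset of token features reused, only when consecutive denoiser outputs have stopped changing appreciably.

```python
import torch

def stable_to_skip(f_prev, f_curr, tau=0.05):
    """Step-wise check: if consecutive denoiser outputs barely change,
    the next evaluation is considered safe to approximate from cache."""
    rel = (f_curr - f_prev).norm() / (f_prev.norm() + 1e-12)
    return bool(rel < tau)

def reusable_token_mask(f_prev, f_curr, tau=0.05):
    """Token-wise variant: per-token relative change selects which token
    features can be reused on the next step.  f_* have shape [tokens, dim]."""
    rel = (f_curr - f_prev).norm(dim=-1) / (f_prev.norm(dim=-1) + 1e-12)
    return rel < tau   # boolean mask over token positions
```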
Bachelor's Thesis, University of Science and Technology of China, Mathematics Department, May, 2024
In this bachelor’s thesis, I introduce the concept of the trimmed mean for partially observed functional data, prove the strong consistency of the estimator, and present results from simulation experiments.
Tags: Functional Data Trimmed Mean Data Depth Strong Consistency R Language
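A toy sketch of the estimator's flavor (in Python rather than the R code used in the thesis, and with a crude outlyingness score standing in for a proper functional depth): rank the partially observed curves, trim the most outlying fraction, and average the rest pointwise over their observed domains.

```python
import numpy as np

def trimmed_mean_pofd(curves, alpha=0.2):
    """Toy trimmed mean for partially observed functional data.

    `curves` is an (n_curves, n_grid) array with NaN where a curve is not
    observed.  Curves are ranked by mean absolute deviation from the
    pointwise median over their observed domain (a stand-in for a real
    data depth), the most outlying `alpha` fraction is trimmed, and the
    remaining curves are averaged pointwise over whoever is observed."""
    curves = np.asarray(curves, dtype=float)
    pointwise_median = np.nanmedian(curves, axis=0)
    outlyingness = np.nanmean(np.abs(curves - pointwise_median), axis=1)
    keep = outlyingness <= np.quantile(outlyingness, 1.0 - alpha)
    return np.nanmean(curves[keep], axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0, 1, 100)
    curves = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((30, 100))
    curves[0] += 3.0                        # one outlying curve
    curves[rng.random((30, 100)) < 0.3] = np.nan   # 30% of points unobserved
    print(trimmed_mean_pofd(curves, alpha=0.2)[:5])
```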
Research Experience
Since Summer 2025
I work at the Interpretable Machine Learning Lab, Duke University, under the supervision of Prof. Cynthia Rudin. My work focuses on developing scalable algorithms for \( \epsilon \)-optimal sparse regression trees with continuous features, leveraging fast rank-one Cholesky updates for efficient and numerically stable split evaluation, and investigating the theoretical guarantees and computational trade-offs of sparse lookahead strategies. This project is mentored by two Duke PhD students, Hayden McTavish and Varun Babbar.
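For context, the classic rank-one Cholesky update that this kind of split evaluation builds on can be sketched in a few lines (a textbook routine, not the lab's implementation): updating the factor of a Gram matrix after adding one observation costs O(d²) rather than the O(d³) needed to refactorize from scratch.

```python
import numpy as np

def chol_update(L, x):
    """Rank-one update of a lower-triangular Cholesky factor.

    Given L with A = L @ L.T, return the factor of A + x x^T in O(d^2),
    which is what makes re-evaluating a least-squares split cheap after
    moving a single observation across the split."""
    L = L.copy()
    x = x.astype(float).copy()
    d = len(x)
    for k in range(d):
        r = np.hypot(L[k, k], x[k])
        c, s = r / L[k, k], x[k] / L[k, k]
        L[k, k] = r
        if k + 1 < d:
            L[k + 1:, k] = (L[k + 1:, k] + s * x[k + 1:]) / c
            x[k + 1:] = c * x[k + 1:] - s * L[k + 1:, k]
    return L

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 5))
    A = X.T @ X + np.eye(5)          # SPD Gram matrix (ridge-regularized)
    L = np.linalg.cholesky(A)
    v = rng.standard_normal(5)       # new observation's feature vector
    L_new = chol_update(L, v)
    assert np.allclose(L_new @ L_new.T, A + np.outer(v, v))
```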
I also work with Prof. Anru Zhang on the first rigorous in-context learning theory for time series forecasting. We proved that Transformers are fundamentally suboptimal compared with classical linear predictors, established a strictly positive excess-risk gap, quantified its \( \mathcal{O}(1/n) \) decay, and demonstrated exponential error compounding under Chain-of-Thought rollout. This collaboration has led to a co-first-authored paper. The project is mentored by Duke PhD student Yufa Zhou and is also under the guidance of Prof. Surbhi Goel at the University of Pennsylvania.
Since 2025
I collaborate with the Duke Center for Computational Evolutionary Intelligence (CEI), working with Prof. Yiran Chen. In this collaboration, I developed ZEUS, a controllable training-free diffusion acceleration framework achieving \(2\times\!-\!4\times\) speedups while maintaining perceptual fidelity (paper under review at ICLR 2026), and co-first-authored SADA (ICML 2025), a stability-guided training-free adaptive diffusion accelerator. These projects are collaborations with undergraduate student Justin Jiang, MS student Zishan Shao, and PhD student Hancheng Ye.
I also collaborate remotely with Prof. Aditya Devarakonda at Wake Forest University on optimization for elastic-net GLMs. We proposed and developed Enhanced Cyclic Coordinate Descent (ECCD), a block coordinate descent algorithm with second-order Taylor correction that delivers \(2\!-\!4\times\) speedups over the widely used glmnet package while preserving accuracy and convergence guarantees. This project has resulted in a co-first-authored paper currently under review at NeurIPS 2025.
Acknowledgments
It has only been one year since I began learning machine learning and deep learning. Without the support and guidance of outstanding professors, mentors, and collaborators, none of the above research projects would have been possible. I am deeply grateful for their invaluable advice, encouragement, and inspiration throughout my research journey.