Research

Why Do Transformers Fail to Forecast Time Series In-Context?

Conference 2025, Duke University, August, 2025

We provide the first rigorous theoretical analysis of why Transformers underperform on time series forecasting in-context. Under AR(p) data, we prove that linear self-attention cannot beat classical linear predictors in expected MSE, show asymptotic recovery of the optimal linear predictor as context grows, and demonstrate exponential collapse under Chain-of-Thought inference.
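
For context, a minimal sketch of the standard AR(p) setting (notation is mine, not necessarily the paper's):

\[
x_t \;=\; \sum_{i=1}^{p} a_i\, x_{t-i} + \varepsilon_t, \qquad \varepsilon_t \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2),
\]

so the best one-step predictor given the last \( p \) values is the linear map \( \hat{x}_{t+1} = \sum_{i=1}^{p} a_i\, x_{t+1-i} \), whose expected MSE \( \sigma^2 \) is the benchmark against which in-context predictors are measured.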

Tags: Time Series Forecasting, In-Context Learning, Transformers, Theory

ZEUS: Zero-shot Efficient Unified Sparsity for Generative Models

ICLR 2026 (Submitted), Duke University, August, 2025

A training-free acceleration method for generative models, combining a bounded second-order finite-difference extrapolator with a parity-aware reuse schedule. ZEUS achieves near-linear speedups up to 3.64× while maintaining high fidelity, working seamlessly across modalities, models, and schedulers.
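
To illustrate the flavor of the extrapolation step (a hypothetical sketch under my own naming, not the released ZEUS code): given three cached model outputs from previous denoising steps, a quadratic finite-difference extrapolation with a bounded curvature term can stand in for a full network evaluation on reused steps.

```python
import numpy as np

def extrapolate_output(f_prev3, f_prev2, f_prev1, bound=1.0):
    """Hypothetical sketch: predict the next model output from three cached
    outputs via second-order (Newton forward) finite-difference extrapolation,
    with the curvature term clipped so the correction stays bounded."""
    d1 = f_prev1 - f_prev2                 # first difference
    d2 = d1 - (f_prev2 - f_prev3)          # second difference (curvature)
    d2 = np.clip(d2, -bound, bound)        # bound the second-order correction
    return f_prev1 + d1 + d2               # quadratic extrapolation of the next step
```

A separate schedule (parity-aware in ZEUS) then decides which steps are evaluated exactly and which reuse the extrapolated output.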

Tags: Generative Model, Diffusion Model, Acceleration

FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models

AAAI 2026 (Submitted), Duke University, August, 2025

An end-to-end rank-aware streaming inference framework for SVD-compressed large language models. FlashSVD fuses low-rank projection kernels into attention and feed-forward pipelines, avoiding full-size activation buffers and reducing peak activation memory by up to 70.2% without accuracy loss.
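
As a toy illustration of the memory argument (simplified shapes and names of my own choosing, not the FlashSVD kernels): with a weight stored as a low-rank factorization \( W \approx U V \), the projection can be streamed over row tiles so that only a small slice of the rank-\( r \) intermediate is ever live.

```python
import numpy as np

def streamed_lowrank_matmul(X, U, V, tile=256):
    """Simplified sketch: compute X @ (U @ V) over row tiles of X so that only
    a (tile, r) slice of the low-rank intermediate is materialized at a time,
    rather than a full (n, r) buffer."""
    n = X.shape[0]
    out = np.empty((n, V.shape[1]), dtype=X.dtype)
    for start in range(0, n, tile):
        stop = min(start + tile, n)
        Z = X[start:stop] @ U        # (tile, r) intermediate for this tile only
        out[start:stop] = Z @ V      # project back up to the output width
    return out
```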

Tags: Large Language Model, Model Compression, Memory Efficiency

Enhanced Cyclic Coordinate Descent Methods for Elastic Net Penalized Linear Models

NeurIPS 2025 (Submitted), Duke University, May, 2025

A theoretically grounded optimization method that enhances classical coordinate descent by unrolling updates across blocks and applying Taylor-based curvature correction. ECCD achieves up to 13× speedup and maintains sub-10⁻⁵ relative error across logistic and Poisson GLMs, outperforming glmnet, biglasso, and ncvreg on high-dimensional benchmarks.
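
For reference, the classical update that ECCD builds on (a textbook cyclic coordinate descent for elastic-net least squares; ECCD's block unrolling and Taylor-based curvature correction are not reproduced here):

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the elastic-net coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def elastic_net_cd(X, y, lam, alpha, n_iter=100):
    """Textbook cyclic coordinate descent for
    (1/2n)||y - Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||_2^2);
    ECCD's block unrolling and curvature correction are omitted."""
    n, p = X.shape
    b = np.zeros(p)
    r = y - X @ b                          # residual, maintained incrementally
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) * x_j^T x_j for each column
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]            # remove coordinate j from the fit
            rho = X[:, j] @ r / n          # partial residual correlation
            b[j] = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            r -= X[:, j] * b[j]            # restore the residual with the new b[j]
    return b
```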

Tags: Generalized Linear Model, Block Coordinate Descent, Stable Accelerating

SADA: Stability-guided Adaptive Diffusion Acceleration

ICML 2025, Duke University, January, 2025

A training-free diffusion acceleration framework that jointly exploits step-wise and token-wise sparsity via a unified stability criterion. SADA achieves ≥ 1.8× speedup while maintaining LPIPS ≤ 0.10 and FID ≤ 4.5, significantly outperforming prior methods on SD-2, SDXL, Flux, and ControlNet.
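
As a caricature of a stability-guided skipping rule (hypothetical names and threshold, far simpler than SADA's actual criterion): a step can be cheapened or reused when consecutive model outputs have stopped changing appreciably.

```python
import numpy as np

def is_stable(f_prev, f_curr, tol=0.05):
    """Hypothetical sketch: treat the sampling trajectory as 'stable' when the
    relative change between consecutive model outputs falls below tol, which
    would license skipping or sparsifying the next evaluation."""
    change = np.linalg.norm(f_curr - f_prev) / (np.linalg.norm(f_prev) + 1e-8)
    return change < tol
```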

Tags: Generative Model, Numerical Method, Stable Accelerating

TMfPOFD: Trimmed Mean for Partially Observed Functional Data

Bachelor's Thesis, University of Science and Technology of China, Mathematics Department, May, 2024

In this bachelor’s thesis, I introduce the concept of the trimmed mean for partially observed functional data, prove strong consistency of the estimator, and present results from simulation experiments.
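
A rough numerical sketch of the estimator's flavor (written in Python for consistency with the snippets above, although the thesis work was done in R; the depth-based trimming is replaced here by a crude outlyingness score and is purely illustrative):

```python
import numpy as np

def trimmed_mean_pofd(curves, trim=0.2):
    """Illustrative sketch, not the thesis estimator: `curves` is an (n, T)
    array with NaN marking unobserved points. Curves are ranked by a crude
    outlyingness score, the most outlying fraction is trimmed, and the mean
    is taken pointwise over the remaining observed values."""
    pointwise_median = np.nanmedian(curves, axis=0)
    score = np.nanmean(np.abs(curves - pointwise_median), axis=1)  # per-curve outlyingness
    keep = score <= np.nanquantile(score, 1.0 - trim)              # drop the most outlying curves
    return np.nanmean(curves[keep], axis=0)                        # pointwise mean on observed points
```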

Tags: Functional Data, Trimmed Mean, Data Depth, Strong Consistency, R Language

Research Experience

Since Summer 2025

I work at the Interpretable Machine Learning Lab, Duke University, under the supervision of Prof. Cynthia Rudin. My work focuses on developing scalable algorithms for \( \epsilon \)-optimal sparse regression trees with continuous features, leveraging fast rank-one Cholesky updates for efficient and numerically stable split evaluation, and investigating the theoretical guarantees and computational trade-offs of sparse lookahead strategies. This project is mentored by two Duke PhD students, Hayden McTavish and Varun Babbar.
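
To give a sense of the numerical primitive involved (a standard rank-one Cholesky update shown generically; how it is wired into the split-evaluation code is not shown here):

```python
import numpy as np

def cholesky_rank1_update(L, x):
    """Standard rank-one update: given lower-triangular L with A = L @ L.T,
    return the Cholesky factor of A + x @ x.T in O(p^2) operations instead of
    refactoring from scratch in O(p^3)."""
    L = L.copy()
    x = x.astype(float).copy()
    p = x.size
    for k in range(p):
        r = np.hypot(L[k, k], x[k])
        c, s = r / L[k, k], x[k] / L[k, k]
        L[k, k] = r
        if k + 1 < p:
            L[k + 1:, k] = (L[k + 1:, k] + s * x[k + 1:]) / c
            x[k + 1:] = c * x[k + 1:] - s * L[k + 1:, k]
    return L
```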

I also work with Prof. Anru Zhang on the first rigorous in-context learning theory for time series forecasting. We proved that Transformers are fundamentally suboptimal compared with classical linear predictors, established a strictly positive excess-risk gap, quantified its \( \mathcal{O}(1/n) \) decay, and demonstrated exponential error compounding under Chain-of-Thought rollout. This collaboration has led to a co-first-authored paper. The project is mentored by Duke PhD student Yufa Zhou and is also under the guidance of Prof. Surbhi Goel at the University of Pennsylvania.
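
In notation of my own (not necessarily the paper's), the quantity in question is the excess risk of a predictor \( \hat{f} \) over the best linear predictor \( f^{\star} \),

\[
\mathcal{E}_n(\hat{f}) \;=\; \mathbb{E}\big[(x_{t+1} - \hat{f}(x_{t-n+1:t}))^2\big] \;-\; \mathbb{E}\big[(x_{t+1} - f^{\star}(x_{t-n+1:t}))^2\big],
\]

which the results above show is strictly positive for linear self-attention at any finite context length \( n \) and decays at rate \( \mathcal{O}(1/n) \).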

Since 2025

I collaborate with the Duke Center for Computational Evolutionary Intelligence (CEI), working with Prof. Yiran Chen. In this collaboration, I developed ZEUS, a controllable training-free diffusion acceleration framework achieving \(2\times\!-\!4\times\) speedups while maintaining perceptual fidelity (paper under review at ICLR 2026), and co-first-authored SADA (ICML 2025), a stability-guided training-free adaptive diffusion accelerator. These projects are collaborations with undergraduate student Justin Jiang, MS student Zishan Shao, and PhD student Hancheng Ye.

I also collaborate remotely with Prof. Aditya Devarakonda at Wake Forest University on optimization for elastic-net GLMs. We proposed and developed Enhanced Cyclic Coordinate Descent (ECCD), a block coordinate descent algorithm with second-order Taylor correction that delivers \(2\!-\!4\times\) speedups over the widely used glmnet package while preserving accuracy and convergence guarantees. This project has resulted in a co-first-authored paper currently under review at NeurIPS 2025.

Acknowledgments

It has only been one year since I began learning machine learning and deep learning. Without the support and guidance of outstanding professors, mentors, and collaborators, none of the above research projects would have been possible. I am deeply grateful for their invaluable advice, encouragement, and inspiration throughout my research journey.