Research

Why Do Transformers Fail to Forecast Time Series In-Context?

Conference 2025, Duke University, August, 2025

We provide the first rigorous theoretical analysis of why Transformers underperform at in-context time series forecasting. Under AR(p) data, we prove that linear self-attention cannot beat classical linear predictors in expected MSE, show that it asymptotically recovers the optimal linear predictor as the context grows, and demonstrate exponential collapse under Chain-of-Thought inference.

Tags: Time Series Forecasting, In-Context Learning, Transformers, Theory
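
For readers unfamiliar with the classical baseline mentioned above, here is a minimal sketch of a least-squares linear predictor on AR(p) data. It is not code from the paper: the function names (`simulate_ar`, `fit_ar`, `predict_next`), the AR(2) coefficients, and the sample size are all illustrative assumptions, and only the Python standard library is used.

```python
import random


def simulate_ar(coeffs, n, noise_std=1.0, seed=0):
    """Simulate x_t = sum_i coeffs[i] * x_{t-1-i} + Gaussian noise (hypothetical helper)."""
    rng = random.Random(seed)
    p = len(coeffs)
    x = [0.0] * p  # zero initial conditions
    for _ in range(n):
        past = x[-1:-p - 1:-1]  # the p most recent values, newest first
        x.append(sum(c * v for c, v in zip(coeffs, past)) + rng.gauss(0.0, noise_std))
    return x[p:]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z


def fit_ar(series, p):
    """Ordinary least squares estimate of AR(p) coefficients via the normal equations."""
    rows = [[series[t - 1 - i] for i in range(p)] for t in range(p, len(series))]
    targets = [series[t] for t in range(p, len(series))]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve(A, b)


def predict_next(series, coeffs):
    """One-step-ahead linear forecast from the most recent len(coeffs) values."""
    return sum(c * series[-1 - i] for i, c in enumerate(coeffs))


true_coeffs = [0.5, -0.3]  # an assumed stationary AR(2) process
series = simulate_ar(true_coeffs, n=5000)
estimated = fit_ar(series, p=2)
forecast = predict_next(series, estimated)
```

With a long enough context, `estimated` should land close to `true_coeffs`, which is the kind of linear predictor the abstract uses as its reference point.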

ZEUS: Zero-shot Efficient Unified Sparsity for Generative Models

Conference 2025, Duke University, August, 2025

A training-free acceleration method for generative models, combining a bounded second-order finite-difference extrapolator with a parity-aware reuse schedule. ZEUS achieves near-linear speedups up to 3.64× while maintaining high fidelity, working seamlessly across modalities, models, and schedulers.

Tags: Generative Model, Diffusion Model, Acceleration
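
As a rough illustration of what a second-order finite-difference extrapolation can look like, the sketch below predicts the next value of a sequence from its last three values. This is one common reading of the term and is not taken from ZEUS: the function name `extrapolate_second_order` is hypothetical, and neither the bounding nor the parity-aware reuse schedule described above is attempted here.

```python
def extrapolate_second_order(x_t, x_prev, x_prev2):
    """Predict the next value from the three most recent ones (hypothetical helper).

    Uses the first difference d1 = x_t - x_prev and the second difference
    d2 = x_t - 2 * x_prev + x_prev2, and returns x_t + d1 + d2,
    which equals 3 * x_t - 3 * x_prev + x_prev2.
    The prediction is exact when the sequence is quadratic in the step index.
    """
    d1 = x_t - x_prev
    d2 = x_t - 2.0 * x_prev + x_prev2
    return x_t + d1 + d2


# Values of f(t) = t^2 at t = 0, 1, 2; the extrapolated value is f(3).
predicted = extrapolate_second_order(4.0, 1.0, 0.0)
```

In an acceleration setting, a prediction of this kind would stand in for a skipped model evaluation; how ZEUS decides when to skip is not reproduced here.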

FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models

AAAI 2026 (Submitted), Duke University, August, 2025

An end-to-end rank-aware streaming inference framework for SVD-compressed large language models. FlashSVD fuses low-rank projection kernels into attention and feed-forward pipelines, avoiding full-size activation buffers and reducing peak activation memory by up to 70.2% without accuracy loss.

Tags: Large Language Model, Model Compression, Memory Efficiency

Enhanced Cyclic Coordinate Descent Methods for Elastic Net Penalized Linear Models

NeurIPS 2025 (Submitted), Duke University, May, 2025

A theoretically grounded optimization method that enhances classical coordinate descent by unrolling updates across blocks and applying Taylor-based curvature correction. ECCD achieves up to 13× speedup and maintains sub-10⁻⁵ relative error across logistic and Poisson GLMs, outperforming glmnet, biglasso, and ncvreg on high-dimensional benchmarks.

Tags: Generalized Linear Model, Block Coordinate Descent, Stable Accelerating
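
For context, the classical cyclic coordinate descent that ECCD builds on can be sketched as follows for the Gaussian (least-squares) elastic net. This is the textbook baseline, not ECCD itself: there is no block unrolling or Taylor-based correction, and the names `soft_threshold` and `elastic_net_cd` as well as the toy data are assumptions made for illustration. The objective assumed is (1/2n)·||y − Xβ||² + λ·(α·||β||₁ + (1 − α)/2·||β||²).

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_sweeps=100):
    """Classical cyclic coordinate descent for the least-squares elastic net.

    X is a list of rows, y a list of responses. No intercept and no
    standardization are applied in this sketch.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_bj
    return beta


# Toy data generated exactly from beta = [2, -1].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, -1.0, 1.0]
beta_unpenalized = elastic_net_cd(X, y, lam=0.0, alpha=1.0)
beta_heavy_lasso = elastic_net_cd(X, y, lam=2.0, alpha=1.0)
```

With no penalty the iterates converge to the least-squares solution, and with a large enough lasso penalty every coefficient is thresholded to zero. Each coordinate update depends on the one before it, which is the sequential bottleneck that block-wise methods aim to relax.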

SADA: Stability-guided Adaptive Diffusion Acceleration

ICML 2025, Duke University, January, 2025

A training-free diffusion acceleration framework that jointly exploits step-wise and token-wise sparsity via a unified stability criterion. SADA achieves ≥ 1.8× speedup while maintaining LPIPS ≤ 0.10 and FID ≤ 4.5, significantly outperforming prior methods on SD-2, SDXL, Flux, and ControlNet.

Tags: Generative Model, Numerical Method, Stable Accelerating

TMfPOFD: Trimmed Mean for Partially Observed Functional Data

Bachelor's Thesis, University of Science and Technology of China, Mathematics Department, May, 2024

In my bachelor’s thesis, I introduce the trimmed mean for partially observed functional data, prove the strong consistency of the estimator, and present simulation results.

Tags: Functional Data, Trimmed Mean, Data Depth, Strong Consistency, R Language

Yixiao Wang