Stability-guided Adaptive Diffusion Acceleration

¹ECE · ²Statistical Science · Duke University, Durham, U.S.A.
{justin.jiang, yixiao.wang, hancheng.ye, zishan.shao, jingwei.sun, jingyang.zhang, zekai.chen, jianyi.zhang, yiran.chen, hai.li}@duke.edu
(* indicates equal contribution)
🏆 Appearing at ICML 2025   |   🚀 SADA plugs straight into any project built on HuggingFace Diffusers 🤗

We introduce Stability-guided Adaptive Diffusion Acceleration (SADA), a training-free paradigm that accelerates sampling in diffusion and flow models by dynamically exploiting step-wise and token-wise sparsity. Our method achieves consistent ≥ 1.8× speedups on SD-2, SDXL, and Flux across EDM and DPM++ solvers, all with LPIPS ≤ 0.10 and FID ≤ 4.5. Moreover, SADA generalizes across modalities—achieving ∼ 1.81× acceleration on MusicLDM and ∼ 1.41× on ControlNet—without any fine-tuning.
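Because SADA is training-free and solver-compatible, it can in principle be attached to a standard Diffusers pipeline. The sketch below shows the surrounding boilerplate with a real SD-2 + DPM-Solver++ setup; the `apply_sada(...)` hook is a hypothetical placeholder for illustration, not the released API.

```python
# Sketch: where a stability-guided accelerator would hook into a Diffusers pipeline.
# The pipeline/scheduler calls below are standard Diffusers API; `apply_sada` is a
# hypothetical placeholder for the accelerator hook, shown commented out.
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)  # DPM-Solver++

# apply_sada(pipe)  # hypothetical: wrap the denoiser/solver with the stability-guided cache

image = pipe("a photo of an astronaut riding a horse", num_inference_steps=50).images[0]
image.save("astronaut.png")
```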

Demo overview
Figure 1. With only 50 inference steps, SADA accelerates Flux, SDXL, and SD-2 by up to 2.0× while maintaining image quality.

Motivation

Diffusion models are powerful but notoriously slow: sampling requires hundreds of iterative denoising steps, and attention cost scales quadratically with sequence length. Existing training-free accelerators often apply fixed sparsity patterns that cannot adapt to prompt-specific denoising dynamics, which degrades faithfulness. SADA addresses this with a unified stability-guided criterion that dynamically adjusts step-wise and token-wise sparsity based on the actual sampling trajectory.

SADA Motivation Diagram

Figure 2. SADA adapts sparsity to the sampling ODE trajectory to preserve fidelity.

Implementation

At each step, we check stability conditions derived from the ODE solver to decide whether to reuse cached outputs or perform a fresh model evaluation. This enables adaptive acceleration with minimal overhead and strong compatibility with existing solvers such as DPM++ and EDM.
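To make the control flow concrete, here is a minimal, self-contained sketch of the step-wise caching decision. The stability test (relative change between consecutive model outputs against a threshold) and the skip cap are simplified stand-ins for the paper's solver-derived criterion; all names and values are illustrative.

```python
import torch

def stable(prev_eps: torch.Tensor, curr_eps: torch.Tensor, tau: float = 0.05) -> bool:
    """Simplified stability test: relative change between the last two model
    outputs. The paper's actual criterion is derived from the ODE solver."""
    rel = (curr_eps - prev_eps).norm() / (prev_eps.norm() + 1e-8)
    return rel.item() < tau

def sample(model, x, timesteps, step_fn, max_consecutive_skips: int = 2):
    """Step-wise sparsity sketch: reuse the cached output when the trajectory
    is deemed stable, otherwise run a fresh model evaluation."""
    cached_eps, prev_eps, skips = None, None, 0
    for t in timesteps:
        reuse = (
            cached_eps is not None
            and prev_eps is not None
            and skips < max_consecutive_skips
            and stable(prev_eps, cached_eps)
        )
        if reuse:
            eps, skips = cached_eps, skips + 1    # skip the network call, reuse cache
        else:
            eps, skips = model(x, t), 0           # fresh model evaluation
        x = step_fn(x, eps, t)                    # one solver update (e.g. DPM++ / EDM)
        prev_eps, cached_eps = cached_eps, eps
    return x
```

In the token-wise mode (Figure 3, bottom), the same kind of criterion would gate computation at token granularity, selecting which tokens are recomputed rather than skipping the whole step.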

SADA Main Implementation Diagram

Figure 3. Overview of the SADA paradigm. The sparsity mode (middle: step-wise; bottom: token-wise) at timestep t − 1 is adaptively identified by the stability criterion after a fresh computation at timestep t. Note that “DP” in the pipeline stands for “Data Prediction”. Right: faithfulness and efficiency of SADA versus baseline methods. Our method significantly outperforms the existing baselines DeepCache and AdaptiveDiffusion on both metrics, shown for SD-2 (top) and SDXL (bottom) with DPM-Solver++ at 50 steps.

Quantitative Results

Table 1. Quantitative results on MS‑COCO 2017.

Quantitative Results Table

Table 2. Ablation study of SADA with fewer sampling steps, using SD-2 and DPM++ on MS‑COCO 2017.

Ablation Study Table

Downstream Tasks and Data Modalities

MusicLDM Spectrograms

Figure 4. SADA applied to MusicLDM with different text prompts. SADA accelerates MusicLDM by ∼ 1.81× while keeping the spectrogram LPIPS below 0.020.

ControlNet Results

Figure 5. SADA applied to ControlNet. We use the SD-1.5-based ControlNet pipeline trained with Canny edge maps as the conditioning input. SADA accelerates ControlNet by ∼ 1.41× while preserving fidelity.

Acknowledgment

This material is based upon work supported by the U.S. National Science Foundation under award No. 2112562. This work is also supported by ARO W911NF-23-2-0224 and NAIRR Pilot project NAIRR240270. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U.S. National Science Foundation, ARO, NAIRR, or their contractors. In addition, we thank the area chair and reviewers for their valuable comments.

BibTeX

@inproceedings{jiang2025sada,
  title     = {SADA: Stability-guided Adaptive Diffusion Acceleration},
  author    = {Ting Jiang and Yixiao Wang and Hancheng Ye and Zishan Shao and Jingwei Sun and Jingyang Zhang and Zekai Chen and Jianyi Zhang and Yiran Chen and Hai Li},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025}
}