Score Distillation Beyond Acceleration: Generative Modeling from Corrupted Data

  • Yasi Zhang,
  • Tianyu Chen,
  • Zhendong Wang,
  • Yingnian Wu,
  • Mingyuan Zhou,
  • Oscar Leong

ICLR 2026 | Publication

Learning generative models directly from corrupted observations is a long-standing challenge across natural and scientific domains. We introduce *Distillation from Corrupted Data (DCD)*, a unified framework for learning high-fidelity, one-step generative models using **only** degraded data of the form \(y = \mathcal{A}(x) + \sigma \varepsilon, \ x\sim p_X,\ \varepsilon\sim \mathcal{N}(0,I_m),\) where the mapping \(\mathcal{A}\) may be the identity or a non-invertible corruption operator (e.g., blur, masking, subsampling, Fourier acquisition). DCD first pretrains a *corruption-aware diffusion teacher* on the observed measurements, then *distills* it into an efficient one-step generator whose samples are statistically closer to the clean distribution \(p_X\) than the teacher's. The framework subsumes identity corruption (the denoising task) as a special case of our general formulation. Empirically, DCD consistently reduces Fréchet Inception Distance (FID) relative to corruption-aware diffusion teachers across noisy generation (*CIFAR-10*, *FFHQ*, *CelebA-HQ*, *AFHQ-v2*), image restoration (Gaussian deblurring, random inpainting, super-resolution, and mixtures with additive noise), and multi-coil MRI, all *without access to any clean images*. The distilled generator offers one-step sampling efficiency, yielding up to \(30\times\) speedups over multi-step diffusion while surpassing the teachers after substantially fewer training iterations. These results establish score distillation as a practical tool for generative modeling from corrupted data, *not merely for acceleration*. In the Appendix, we also provide theoretical support for the role of distillation in enhancing generation quality.
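
To make the setup concrete, below is a minimal sketch of the measurement model \(y = \mathcal{A}(x) + \sigma \varepsilon\) and the two-stage recipe described above (pretrain a corruption-aware teacher on measurements only, then distill it into a one-step generator). This is not the authors' released code: the operator, the tiny networks, and the schematic distillation loss (`corrupt`, `random_mask`, `TinyScoreNet`) are illustrative placeholders assumed for this sketch.

```python
# Minimal sketch of the DCD setting; all names below are hypothetical placeholders.
import torch
import torch.nn as nn

def corrupt(x, A, sigma):
    """Sample a measurement y = A(x) + sigma * eps with eps ~ N(0, I_m)."""
    Ax = A(x)
    return Ax + sigma * torch.randn_like(Ax)

def random_mask(x, keep_prob=0.7):
    """Example non-invertible operator A: random pixel masking (inpainting-style)."""
    mask = (torch.rand_like(x) < keep_prob).float()
    return mask * x

class TinyScoreNet(nn.Module):
    """Stand-in architecture for both the diffusion teacher and the one-step student."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 128), nn.SiLU(), nn.Linear(128, dim))
    def forward(self, z, t):
        return self.net(torch.cat([z, t], dim=-1))

dim, sigma = 32, 0.1
teacher = TinyScoreNet(dim)    # Stage 1: corruption-aware diffusion teacher
student = TinyScoreNet(dim)    # Stage 2: one-step generator distilled from the teacher
opt_t = torch.optim.Adam(teacher.parameters(), lr=1e-4)
opt_s = torch.optim.Adam(student.parameters(), lr=1e-4)

# Stage 1 (sketch): the teacher is trained on corrupted measurements y only.
for _ in range(10):
    x = torch.randn(64, dim)               # placeholder for clean data, never observed
    y = corrupt(x, random_mask, sigma)     # the only data the model ever sees
    t = torch.rand(64, 1)
    noise = torch.randn_like(y)
    y_t = y + t * noise                    # simplified forward diffusion of y
    loss_t = ((teacher(y_t, t) - noise) ** 2).mean()
    opt_t.zero_grad(); loss_t.backward(); opt_t.step()

# Stage 2 (sketch): distill the frozen teacher into a one-step generator.
# The actual DCD distillation objective is given in the paper; the loss here is
# only a schematic score-based surrogate to show the training structure.
for p in teacher.parameters():
    p.requires_grad_(False)
for _ in range(10):
    z = torch.randn(64, dim)
    one = torch.ones(64, 1)
    x_hat = student(z, one)                        # one network call = one sampling step
    loss_s = (teacher(x_hat, one) ** 2).mean()     # schematic surrogate, not the paper's loss
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```

Once distilled, sampling reduces to a single forward pass of `student`, which is where the reported one-step speedups over multi-step diffusion sampling come from.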