A diffusion model is trained to estimate the noise in noisy data \(\mathbf{x}_t\), conditioned on the time step \(t\). The noisy data \(\mathbf{x}_t\) is expected to have the form \[\mathbf{x}_t = w_t^{\text{signal}}\underbrace{\mathbf{x}_0}_{\text{signal}} + w_t^{\text{noise}}\underbrace{\epsilon\vphantom{\mathbf{x}_0}}_{\text{noise}}.\] That is, \(\mathbf{x}_t\) is a weighted average of pure signal \(\mathbf{x}_0\) and pure noise \(\epsilon\).
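
For concreteness, here is a minimal NumPy sketch of this forward-noising step. The specific weights \(w_t^{\text{signal}} = \sqrt{\bar\alpha_t}\) and \(w_t^{\text{noise}} = \sqrt{1-\bar\alpha_t}\) and the linear \(\beta\) schedule are the DDPM-style choice, used here purely as an illustrative assumption; any weighting scheme of the form above plugs in the same way.

```python
import numpy as np

def add_noise(x0, t, alpha_bar, rng):
    """Form x_t = w_t^signal * x0 + w_t^noise * eps.

    Assumes DDPM-style weights w_t^signal = sqrt(alpha_bar[t]) and
    w_t^noise = sqrt(1 - alpha_bar[t]); other schedules slot in here.
    Returns both x_t and eps, since the model is trained to predict eps.
    """
    eps = rng.standard_normal(x0.shape)       # pure Gaussian noise
    w_signal = np.sqrt(alpha_bar[t])          # weight on the clean signal
    w_noise = np.sqrt(1.0 - alpha_bar[t])     # weight on the noise
    return w_signal * x0 + w_noise * eps, eps

# Example usage with a linear beta schedule over T = 1000 steps
# (a common default, assumed here for illustration).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
x0 = rng.random((28, 28))                     # a stand-in "clean" sample
x_t, eps = add_noise(x0, t=500, alpha_bar=alpha_bar, rng=rng)
```

Note that the pair \((\mathbf{x}_t, \epsilon)\) returned above is exactly a training example: the model receives \(\mathbf{x}_t\) and \(t\) as input and is trained to recover \(\epsilon\).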