Capability
Latent Space Video Diffusion With Iterative Denoising
20 artifacts provide this capability.
Top Matches
via “latent-space diffusion with unet denoising backbone”
Text-to-image model. 866,496 downloads.
Unique: Combines a VAE encoder, which compresses 512×512 images to 64×64 latents (8× spatial downsampling per side), with a UNet denoiser trained to predict noise in latent space. Running the iterative denoising on the smaller latents keeps inference efficient, while the learned latent representation maintains image quality.
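The iterative denoising loop described above can be sketched in a few lines. This is a minimal illustration and not this model's actual sampler: it assumes a deterministic DDIM-style update, a made-up linear noise schedule (`alpha_bar`), a single-channel 64×64 latent stored as a flat list, and a hypothetical `oracle_noise_predictor` that stands in for the trained UNet by computing the exact noise from a known clean latent.

```python
import math
import random


def alpha_bar(t, num_steps):
    """Assumed linear schedule: 1.0 at t=0 (clean latent), 0.001 at t=num_steps."""
    return 1.0 - 0.999 * t / num_steps


def oracle_noise_predictor(latent, t, num_steps, clean_latent):
    """Hypothetical stand-in for the UNet: returns the exact noise in `latent`.

    A trained UNet would estimate this from (latent, t, text conditioning)
    without ever seeing `clean_latent`.
    """
    ab = alpha_bar(t, num_steps)
    return [(z - math.sqrt(ab) * z0) / math.sqrt(1.0 - ab)
            for z, z0 in zip(latent, clean_latent)]


def ddim_step(latent, eps_hat, t, num_steps):
    """One deterministic DDIM-style update from step t to step t-1."""
    ab_t = alpha_bar(t, num_steps)
    ab_prev = alpha_bar(t - 1, num_steps)
    # Estimate the clean latent implied by the predicted noise.
    z0_hat = [(z - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
              for z, e in zip(latent, eps_hat)]
    # Re-noise that estimate to the (lower) noise level of step t-1.
    return [math.sqrt(ab_prev) * z0 + math.sqrt(1.0 - ab_prev) * e
            for z0, e in zip(z0_hat, eps_hat)]


def sample(clean_latent, num_steps=20, seed=0):
    """Start from Gaussian noise and denoise iteratively in latent space."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in clean_latent]
    for t in range(num_steps, 0, -1):
        eps_hat = oracle_noise_predictor(latent, t, num_steps, clean_latent)
        latent = ddim_step(latent, eps_hat, t, num_steps)
    return latent


if __name__ == "__main__":
    target = [0.5] * (64 * 64)  # toy "clean" latent, 64×64 flattened
    result = sample(target, num_steps=20)
    print(len(result))
```

Because the oracle predicts the noise exactly, the loop recovers the target latent up to floating-point error. A real UNet only approximates the noise, which is why samplers take multiple steps, and the final latent would then be passed through the VAE decoder to produce an image.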
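vs others: At 512×512, the denoiser operates on 64×64 latents, which have 64× fewer spatial positions than the image, so latent-space diffusion needs substantially less memory and compute per step than pixel-space diffusion (e.g., LDM vs DDPM). It can also be distilled for few-step or single-step generation; distillation is possible in pixel space as well, but each step is cheaper in latent space.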
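The size comparison can be checked with simple arithmetic on the 512×512 image and 64×64 latent sizes quoted in this entry. The helper names below are illustrative only, and the sketch counts spatial positions, not bytes: actual memory savings also depend on channel counts and on the denoiser architecture, which this listing does not specify.

```python
def spatial_downsampling(image_side, latent_side):
    """Per-side downsampling factor between image and latent."""
    return image_side // latent_side


def position_ratio(image_side, latent_side):
    """How many times fewer spatial positions the latent has than the image."""
    return (image_side * image_side) // (latent_side * latent_side)


if __name__ == "__main__":
    print(spatial_downsampling(512, 64))  # 8
    print(position_ratio(512, 64))        # 64
```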