Research on Denoising Diffusion Probabilistic Models
DOI: https://doi.org/10.54097/sxd49274
Keywords: Machine Learning; generative models; stochastic differential equations
Abstract
Diffusion models represent the current state of the art in deep generative modeling, with remarkable performance across a broad spectrum of applications. Despite this widespread success, the original formulations of these models exhibit notable limitations. Taking DDPM as a representative example, this article thoroughly explores and derives the mathematical principles of the model from two different perspectives. It then examines the relationship between diffusion models and five other families of generative models: Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), autoregressive models, normalizing flows, and energy-based models. Concluding with open questions for future research, the paper offers insights into prospective algorithmic and application-oriented developments of diffusion models. Diffusion models have become a powerful framework that competes with GANs in most applications without resorting to adversarial training. For specific tasks, understanding why and when diffusion models are more effective than other networks, and how they differ from other generative models, helps clarify why diffusion models can produce high-quality samples with high likelihood.
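To make the DDPM setup concrete, the forward (noising) process the abstract alludes to admits a closed form: with a variance schedule β_t and ᾱ_t = ∏ₛ≤t (1 − βₛ), one can sample x_t ~ q(x_t | x_0) directly as x_t = √ᾱ_t·x_0 + √(1 − ᾱ_t)·ε with ε ~ N(0, I), as in Ho et al. (2020). The following is a minimal NumPy sketch, assuming the linear β schedule from that paper (1e-4 to 0.02 over T = 1000 steps); the function and variable names are illustrative, not from the article.

```python
import numpy as np

# Illustrative linear beta schedule (assumption: values from Ho et al., 2020).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # \bar{alpha}_t = prod_{s<=t} (1 - beta_s)

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps,  eps ~ N(0, I)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal(8)        # toy "data" vector
xT, _ = q_sample(x0, T - 1, rng)   # after T steps the signal is nearly gone
# alpha_bars[-1] is tiny (~4e-5), so x_T is close to pure Gaussian noise.
```

In training, a network ε_θ(x_t, t) is fit to predict the noise ε drawn here, which is what makes this closed-form sampler useful: any timestep can be reached in one step rather than by iterating the chain.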
References
Ho J, Jain A, Abbeel P. Denoising Diffusion Probabilistic Models. In Proceedings of the 34th International Conference on Neural Information Processing Systems, 2020.
Nichol A Q, Dhariwal P. Improved Denoising Diffusion Probabilistic Models. In Proceedings of the 38th International Conference on Machine Learning, 2021.
Yang L, Zhang Z, Song Y, et al. Diffusion Models: A Comprehensive Survey of Methods and Applications. Working paper, 2022.
Sohl-Dickstein J, Weiss E, Maheswaranathan N, Ganguli S. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. In Proceedings of the 32nd International Conference on Machine Learning, 2015.
Kingma D P, Welling M. Auto-Encoding Variational Bayes. In Proceedings of the 2nd International Conference on Learning Representations (ICLR), 2014.
Goodfellow I, et al. Generative Adversarial Nets. In Advances in Neural Information Processing Systems, 2014, 27: 2672-2680.
Vaswani A, et al. Attention Is All You Need. In Advances in Neural Information Processing Systems, 2017, 30: 5998-6008.
Nichol A Q, et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. Working paper, 2021.
Rombach R, Blattmann A, Lorenz D, Esser P, Ommer B. High-Resolution Image Synthesis with Latent Diffusion Models. In IEEE Conference on Computer Vision and Pattern Recognition, 2022, 10684-10695.
Khachatryan L, Movsisyan A, Tadevosyan V, Henschel R, Wang Z, Navasardyan S, Shi H. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators. Working paper, 2023.
License
Copyright (c) 2024 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.