Text-to-Classic: A Diffusion Method for Classical Art Generation Based on Text

Yi Li

doi:10.54097/fcis.v3i1.6030

Keywords:

Denoising Diffusion Probabilistic Model, Text-to-Image Generation, Art

Abstract

Text-to-image generation has recently become an active research topic, and diffusion models have achieved remarkable performance on this task. However, most previous research targets real-scene generation, and few studies focus on classical art paintings. In addition, diffusion models are commonly heavyweight, with a large number of parameters and a correspondingly high computational cost. In this paper, we address the subtask of classical art painting synthesis. We propose a lightweight diffusion model, Text-to-Classic (T2C), that synthesizes classical art paintings from text descriptions. Experimental results show that our method achieves good performance with fewer parameters.
