Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion

Flavio Schneider, Zhijing Jin, B. Schölkopf

2023 · DOI: 10.48550/arXiv.2301.11757
arXiv.org · Cited 71 times

TLDR

This work develops a cascading latent diffusion approach that generates multiple minutes of high-quality 48 kHz stereo music from textual descriptions, targeting real-time inference on a single consumer GPU.