High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models

Chunyu Qiang, Hao Li, Yixin Tian, Yi Zhao, Ying Zhang, Longbiao Wang, Jianwu Dang


Tianjin University, Tianjin, China
Kuaishou Technology Co., Ltd, Beijing, China



Architecture

Figure: Overall architecture


Speaker 1

Prompt | Synthesized Speech

Speaker 2

Prompt | Synthesized Speech

Speaker 3

Prompt | Synthesized Speech

Speaker 4

Prompt | Synthesized Speech

Speaker 5

Prompt | Synthesized Speech