Parler TTS
Parler-TTS
Parler-TTS is a lightweight text-to-speech (TTS) model that can generate high-quality, natural-sounding speech in the style of a given speaker (gender, pitch, speaking style, etc.). It is a reproduction of work from the paper "Natural language guidance of high-fidelity text-to-speech with synthetic annotations" by Dan Lyth and Simon King, of Stability AI and the University of Edinburgh, respectively.
Unlike other TTS models, Parler-TTS is a fully open-source release. All of the datasets, pre-processing, training code, and weights are released publicly under a permissive license, enabling the community to build on our work and develop their own powerful TTS models. The project consists of:
- The Parler-TTS library for using and training high-quality TTS models.
- The Data-Speech repository, for annotating speech characteristics in a large-scale setting.
- This organization, which hosts the released datasets and weights.
🚨 v0.1 model & demo out! Try it out here 🤗!
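For readers who prefer to run the model locally, the following is a minimal sketch of text-conditioned generation with the Parler-TTS library. It assumes the library is installed (for example with `pip install git+https://github.com/huggingface/parler-tts.git`) along with `torch`, `transformers`, and `soundfile`, and that the v0.1 checkpoint is published as `parler-tts/parler_tts_mini_v0.1`; that checkpoint name is not stated on this page, so treat it as an assumption. `build_description` and `synthesize` are hypothetical helper names used only for illustration.

```python
def build_description(gender: str, pitch: str, pace: str, quality: str) -> str:
    """Hypothetical helper: compose a plain-English description of the voice."""
    return (
        f"A {gender} speaker with a {pitch} voice speaks {pace}, "
        f"with {quality} audio quality."
    )


def synthesize(
    text: str,
    description: str,
    out_path: str = "parler_tts_out.wav",
    checkpoint: str = "parler-tts/parler_tts_mini_v0.1",  # assumed checkpoint name
) -> str:
    """Generate speech for `text` in the voice described by `description`."""
    # Heavy dependencies are imported lazily so the module can be imported
    # without them being installed.
    import soundfile as sf
    import torch
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    model = ParlerTTSForConditionalGeneration.from_pretrained(checkpoint).to(device)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    # The description conditions the speaker style; the prompt is the text to speak.
    input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
    prompt_input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)

    generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
    audio = generation.cpu().numpy().squeeze()

    # Write the waveform at the sampling rate declared in the model config.
    sf.write(out_path, audio, model.config.sampling_rate)
    return out_path
```

Calling `synthesize("Hey, how are you doing today?", build_description("female", "slightly low-pitched", "quite fast", "very clear"))` should download the weights on first use and write a WAV file to the given path. The exact wording of descriptions that works best likely depends on how the training annotations were phrased, so the template above is only a starting point.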
Collections
If you want to find out more about how these models were trained, or even fine-tune them yourself, check out the Parler-TTS repository on GitHub.
Open-source annotated speech datasets, ranging from 1,000 hours up to (soon) 50,000 hours.
Datasets
- parler-tts/mls-eng-10k-descriptions-v2-gemma
- parler-tts/images
- parler-tts/mls-eng-10k-tags_tagged_10k_generated
- parler-tts/libritts_r_tags_tagged_10k_generated
- parler-tts/mls_eng_10k
- parler-tts/mls_eng