awesome-generative-ai

Text-to-Speech (TTS) Models

A curated collection of high-quality open-source TTS models and toolkits for research, production, and multilingual synthesis.



High-Fidelity Models

Tortoise-TTS

VoiceCraft

VITS (Variational Inference TTS)

Coqui TTS

Bark

Maha TTS

MMS (Massively Multilingual Speech)

Vall-E X

StyleTTS2

SeamlessM4T


Fast and Efficient Models

Tacotron 2

FastSpeech 2

Glow-TTS

KittenTTS

Piper


Vocoders

Vocoders convert spectrograms (typically mel spectrograms predicted by an acoustic model) into audio waveforms.

HiFi-GAN

WaveGlow

MelGAN
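To make the spectrogram-to-waveform step concrete, here is a minimal sketch using Griffin-Lim, a classical non-neural phase-reconstruction method. It is a stand-in for illustration only and is not how HiFi-GAN, WaveGlow, or MelGAN work internally; those are learned models that usually take mel spectrograms. The sketch assumes a linear magnitude spectrogram, and the helper names (`stft`, `istft`, `griffin_lim`) and parameter values are hypothetical choices for this example.

```python
import numpy as np

def stft(signal, n_fft=256, hop=64):
    """Short-time Fourier transform with a Hamming window; returns (frames, bins)."""
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=256, hop=64):
    """Inverse STFT: windowed overlap-add divided by the summed squared window."""
    window = np.hamming(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        start = i * hop
        out[start : start + n_fft] += frames[i] * window
        norm[start : start + n_fft] += window ** 2
    # The Hamming window is never zero, so every sample has a positive norm.
    return out / norm

def griffin_lim(magnitude, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    # Start from random phase, since the spectrogram carries no phase information.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        audio = istft(magnitude * phase, n_fft, hop)
        rebuilt = stft(audio, n_fft, hop)
        # Keep the target magnitude, adopt the phase of the re-analyzed signal.
        phase = np.exp(1j * np.angle(rebuilt))
    return istft(magnitude * phase, n_fft, hop)

# Usage: recover a 440 Hz tone from its magnitude spectrogram alone.
sample_rate = 8000
t = np.arange(2048) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)
magnitude = np.abs(stft(tone))  # phase is discarded here
audio = griffin_lim(magnitude)
print(audio.shape)
```

Neural vocoders replace the iterative loop with a single forward pass through a trained network, which is generally why they are preferred for both quality and speed.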


Notable TTS Projects

Fish-Speech

Kokoro

Llasa-TTS

Spark-TTS

Qwen3-TTS

ComfyUI-Qwen-TTS

VITS2

Index-TTS

Chatterbox

ChatTTS

FireRedTTS2

Genie-TTS

Supertonic

Granite Speech Models

TADA-TTS


Additional TTS Models

Extension Models


New Additions (Curated)

E2-TTS (e2-tts-pytorch)

F5-TTS

Delayed Streams Modeling / Kyutai

Chatterbox Fine-Tuning Kit

Emotional VITS

Matcha-TTS


Selection Guide

| Use Case | Recommended Model | Why |
| --- | --- | --- |
| High-quality synthesis | Tortoise-TTS | Best audio quality |
| Production deployment | Coqui TTS | Modular and well-documented |
| Real-time applications | FastSpeech 2 | Fast inference |
| Research projects | VITS | End-to-end and efficient |
| Multilingual support | MMS, Vall-E X | Extensive language coverage |
| Streaming applications | Llasa-TTS | Ultra-fast, streaming |
| Lightweight deployment | VITS2 | Small footprint |
| Ultra-lightweight/Edge | KittenTTS | <25 MB, CPU-only |
| Local/Privacy-focused | Piper | Fast local synthesis |
| Voice cloning | OpenVoice, Bark | High-fidelity cloning |
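
If you want to query the guide programmatically (for example in a small CLI or notebook), it could be kept as a plain lookup structure. This is only a sketch: the `SELECTION_GUIDE` dictionary and the `recommend` helper are hypothetical names introduced here, and the entries simply mirror the rows of the guide above.

```python
# Rows copied from the selection guide: use case -> (models, reason).
SELECTION_GUIDE = {
    "High-quality synthesis": (["Tortoise-TTS"], "Best audio quality"),
    "Production deployment": (["Coqui TTS"], "Modular and well-documented"),
    "Real-time applications": (["FastSpeech 2"], "Fast inference"),
    "Research projects": (["VITS"], "End-to-end and efficient"),
    "Multilingual support": (["MMS", "Vall-E X"], "Extensive language coverage"),
    "Streaming applications": (["Llasa-TTS"], "Ultra-fast, streaming"),
    "Lightweight deployment": (["VITS2"], "Small footprint"),
    "Ultra-lightweight/Edge": (["KittenTTS"], "<25MB, CPU-only"),
    "Local/Privacy-focused": (["Piper"], "Fast local synthesis"),
    "Voice cloning": (["OpenVoice", "Bark"], "High-fidelity cloning"),
}

def recommend(query):
    """Return (use case, models, reason) rows whose use case contains the query.

    Matching is a case-insensitive substring test, so a short keyword
    such as "lightweight" can match more than one row.
    """
    needle = query.strip().lower()
    return [
        (use_case, models, reason)
        for use_case, (models, reason) in SELECTION_GUIDE.items()
        if needle in use_case.lower()
    ]

for use_case, models, reason in recommend("lightweight"):
    print(f"{use_case}: {', '.join(models)} ({reason})")
```

A substring match keeps the helper forgiving about exact wording, at the cost of occasionally returning several rows for one keyword.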

Voice Apps & Utilities

Voice


Additional Resources


Tip: Consider your use case (quality vs speed) and target platform when choosing a TTS model.