Kokoro TTS

A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

Features

Multiple language and voice support
Voice blending with customizable weights
EPUB, PDF and TXT file input support
Standard input (stdin) and | piping from other programs
Streaming audio playback
Split output into chapters
Adjustable speech speed
WAV and MP3 output formats
Chapter merging capability
Detailed debug output option
GPU Support

Demo

Kokoro TTS is an open-source CLI tool that delivers high-quality text-to-speech right from your terminal. Think of it as your personal voice studio, capable of transforming any text into natural-sounding speech with minimal effort.

demo.mp4

Demo Audio (MP3) | Demo Audio (WAV)

TODO

Add GPU support
Add PDF support
Add GUI

Prerequisites

Python 3.9-3.12 (Python 3.13+ is not currently supported)

Installation

Method 1: Install from PyPI (Recommended)

The easiest way to install Kokoro TTS is from PyPI:

# Using uv (recommended)
uv tool install kokoro-tts

# Using pip
pip install kokoro-tts

After installation, you can run:

kokoro-tts --help

Method 2: Install from Git

Install directly from the repository:

# Using uv (recommended)
uv tool install git+https://github.com/nazdridoy/kokoro-tts

# Using pip
pip install git+https://github.com/nazdridoy/kokoro-tts

Method 3: Clone and Install Locally

Clone the repository:

git clone https://github.com/nazdridoy/kokoro-tts.git
cd kokoro-tts

Install the package:

With uv (recommended):

uv venv
uv pip install -e .

With pip:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .

Run the tool:

# If using uv
uv run kokoro-tts --help

# If using pip with activated venv
kokoro-tts --help

Method 4: Run Without Installation

If you prefer to run without installing:

Clone the repository:

git clone https://github.com/nazdridoy/kokoro-tts.git
cd kokoro-tts

Install dependencies only:

With uv:

uv venv
uv sync

With pip:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

Run directly:

# With uv
uv run -m kokoro_tts --help

# With pip (venv activated)
python -m kokoro_tts --help

Download Model Files

After installation, download the required model files to your working directory:

# Download voice data (bin format is preferred)
wget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/voices-v1.0.bin

# Download the model
wget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/kokoro-v1.0.onnx

The script requires voices-v1.0.bin and kokoro-v1.0.onnx to be present in the same directory where you run the kokoro-tts command.

Supported voices:

Category	Voices	Language Code
🇺🇸 👩	af_alloy, af_aoede, af_bella, af_heart, af_jessica, af_kore, af_nicole, af_nova, af_river, af_sarah, af_sky	en-us
🇺🇸 👨	am_adam, am_echo, am_eric, am_fenrir, am_liam, am_michael, am_onyx, am_puck	en-us
🇬🇧	bf_alice, bf_emma, bf_isabella, bf_lily, bm_daniel, bm_fable, bm_george, bm_lewis	en-gb
🇫🇷	ff_siwis	fr-fr
🇮🇹	if_sara, im_nicola	it
🇯🇵	jf_alpha, jf_gongitsune, jf_nezumi, jf_tebukuro, jm_kumo	ja
🇨🇳	zf_xiaobei, zf_xiaoni, zf_xiaoxiao, zf_xiaoyi, zm_yunjian, zm_yunxi, zm_yunxia, zm_yunyang	cmn

Usage

Basic Usage

kokoro-tts <input_text_file> [<output_audio_file>] [options]

Note

If you installed via Method 1 (PyPI) or Method 2 (git install), use kokoro-tts directly
If you installed via Method 3 (local install), use uv run kokoro-tts or activate your virtual environment first
If you're using Method 4 (no install), use uv run -m kokoro_tts or python -m kokoro_tts with activated venv

Commands

-h, --help: Show help message
--help-languages: List supported languages
--help-voices: List available voices
--merge-chunks: Merge existing chunks into chapter files

Options

--stream: Stream audio instead of saving to file
--speed <float>: Set speech speed (default: 1.0)
--lang <str>: Set language (default: en-us)
--voice <str>: Set voice or blend voices (default: interactive selection)
- Single voice: Use voice name (e.g., "af_sarah")
- Blended voices: Use "voice1:weight,voice2:weight" format
--split-output <dir>: Save each chunk as separate file in directory
--format <str>: Audio format: wav or mp3 (default: wav)
--debug: Show detailed debug information during processing

Input Formats

.txt: Text file input
.epub: EPUB book input (will process chapters)
.pdf: PDF document input (extracts chapters from TOC or content)
- or /dev/stdin (Linux/macOS) or CONIN$ (Windows): Standard input (stdin)

Examples

# Basic usage with output file
kokoro-tts input.txt output.wav --speed 1.2 --lang en-us --voice af_sarah

# Read from standard input (stdin)
echo "Hello World" | kokoro-tts - --stream
cat input.txt | kokoro-tts - output.wav

# Cross-platform stdin support:
# Linux/macOS: echo "text" | kokoro-tts - --stream
# Windows: echo "text" | kokoro-tts - --stream
# All platforms also support: kokoro-tts /dev/stdin --stream (Linux/macOS) or kokoro-tts CONIN$ --stream (Windows)

# Use voice blending (60-40 mix)
kokoro-tts input.txt output.wav --voice "af_sarah:60,am_adam:40"

# Use equal voice blend (50-50)
kokoro-tts input.txt --stream --voice "am_adam,af_sarah"

# Process EPUB and split into chunks
kokoro-tts input.epub --split-output ./chunks/ --format mp3

# Stream audio directly
kokoro-tts input.txt --stream --speed 0.8

# Merge existing chunks
kokoro-tts --merge-chunks --split-output ./chunks/ --format wav

# Process EPUB with detailed debug output
kokoro-tts input.epub --split-output ./chunks/ --debug

# Process PDF and split into chapters
kokoro-tts input.pdf --split-output ./chunks/ --format mp3

# List available voices
kokoro-tts --help-voices

# List supported languages
kokoro-tts --help-languages

Tip

If you're using Method 3, replace kokoro-tts with uv run kokoro-tts in the examples above. If you're using Method 4, replace kokoro-tts with uv run -m kokoro_tts or python -m kokoro_tts in the examples above.

Features in Detail

EPUB Processing

Automatically extracts chapters from EPUB files
Preserves chapter titles and structure
Creates organized output for each chapter
Detailed debug output available for troubleshooting

Audio Processing

Chunks long text into manageable segments
Supports streaming for immediate playback
Voice blending with customizable mix ratios
Progress indicators for long processes
Handles interruptions gracefully

Output Options

Single file output
Split output with chapter organization
Chunk merging capability
Multiple audio format support

Debug Mode

Shows detailed information about file processing
Displays NCX parsing details for EPUB files
Lists all found chapters and their metadata
Helps troubleshoot processing issues

Input Options

Text file input (.txt)
EPUB book input (.epub)
Standard input (stdin)
Supports piping from other programs

Contributing

This is a personal project. But if you want to contribute, please feel free to submit a Pull Request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Kokoro-ONNX

kokoro-tts
Release 2.3.0

Release 2.3.0

2.3.0

2.2.1

2.1.1

Documentation

Kokoro TTS

Features

Demo

TODO

Prerequisites

Installation

Method 1: Install from PyPI (Recommended)

Method 2: Install from Git

Method 3: Clone and Install Locally

Method 4: Run Without Installation

Download Model Files

Supported voices:

Usage

Basic Usage

Commands

Options

Input Formats

Examples

Features in Detail

EPUB Processing

Audio Processing

Output Options

Debug Mode

Input Options

Contributing

License

Acknowledgments

Stats

Releases

Contributors

kokoro-tts Release 2.3.0

Release 2.3.0 Toggle Dropdown 2.3.0 2.2.1 2.1.1

Documentation

Kokoro TTS

Features

Demo

TODO

Prerequisites

Installation

Method 1: Install from PyPI (Recommended)

Method 2: Install from Git

Method 3: Clone and Install Locally

Method 4: Run Without Installation

Download Model Files

Supported voices:

Usage

Basic Usage

Commands

Options

Input Formats

Examples

Features in Detail

EPUB Processing

Audio Processing

Output Options

Debug Mode

Input Options

Contributing

License

Acknowledgments

Stats

Releases

Contributors

kokoro-tts
Release 2.3.0

Release 2.3.0

2.3.0

2.2.1

2.1.1