Text-To-Speech (Robo-Shaul)

Welcome to the Robo-Shaul repository! This project enables you to train your own Robo-Shaul or use pre-trained models to convert Hebrew text into speech using the Tacotron 2 TTS framework.

Robo-Shaul was originally developed for a competition, where the winning model was trained for only 5k steps. After the competition, a more advanced model was trained for 90k steps using improved methodologies and a wider range of training data, resulting in significantly better performance.

🚀 Quick Start

Prerequisites

Python 3.10

Installation

Clone the repository:

```
git clone https://github.com/maxmelichov/Text-To-speech.git
cd Text-To-speech
```

Set up a virtual environment:

```
python3.10 -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate.bat  # Windows
```

Install dependencies:
```
pip install -r requirements.txt
```

Clone required submodules and dependencies:

```
git clone https://github.com/maxmelichov/tacotron2.git
git submodule init
git submodule update
git clone https://github.com/maxmelichov/waveglow.git
cp waveglow/glow.py ./
```

📁 Project Structure

The main directories used in this project are:

```
Text-To-speech/
├── data/                  # Place the SASPEECH dataset here
├── checkpoints/           # Stores Tacotron2 model checkpoints (*.pt files)
├── waveglow_weights/      # Stores WaveGlow model checkpoint (*.pt file)
├── tacotron2/             # Tacotron2 source code (cloned as submodule)
├── waveglow/              # WaveGlow source code (cloned as submodule)
├── ...
```

data/: Put your downloaded and preprocessed dataset here.
checkpoints/: Save and load Tacotron2 model weights (e.g., checkpoint_90000.pt).
waveglow_weights/: Place the WaveGlow model checkpoint file (e.g., waveglow_256channels.pt).
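Before training or inference, it can help to confirm that this layout is in place. The following is a minimal sketch, assuming the script is run from the repository root; the helper name `missing_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Directory names taken from the project structure listed above.
EXPECTED_DIRS = ("data", "checkpoints", "waveglow_weights", "tacotron2", "waveglow")


def missing_dirs(repo_root: Path) -> list[str]:
    """Return the expected directories that do not exist under repo_root."""
    return [name for name in EXPECTED_DIRS if not (repo_root / name).is_dir()]


# Assumption: the current working directory is the repository root.
missing = missing_dirs(Path.cwd())
if missing:
    print("Missing directories:", ", ".join(missing))
else:
    print("All expected directories are present.")
```

The check only looks for the directories themselves; it does not verify that the dataset or the checkpoint files inside them are complete.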

📦 Download Pre-trained Models

WaveGlow model: Download
Model with 90K steps: Download
Model with 5K steps: Download

📚 Dataset

Download the SASPEECH dataset from OpenSLR.

🛠️ Usage

Preprocess the data:
```
python data_preprocess.py
```
After running the script, create a .txt filelist in the same format as the examples in the filelists directory, with one audio path and its transcript per line:
```
path/to/audio.wav|Hebrew transcript written in English letters
```
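A filelist in this format can be sanity-checked before training. The sketch below assumes each line holds exactly one audio path and one transcript separated by the first `|`. The names `FilelistEntry` and `parse_filelist_line`, and the sample line, are hypothetical illustrations rather than part of the repository.

```python
from typing import NamedTuple


class FilelistEntry(NamedTuple):
    audio_path: str
    transcript: str


def parse_filelist_line(line: str) -> FilelistEntry:
    """Split one 'path|transcript' line into its two fields."""
    # Split on the first '|' only, so the path is everything before it.
    audio_path, sep, transcript = line.rstrip("\n").partition("|")
    if not sep:
        raise ValueError(f"missing '|' separator: {line!r}")
    audio_path, transcript = audio_path.strip(), transcript.strip()
    if not audio_path.endswith(".wav"):
        raise ValueError(f"expected a .wav path, got: {audio_path!r}")
    if not transcript:
        raise ValueError(f"empty transcript for: {audio_path!r}")
    return FilelistEntry(audio_path, transcript)


# Hypothetical sample line; real entries come from your preprocessed data.
entry = parse_filelist_line("data/example_0001.wav|shalom olam\n")
print(entry.audio_path)
print(entry.transcript)
```

Running the parser over every line of a filelist surfaces malformed rows early, instead of partway through a training run.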
Train the model:
```
python train.py
```
Generate speech (inference):
```
python inference.py
```

💡 Demos & Resources

Live Demo: Project Site
Demo Page: here
Quick Start Notebook: Notebook
Project Podcast: Hayot Kis (חיות כיס) episode
Training & Synthesis Videos: Part 1 | Part 2

📝 Model Details

The system uses the SASPEECH dataset, a collection of unedited recordings from Shaul Amsterdamski for the 'Hayot Kis' podcast.
The TTS system is based on Nvidia's Tacotron 2, customized for Hebrew.

Note: The model expects diacritized Hebrew (עברית מנוקדת). For diacritization, we recommend Nakdimon (GitHub).
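A quick heuristic can flag input that has not been diacritized yet. This is a minimal sketch, assuming that text counts as diacritized when it contains at least one Hebrew letter and at least one Hebrew point (a nonspacing mark in the Unicode range U+05B0 to U+05C7). The function names are hypothetical, and the check does not replace a diacritization tool such as Nakdimon.

```python
import unicodedata


def is_hebrew_letter(ch: str) -> bool:
    # The Hebrew letters alef through tav occupy U+05D0..U+05EA.
    return "\u05d0" <= ch <= "\u05ea"


def is_hebrew_point(ch: str) -> bool:
    # Hebrew points are nonspacing marks (category Mn) within U+05B0..U+05C7;
    # punctuation in that range, such as maqaf (U+05BE), is excluded.
    return "\u05b0" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn"


def looks_diacritized(text: str) -> bool:
    """Heuristic: True if text has Hebrew letters and at least one point."""
    has_letter = any(is_hebrew_letter(ch) for ch in text)
    has_point = any(is_hebrew_point(ch) for ch in text)
    return has_letter and has_point


# The same word with and without points, written as escapes for clarity.
pointed = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
plain = "\u05e9\u05dc\u05d5\u05dd"
print(looks_diacritized(pointed))
print(looks_diacritized(plain))
```

The heuristic passes text that is only partly pointed, so treat a positive result as a sign that diacritization was attempted, not that it is complete.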

👥 Contact

Maxim Melichov: LinkedIn
Tony Hasson: LinkedIn

Feel free to reach out with questions or suggestions!
