EdiTTS: Score-based Editing for Controllable Text-to-Speech
This repository contains the code for EdiTTS, a user-editable text-to-speech system.
Create a Python virtual environment (conda) and install package requirements as specified in
For more information, you may refer to the official repository of Grad-TTS.
Mark the part to be edited by enclosing it in | characters.
In | the face of impediments confessedly discouraging |
We provide some example sentences in resources/filelists/edit_pitch_example.txt
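The marking convention above can be illustrated with a small Python sketch. It is not the parser used by edit_pitch.py; the helper name `split_marked` is hypothetical, and the sketch assumes each line holds exactly one marked region delimited by two | characters.

```python
from typing import Tuple


def split_marked(sentence: str) -> Tuple[str, str, str]:
    """Split 'before | edited | after' into its three parts.

    Hypothetical helper for illustration only; assumes exactly two '|'
    markers per sentence.
    """
    parts = sentence.split("|")
    if len(parts) != 3:
        raise ValueError(
            f"expected exactly two '|' markers, found {len(parts) - 1}"
        )
    before, edited, after = (part.strip() for part in parts)
    return before, edited, after


before, edited, after = split_marked(
    "In | the face of impediments confessedly discouraging |"
)
print(repr(before), repr(edited), repr(after))
```

A check like this can be run over a filelist before synthesis to catch lines with a missing or extra marker.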
To synthesize pitch-edited speech, run
CUDA_VISIBLE_DEVICES=0 python edit_pitch.py -f resources/filelists/edit_pitch_example.txt -c checkpts/grad-tts-old.pt -t 1000 -s out/pitch/wavs
Prepare two sentences. Concatenate them with # and, in each sentence, enclose the part to be replaced in | characters.
Three others subsequently | identified | Oswald from a photograph. #Three others subsequently | recognized | Oswald from a photograph.
We provide some example sentences in resources/filelists/edit_content_example.txt
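As a rough illustration of the two-sentence format, the following Python sketch splits a line at # and extracts the marked span from each sentence. The helper names `split_pair` and `marked_span` are hypothetical and not part of this repository. The sketch makes no assumption about which of the two sentences supplies the replacement; that is decided by edit_content.py.

```python
from typing import Tuple


def split_pair(line: str) -> Tuple[str, str]:
    """Split 'first # second' into two sentences (hypothetical helper).

    Assumes one '#' per line and exactly two '|' markers per sentence.
    """
    parts = line.split("#")
    if len(parts) != 2:
        raise ValueError("expected exactly one '#' separator")
    first, second = (part.strip() for part in parts)
    for sentence in (first, second):
        if sentence.count("|") != 2:
            raise ValueError(f"expected two '|' markers in: {sentence!r}")
    return first, second


def marked_span(sentence: str) -> str:
    """Return the text between the two '|' markers."""
    return sentence.split("|")[1].strip()


line = (
    "Three others subsequently | identified | Oswald from a photograph. "
    "#Three others subsequently | recognized | Oswald from a photograph."
)
first, second = split_pair(line)
print(marked_span(first), "/", marked_span(second))
```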
To synthesize content-replaced speech, run
CUDA_VISIBLE_DEVICES=0 python edit_content.py -f resources/filelists/edit_content_example.txt -c checkpts/grad-tts-old.pt -t 1000 -s out/content/wavs
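For scripted runs, the two commands shown in this README can be assembled from Python. This is only a convenience sketch: `build_command` and `run_on_gpu` are hypothetical names, the flags -f, -c, -t and -s are copied from the commands above without assuming anything about their meaning, and the interpreter is taken from `sys.executable` rather than a bare `python`.

```python
import os
import subprocess
import sys
from typing import List


def build_command(script: str, filelist: str, checkpoint: str,
                  out_dir: str, t: int = 1000) -> List[str]:
    """Assemble the argument list using the flags shown in this README."""
    return [sys.executable, script,
            "-f", filelist, "-c", checkpoint, "-t", str(t), "-s", out_dir]


def run_on_gpu(cmd: List[str], gpu: str = "0") -> None:
    """Run cmd with CUDA_VISIBLE_DEVICES set, raising if it exits non-zero."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    subprocess.run(cmd, env=env, check=True)


pitch_cmd = build_command(
    "edit_pitch.py",
    "resources/filelists/edit_pitch_example.txt",
    "checkpts/grad-tts-old.pt",
    "out/pitch/wavs",
)
content_cmd = build_command(
    "edit_content.py",
    "resources/filelists/edit_content_example.txt",
    "checkpts/grad-tts-old.pt",
    "out/content/wavs",
)
print(" ".join(pitch_cmd[1:]))
print(" ".join(content_cmd[1:]))
```

Calling `run_on_gpu(pitch_cmd)` from the repository root would then be equivalent to the pitch-editing shell command above, provided the checkpoint and filelist exist at those paths.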
Audio samples generated by EdiTTS can be found here
This repository uses the following checkpoints.