Adaptive Text to Speech with Untranscribed Data

Aug 07, 2021

AdaSpeech 2

Adaptive Text to Speech with Untranscribed Data [WIP]

Requirements :

All code is written in Python 3.6.2.

Install PyTorch

Before installing PyTorch, check your CUDA version by running the following command:
nvcc --version

pip install torch torchvision

This repo uses PyTorch 1.6.0 because it relies on torch.bucketize, which is not available in earlier versions of PyTorch.
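For readers unfamiliar with torch.bucketize, here is a minimal pure-Python sketch of what it computes with its default right=False setting: each value is mapped to the index of the bin it falls into, given sorted boundaries. The function name, pitch values, and boundaries below are hypothetical and only for illustration; this is a stand-in built on the standard library, not the code used in this repo.

```python
from bisect import bisect_left


def bucketize(values, boundaries):
    """Pure-Python stand-in for torch.bucketize(values, boundaries), right=False.

    For each value v, returns the index i such that
    boundaries[i - 1] < v <= boundaries[i].
    """
    return [bisect_left(boundaries, v) for v in values]


# Hypothetical pitch values (Hz) and sorted bin boundaries, for illustration only.
pitch = [80.0, 150.0, 220.0, 400.0]
boundaries = [100.0, 200.0, 300.0]

bins = bucketize(pitch, boundaries)
print(bins)
```

A value equal to a boundary lands in the lower bin, which matches the right=False behaviour documented for torch.bucketize.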

Installing other requirements :

pip install -r requirements.txt

To use TensorBoard, install tensorboard version 1.14.0 separately, along with a supported TensorFlow version (1.14.0).

For Preprocessing :

The filelists folder contains LJSpeech dataset files processed with MFA (Montreal Forced Aligner), so you don't need to align text with audio (to extract durations) for the LJSpeech dataset.
For other datasets, follow the instructions here. For the remaining preprocessing, run the following command:

python nvidia_preprocessing.py -d path_of_wavs

To find the min and max of F0 and energy:

python compute_statistics.py

Update the following values in hparams.py with the computed min and max of F0 and energy:

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy
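The contents of compute_statistics.py are not shown here, so the following is only a rough sketch of the kind of computation involved: taking the global minimum and maximum over the frames of every utterance. The helper name global_min_max and the sample contours are hypothetical; only the names p_min, p_max, e_min, and e_max come from hparams.py.

```python
def global_min_max(contours):
    """Return (min, max) over all frames of all utterances."""
    flat = [value for contour in contours for value in contour]
    if not flat:
        raise ValueError("no frames to compute statistics from")
    return min(flat), max(flat)


# Hypothetical per-utterance F0 (Hz) and energy contours, for illustration only.
f0_contours = [[110.0, 182.5, 95.2], [240.1, 130.0]]
energy_contours = [[0.02, 0.8], [0.5, 1.3, 0.1]]

p_min, p_max = global_min_max(f0_contours)
e_min, e_max = global_min_max(energy_contours)

print(f"p_min = {p_min}")
print(f"p_max = {p_max}")
print(f"e_min = {e_min}")
print(f"e_max = {e_max}")
```

Whether unvoiced frames should be excluded before taking the minimum depends on how the pitch extractor marks them; this sketch makes no assumption about that and uses every frame.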

Training :

[WIP]

Citations :

@misc{chen2021adaspeech,
      title={AdaSpeech: Adaptive Text to Speech for Custom Voice}, 
      author={Mingjian Chen and Xu Tan and Bohan Li and Yanqing Liu and Tao Qin and Sheng Zhao and Tie-Yan Liu},
      year={2021},
      eprint={2103.00993},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

@misc{yan2021adaspeech,
      title={AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data}, 
      author={Yuzi Yan and Xu Tan and Bohan Li and Tao Qin and Sheng Zhao and Yuan Shen and Tie-Yan Liu},
      year={2021},
      eprint={2104.09715},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}

GitHub

https://github.com/rishikksh20/AdaSpeech2
