Script to generate VAD dataset used in Asteroid recipe

Oct 04, 2021

About the dataset

LibriVAD is an open-source dataset for voice activity detection in noisy environments. It is derived from LibriSpeech signals (clean subset) and DNS Challenge noises.
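The source does not describe how the LibriVAD scripts combine speech, noise, and alignments, so the following is only a minimal sketch of one plausible approach, not the project's actual code. It assumes noise is added to clean speech at a chosen signal-to-noise ratio and that frame-level speech/non-speech labels are read off the speech intervals given by the forced alignments. The function names, the 10 ms frame shift, and the SNR-based scaling are all illustrative assumptions.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it.

    Hypothetical helper: the real LibriVAD scripts may mix differently.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        # Silent noise: nothing to scale, return the speech unchanged.
        return list(speech)
    # Gain that brings the noise to the power required by the target SNR.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def frame_labels(speech_intervals, duration, frame_shift=0.01):
    """Return one 0/1 label per frame.

    A frame is labeled 1 when its centre falls inside a (start, end) speech
    interval in seconds, such as a word span taken from a forced alignment.
    The 10 ms default frame shift is an assumption.
    """
    n_frames = int(round(duration / frame_shift))
    labels = [0] * n_frames
    for i in range(n_frames):
        centre = (i + 0.5) * frame_shift
        if any(start <= centre < end for start, end in speech_intervals):
            labels[i] = 1
    return labels
```

For instance, frame_labels([(0.02, 0.05)], duration=0.1) yields ten labels, with the frames centred between 20 ms and 50 ms marked as speech. The labels come from the clean alignment, so they stay valid whatever noise level is mixed in.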

Generating LibriVAD

You need to download LibriSpeech, the noise from the DNS Challenge (datasets/noise), and the forced alignments.
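Before launching the generation, it can help to confirm that all three inputs are in place. The sketch below is a hypothetical pre-flight check and not part of the repository. Apart from datasets/noise, which is mentioned above, every path in the example is a placeholder to replace with your own locations.

```python
from pathlib import Path


def missing_inputs(required):
    """Return the labels of required directories that do not exist.

    `required` maps a human-readable label to a directory path.
    """
    return [label for label, path in required.items() if not Path(path).is_dir()]


if __name__ == "__main__":
    # Placeholder locations (assumptions); adjust them to your own setup.
    required = {
        "LibriSpeech": "storage_dir/LibriSpeech",
        "DNS Challenge noise": "storage_dir/datasets/noise",
        "forced alignments": "storage_dir/alignments",
    }
    missing = missing_inputs(required)
    if missing:
        print("Missing inputs:", ", ".join(missing))
    else:
        print("All inputs found.")
```

Returning the list of missing labels, rather than stopping at the first failure, reports every absent download in a single pass.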

To generate LibriVAD, clone the repo, edit run.sh so that it points to the correct paths, and then run it:

git clone https://github.com/JorisCos/LibriMix
cd LibriMix 
./run.sh storage_dir

GitHub

https://github.com/asteroid/team-Libri_VAD
