Speech

Speech

Maix Speech AI library, including ASR, chat, TTS, etc.

17 October 2021

Speech

Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on

17 October 2021

Speech

A lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the deskt

17 October 2021

Self-Supervised

Self Supervised Representation Learning With Deep Clustering For Acoustic Unit Discovery From Raw Speech

15 October 2021

Speech

AI grand challenge 2020 Repo (Speech Recognition Track)

15 October 2021

Deep Learning

Noise suppression using deep filtering

13 October 2021

Speech

A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering

13 October 2021

Speech

Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition

13 October 2021

Neural Network

Fast-rir: Fast Neural Diffuse Room Impulse Response Generator

13 October 2021

Speech

An easy way to create a Text-to-Speech request to Azure Speech and download the WAV file, written in Python

12 October 2021

Speech

Persian Kaldi profile for Rhasspy built from open speech data

10 October 2021

Speech

Every Google, Azure & IBM text to speech voice for free

10 October 2021

Deep Learning

StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis

09 October 2021

Text-to-Speech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

08 October 2021

Text-to-Speech

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

07 October 2021

Voice

Any-to-any voice conversion using synthetic speaker-specific speech as intermediate features

07 October 2021

PyTorch

Microsoft's text-to-speech system FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

06 October 2021

Speech

FastSpeech2 refactored with Python

06 October 2021

A Telegram bot that generates subtitles from the speech in media files

03 October 2021

Speech

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

01 October 2021

Localization

Sound Source Localization for AI Grand Challenge 2021

30 September 2021

Speech

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations

30 September 2021

Speech

A framework for general speech restoration

29 September 2021

Dataset

A 10,000+ hour dataset for Chinese speech recognition

28 September 2021

Voice

VoiceFixer aims to restore human speech regardless of how severely it is degraded

28 September 2021

Real Time

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation

26 September 2021

Text-to-Speech

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

26 September 2021

Speech

WaveGlow: A Flow-based Generative Network for Speech Synthesis

23 September 2021

Deep Learning

Implementation of DeepSpeech2 for PyTorch using PyTorch Lightning

22 September 2021

Generator

Codename generator using WordNet parts of speech database

19 September 2021

Speech

A port of Coqui STT based on DeepSpeech to PyTorch

18 September 2021

Speech

Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

12 September 2021

Raspberrypi

Control the classic General Instrument SP0256-AL2 speech chip and AY-3-8910 sound generator with Raspberry Pi

Control the classic General Instrument SP0256-AL2 speech chip and AY-3-8910 sound generator with a Raspberry Pi and this Python library

12 September 2021

Voice

A simple project to separate a mixture of two clean voices into two separate voices

11 September 2021

Speech

PIKA: a lightweight speech processing toolkit based on PyTorch and (Py)Kaldi

10 September 2021

Text-to-Speech

EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture

09 September 2021

Real Time

A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement

09 September 2021

Deep Learning

A Deep Learning Loss Function based on Auditory Power Compression for Speech Enhancement

09 September 2021

Generator

A CLI application to generate a subtitle file for any video using Mozilla DeepSpeech

AutoSub is a CLI application that generates a subtitle file (.srt) for any video file using Mozilla DeepSpeech.

08 September 2021

Speech

Add noise from the MUSAN dataset to a clean speech signal at a specified signal-to-noise ratio, in Python

The purpose of this code base is to add noise from the MUSAN dataset to a clean speech signal at a specified signal-to-noise ratio and to generate far-field

08 September 2021

Speech

Free medium-quality text-to-speech software, VOICEVOX speech synthesis engine

07 September 2021

Annotation

A data annotation pipeline to generate high-quality, large-scale speech datasets

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

07 September 2021

Speech

Continuous Speech Separation with Conformer in Python

We examine the use of the Conformer architecture for continuous speech separation. Conformer allows the separation model to efficiently capture both local and global context information, which is helpful for speech separation.

04 September 2021

PyTorch

Deep Gaussian process-based multi-speaker speech synthesis with PyTorch

This repository provides the official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch.

01 September 2021

Voice

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Supports Mandarin and has been tested with multiple datasets: aidatatang_200zh, magicdata, aishell3

01 September 2021

Speech

Use your AirPlay devices as TTS speakers

A Home Assistant integration component that turns your AirPlay devices into TTS speakers.

31 August 2021

Speech

SpeechPy: A Library for Speech Processing and Recognition

29 August 2021

Speech

PyTorch implementation of Tacotron speech synthesis model

27 August 2021

Speech

Open-sourced speech technology by Huawei Noah's Ark Lab

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

26 August 2021

A collection of 113 posts