PyTorch Squeezeformer: An Efficient Transformer for Automatic Speech Recognition 09 July 2022
Speech LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT 01 April 2022
Speech BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis 31 March 2022
Speech HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement 31 March 2022
Speech STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation 18 March 2022
Voice Python scripts for a speech processing pipeline with Voice Activity Detection (VAD) 19 February 2022
Telegram Yet another Telegram voice recognition bot, but using Vosk and supporting 20+ languages 15 February 2022
Translation SHAS: Approaching optimal Segmentation for End-to-End Speech Translation 14 February 2022
Converter Convert PDF to AudioBook and Audio Speech to PDF 14 February 2022
Speech recognition How to make use of the Speech Recognition and pyttsx3 libraries of Python 14 February 2022
PyTorch CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition 13 February 2022
Speech Speech Emotion Recognition with Fusion of Acoustic- and Linguistic-Feature-Based Decisions 11 February 2022
Text-to-Speech Creating an Audiobook (mp3 file) from an Ebook (epub) using BeautifulSoup and Google Text to Speech 30 January 2022
Discord Simple Text-To-Speech Bot For Discord made with Python 24 January 2022
Real Time A real-time speech emotion recognition application using Scikit-learn and gradio 21 January 2022
Wrapper A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk 21 January 2022
Real Time TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain 20 January 2022
Bot A bot that interacts with you over voice and sends mail. Uses speech_recognition, pyttsx3 and smtplib 18 January 2022
PyTorch PyTorch implementation of Lip to Speech Synthesis with Visual Context Attentional GAN (NeurIPS2021) 12 January 2022
Self-Supervised A self-supervised learning framework for audio-visual speech 09 January 2022
Speech Speech Recognition Database Management with Python 09 January 2022
API API for SpeechAnalytics integration with FreePBX/Asterisk 08 January 2022
Speech ConferencingSpeech2022; Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge 08 January 2022
Speech API for Speech Analytics integration with FreePBX/Asterisk 08 January 2022
Speech Automatic speech recognition (ASR), Chinese speech recognition 04 January 2022
Transformer Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning 02 January 2022
PyTorch A Pytorch implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recogni 01 January 2022
Tensorflow Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition 21 December 2021
Text-to-Speech A really simple text-to-speech app made with Python and Tkinter 21 December 2021
Voice Create light scenes, voice control, IFTTT, fuzzywuzzy speech correction and much more with Tuya light bulbs 20 December 2021
Text-to-Speech DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022 19 December 2021
BERT PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing 29 November 2021
Generator Combine Tacotron2 and Hifi GAN to generate speech from text 28 November 2021
Speech Efficient Speech Processing Toolkit for Automatic Speaker Recognition 24 November 2021
Speech A CSRankings-like index for speech researchers 21 November 2021
Speech Self-Adaptive noise distribution network for Speech Enhancement with heterogeneous data of Cross-Silo Federated learning 17 November 2021
Speech A simple Speech Emotion Recognition (SER) API created using Flask and running in a Docker container 10 November 2021
Speech A Python library providing common speech features for ASR, including MFCCs and filterbank energies 04 November 2021
Speech Using speech recognition to convert a .wav file to a .txt file 30 October 2021
Speech LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech 24 October 2021
GUI Minimal GUI for accessing the Watson Text to Speech service 23 October 2021
PyTorch Recognition of 38 speech commands in Russian. Based on Yandex Cup 2021 ML Challenge: ASR Speech_38_ru_commands: the program can recognize 38 Russian keywords from a list, spoken into a microphone. 23 October 2021
Speech To Text Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge 20 October 2021
Voice A cappella: Audio-visual Singing Voice Separation, from BMVC21 20 October 2021
Speech CPC-big and k-means clustering for zero-resource speech processing 18 October 2021