A collection of command-line tools for offline speech/intent recognition

voice2json

voice2json is a collection of command-line tools for offline speech/intent recognition on Linux. It is free, open source (MIT), and supports 18 human languages.

From the command line:

$ voice2json -p en transcribe-wav \
      < turn-on-the-light.wav | \
      voice2json -p en recognize-intent | \
      jq .

produces a JSON event like:

{
    "text": "turn on the light",
    "intent": {
        "name": "LightState"
    },
    "slots": {
        "state": "on"
    }
}

when trained with this template:

[LightState]
states = (on | off)
turn (<states>){state} [the] light
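
To make the template notation concrete, the sketch below enumerates the sentences that this one template describes. It is a hand-translated illustration, not voice2json's own template parser: the names STATES, PATTERN and expand are hypothetical, and the reading of the syntax ("(a | b)" as alternatives, "[word]" as optional, "{state}" as a slot tag) is inferred from the example event above.

```python
from itertools import product

# Hand-translated form of the template; this is NOT voice2json's parser.
# Each position lists its alternatives; "" means the word may be omitted.
STATES = ["on", "off"]                                  # states = (on | off)
PATTERN = [["turn"], STATES, ["the", ""], ["light"]]    # turn (<states>){state} [the] light


def expand(pattern):
    """Yield (sentence, slots) for every path through the pattern."""
    for words in product(*pattern):
        sentence = " ".join(w for w in words if w)
        # {state} tags whatever <states> matched, which is position 1 here.
        yield sentence, {"state": words[1]}


for sentence, slots in expand(PATTERN):
    print(sentence, slots)
```

Under these assumptions the template covers four sentences: "turn on the light", "turn on light", "turn off the light" and "turn off light", each carrying the matching value in the state slot.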

voice2json is optimized for:

Sets of voice commands that are described well by a grammar
Commands with uncommon words or pronunciations
Commands or intents that can vary at runtime

It can be used to:

Add voice commands to existing applications or Unix-style workflows
Provide basic voice assistant functionality completely offline on modest hardware
Bootstrap more sophisticated speech/intent recognition systems
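
As a rough sketch of the first use case, an existing application could parse each JSON event and dispatch on the intent name. The handle_event function and the "light:on" style action strings below are hypothetical; only the LightState intent and its state slot come from the example event shown earlier.

```python
import json


def handle_event(line: str) -> str:
    """Parse one JSON event and describe the action it implies."""
    event = json.loads(line)
    intent_name = event.get("intent", {}).get("name", "")
    slots = event.get("slots", {})

    if intent_name == "LightState":
        state = slots.get("state")
        if state in ("on", "off"):
            # Hypothetical action string; a real application would call its own code here.
            return f"light:{state}"

    # Unrecognized speech, or an intent this sketch does not know about.
    return "ignored"


example = (
    '{"text": "turn on the light", '
    '"intent": {"name": "LightState"}, '
    '"slots": {"state": "on"}}'
)
print(handle_event(example))
```

Returning a plain string keeps the sketch easy to test; the same branch is where an application would switch the light.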

Supported speech-to-text systems include:

CMU's pocketsphinx
Dan Povey's Kaldi
Mozilla's DeepSpeech 0.9
Kyoto University's Julius

Supported Languages

Catalan (ca)
- ca-es_pocketsphinx-cmu
Czech (cs)
- cs-cz_kaldi-rhasspy
German (de)
- de_deepspeech-aashishag
- de_deepspeech-jaco
- de_kaldi-zamia (default)
- de_pocketsphinx-cmu
Greek (el)
- el-gr_pocketsphinx-cmu
English (en)
Spanish (es)
- es_deepspeech-jaco
- es_kaldi-rhasspy (default)
- es-mexican_pocketsphinx-cmu
- es_pocketsphinx-cmu
French (fr)
- fr_deepspeech-jaco
- fr_kaldi-guyot (default)
- fr_kaldi-rhasspy
- fr_pocketsphinx-cmu
Hindi (hi)
- hi_pocketsphinx-cmu
Italian (it)
- it_deepspeech-jaco
- it_deepspeech-mozillaitalia (default)
- it_kaldi-rhasspy
- it_pocketsphinx-cmu
Korean (ko)
- ko-kr_kaldi-montreal
Kazakh (kz)
- kz_pocketsphinx-cmu
Dutch (nl)
- nl_kaldi-cgn (default)
- nl_kaldi-rhasspy
- nl_pocketsphinx-cmu
Polish (pl)
- pl_deepspeech-jaco (default)
- pl_julius-github
Portuguese (pt)
- pt-br_pocketsphinx-cmu
Russian (ru)
- ru_kaldi-rhasspy (default)
- ru_pocketsphinx-cmu
Swedish (sv)
- sv_kaldi-montreal
- sv_kaldi-rhasspy (default)
Vietnamese (vi)
- vi_kaldi-montreal
Mandarin (zh)
- zh-cn_pocketsphinx-cmu

Unique Features

voice2json is more than just a wrapper around open source speech-to-text systems!

Training produces both a speech recognizer and an intent recognizer. By describing your voice commands with voice2json's templating language, you get intent recognition at no extra cost, not just transcriptions.
Re-training is fast enough to be done at runtime (usually < 5s), even with millions of possible voice commands. This means you can change referenced slot values or add/remove intents on the fly.
All of the available commands are designed to work well in Unix pipelines, typically consuming/emitting plaintext or newline-delimited JSON. Audio input/output is file-based, so you can receive audio from any source.
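
Because events travel as newline-delimited JSON, a custom pipeline stage can be a small line-by-line filter. The sketch below keeps only events with a wanted intent name; filter_intents is a hypothetical helper, and the event shape is assumed to match the example near the top of this page.

```python
import io
import json


def filter_intents(stream, wanted):
    """Yield one compact JSON line per event whose intent name is in `wanted`."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        if event.get("intent", {}).get("name") in wanted:
            yield json.dumps(event)


# Stand-in for piped input; in a real pipeline this would be sys.stdin.
sample = io.StringIO(
    '{"text": "turn on the light", "intent": {"name": "LightState"}, "slots": {"state": "on"}}\n'
    '{"text": "", "intent": {"name": ""}, "slots": {}}\n'
)
for out_line in filter_intents(sample, {"LightState"}):
    print(out_line)
```

Writing one JSON object per output line lets the filter sit between other commands in a pipeline without changing the format they expect.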

Commands

download-profile - Download missing files for a profile
train-profile - Generate speech/intent artifacts
transcribe-wav - Transcribe WAV file to text
- Add --open for unrestricted speech to text
transcribe-stream - Transcribe live audio stream to text
- Add --open for unrestricted speech to text
recognize-intent - Recognize intent from JSON or text
wait-wake - Listen to live audio stream for wake word
record-command - Record voice command from live audio stream
pronounce-word - Look up or guess how a word is pronounced
generate-examples - Generate random intents
record-examples - Generate and record speech examples
test-examples - Test recorded speech examples
show-documentation - Run HTTP server locally with documentation
print-profile - Print profile settings
print-downloads - Print profile file download information
print-files - Print user profile files for backup
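
The commands can also be chained from a program instead of a shell. The following sketch mirrors the transcribe-wav and recognize-intent pipeline from the top of this page using Python's subprocess module. It assumes voice2json is installed on the PATH and that the "en" profile has been downloaded and trained; voice2json_argv and recognize_wav are hypothetical helper names, not part of voice2json.

```python
import json
import subprocess


def voice2json_argv(profile, command, *args):
    """Build the argument list for one voice2json sub-command."""
    return ["voice2json", "-p", profile, command, *args]


def recognize_wav(wav_path, profile="en"):
    """Run transcribe-wav, feed its output to recognize-intent, return the event."""
    with open(wav_path, "rb") as wav_file:
        transcription = subprocess.run(
            voice2json_argv(profile, "transcribe-wav"),
            stdin=wav_file,
            capture_output=True,
            check=True,
        )
    recognition = subprocess.run(
        voice2json_argv(profile, "recognize-intent"),
        input=transcription.stdout,
        capture_output=True,
        check=True,
    )
    # Assumes recognize-intent printed a single JSON object.
    return json.loads(recognition.stdout)


# Building the command line needs no installation, so it is safe to show here.
print(voice2json_argv("en", "transcribe-wav", "--open"))
```

Calling recognize_wav("turn-on-the-light.wav") would only work on a machine where the assumptions above hold.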

GitHub

https://github.com/synesthesiam/voice2json

