06 Apr, 2018

Voice Recognition models in DeepSpeech and Common Voice

Teacher: Alexandre Lissy — Mozilla

DeepSpeech is an open source Speech-To-Text engine that uses a model trained with machine learning techniques, based on Baidu’s Deep Speech research paper.

You will learn how the model works and how it was implemented using TensorFlow. The workshop will cover how we went from a proof-of-concept hack to a model that we are trying to make usable in production, and how we leverage the distributed training system. We’ll explore how the inference-specific model is built, the surrounding code that makes it run on several devices, and the TensorFlow tooling we explored to try to speed things up.

We will also present the Common Voice project, which aims to collect an open dataset for machine learning, and more specifically for voice-targeted machine learning.

You’ll be able to contribute to both projects: we will cover how to train your own model for DeepSpeech, how to use DeepSpeech as a “black box”, how to hack on DeepSpeech, and how to contribute to Common Voice.

Duration: Half day.

Prerequisites: Python, shell, exposure to C++ is welcome.

#machinelearning #deepspeech #voicerecognition #tensorflow