DeepSpeech model training

Hi @lissyx,

I am working on training a new DeepSpeech model for the German language.

I have downloaded the data sets from the official site and followed the steps mentioned at https://www.npmjs.com/package/deepspeech to convert the mp3 files into a wav format that is compatible with DeepSpeech training.
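
For reference, the conversion I am doing boils down to something like this (using sox as one option; the file names here are just placeholders):

sox input.mp3 -r 16000 -c 1 -b 16 output.wav    # 16 kHz, 16-bit, mono, as DeepSpeech expects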

I am executing the following command to start training:

python3 DeepSpeech.py --checkpoint_dir /root/.local/share/deepspeech/checkpoints/test_training --epochs 3 --nouse_seq_length --export_tflite --export_dir ./test/export/destination --train_files ./test/train.csv --dev_files ./test/dev.csv --test_files ./test/test.csv --lm_trie_path data/lm-test/trie --lm_binary_path data/lm-test/lm.binary

What I noticed is that the lm.binary file is not getting updated during training. I think that is why I am getting the error below:

Instructions for updating:
Use tf.cast instead.
Epoch 0 | Training | Elapsed Time: 0:03:38 | Steps: 3 | Loss: 702.842646
Epoch 0 | Validation | Elapsed Time: 0:00:10 | Steps: 3 | Loss: 538.984263 | Dataset: ./test/dev.csv
Epoch 1 | Training | Elapsed Time: 0:03:59 | Steps: 3 | Loss: 385.228027
Epoch 1 | Validation | Elapsed Time: 0:00:11 | Steps: 3 | Loss: 232.611954 | Dataset: ./test/dev.csv
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
Epoch 2 | Training | Elapsed Time: 0:04:13 | Steps: 3 | Loss: 255.834290
Epoch 2 | Validation | Elapsed Time: 0:00:13 | Steps: 3 | Loss: 215.997345 | Dataset: ./test/dev.csv
Loading the LM will be faster if you build a binary file.
Reading data/lm-test/lm.binary
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
terminate called after throwing an instance of 'util::EndOfFileException'
what(): End of file Byte: 0
Aborted (core dumped)

Please help me understand what I am doing wrong. Thanks!!

You have not properly set up git-lfs. Please read the documentation: https://github.com/mozilla/DeepSpeech/blob/master/README.md#training-your-own-model
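
A quick way to check whether git-lfs actually fetched the real files (this is the repo's data/lm/lm.binary; the same applies to whatever you copied into data/lm-test):

ls -lh data/lm/lm.binary
head -c 200 data/lm/lm.binary

If git-lfs did not run, the file is only a tiny pointer stub of ~130 bytes starting with "version https://git-lfs.github.com/spec/v1" instead of binary LM data.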

Hi @lissyx,

I am not following you properly, sorry about that.

I just executed the command below, hoping to install git-lfs:

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash

Below is the output:

Detected operating system as Ubuntu/xenial.
Checking for curl…
Detected curl…
Checking for gpg…
Detected gpg…
Running apt-get update… done.
Installing apt-transport-https… done.
Installing /etc/apt/sources.list.d/github_git-lfs.list…done.
Importing packagecloud gpg key… done.
Running apt-get update… done.

The repository is setup! You can now install packages.

Is this right? Or do I need to do something else?

Please help!!

Make sure you re-clone after that. Or you have to manually git lfs fetch or something to get the files.
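
If you do not want to re-clone, something along these lines from inside your checkout should work (note the script above only registered the apt repository, it did not install the package; exact behaviour can vary with the git-lfs version):

sudo apt-get install git-lfs
git lfs install
cd DeepSpeech    # adjust to wherever you cloned the repo
git lfs fetch
git lfs checkout    # replaces pointer stubs with the real files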

Hi @lissyx,

Thanks for your quick responses. I just completed the flow with one training file without any errors; I hope I will not get any when I train with the bulk data set :slight_smile:

Thanks a lot again!!

Hi @lissyx,

I am trying to use the newly trained model from NodeJS and I am getting the errors below:

Error: Trie file version mismatch (4 instead of expected 3). Update your trie file.
Error running session: Not found: PruneForTargets: Some target nodes not found: initialize_state
Segmentation fault (core dumped)

Here is my configuration:

deepspeech --version
TensorFlow: v1.13.1-10-g3e0cc53
DeepSpeech: v0.5.1-0-g4b29b78

Is it related to https://github.com/mozilla/DeepSpeech/issues/2206?

If yes, is it mandatory to update to DeepSpeech version v0.6.0-alpha.1?

You need to use things in sync: either all v0.5 or all v0.6.
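
To see which versions you actually have installed, something like this (package names as published on PyPI):

pip3 show deepspeech ds-ctcdecoder
deepspeech --version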

Hi @lissyx,

I am performing the instructions below, using DeepSpeech version v0.5.1.

1. Set up git-lfs.
2. Clone the DeepSpeech library: git clone --branch v0.5.1 https://github.com/mozilla/DeepSpeech.git DeepSpeech-lib
3. Install the dependencies:
   pip3 install -r requirements.txt
4. Install ds_ctcdecoder:
   pip3 install $(python3 util/taskcluster.py --decoder)
   (this installed ds-ctcdecoder==0.5.1)
5. Download the data sets from the official site.
6. Convert the data to a format that the DeepSpeech engine can understand:
   bin/import_cv2.py …/data-sets/german/clips
7. Train using the command below:
   python3 DeepSpeech.py --epochs 10 --checkpoint_dir /root/.local/share/deepspeech/checkpoints --nouse_seq_length --export_dir ./test/export/destination --train_files ./test/train.csv --dev_files ./test/dev.csv --test_files ./test/test.csv
   The above command writes output_graph.pb to the mentioned export dir, i.e. ./test/export/destination.
8. Test with the newly exported model:
   python3 ./native_client/python/client.py --model ./test/export/destination/output_graph.pb --alphabet ./data/alphabet.txt --lm ./data/lm/lm.binary --trie ./data/lm/trie --audio …/Data-sets/german/clips/common_voice_de_17300571.wav

I am getting the error below after step 8, i.e. when trying to use the newly trained model:

I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA

and then:

I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA

Is this related to my system configuration? Please provide your inputs.

Thanks!!

This is not an error and is not related to your models. Those are warnings, you can ignore them.

Hi @lissyx,

I made a copy-paste error, sorry for that. Below is the actual error message I am getting:

Error running session: Invalid argument: Tensor input_lengths:0, specified in either feed_devices or fetch_devices was not found in the Graph

While looking around on the net, I found the explanation below on one site:

Although the model has a Session and Graph, in some tensorflow methods, the default Session and Graph are used. To fix this I had to explicitly say that I wanted to use both my Session and my Graph as the default:

But I am not following this properly. Please let me know your inputs.

Thanks!!

That feels strange, but you are running client.py directly and you don't share the start of the output, so we cannot check which libdeepspeech.so is actually being used.

Please test properly, as documented: set up a virtualenv, install with pip install deepspeech==0.5.1, and run inference with the deepspeech binary rather than calling client.py directly.
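
Concretely, something like this (the venv path and audio file are placeholders; the model paths are taken from your steps above):

virtualenv -p python3 $HOME/tmp/deepspeech-venv
source $HOME/tmp/deepspeech-venv/bin/activate
pip install deepspeech==0.5.1
deepspeech --model ./test/export/destination/output_graph.pb --alphabet ./data/alphabet.txt --lm ./data/lm/lm.binary --trie ./data/lm/trie --audio test.wav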

Hi @lissyx,

I tried running it with the deepspeech inference client as well and got the same error.

It works fine with the pre-trained model. I will try creating a new virtual environment.

Thanks!!

Hi @lissyx,

I tried creating a new virtual environment and am still facing the same error.

Could it be because I have trained the model with a very small data set (2-3 files of 10 sec)? Currently I am just trying to do a complete POC, which is why I have not trained with a large data set. Please let me know your inputs.

Thanks!!

No, that’s something else.

Like …

And yes @laxmikant04.yadav, you shared that earlier, but since you kept sharing without proper code formatting, your python command line was unreadable to me and thus I missed that information.

Thanks @lissyx ,

I went through your reply on the post:
[FIXED] Error with master/alpha8 (unknown op: UnwrapDatasetVariant & WrapDatasetVariant)

So currently I am training without the --nouse_seq_length flag.

Those were simple steps I was noting down for myself in a text file. I will keep proper formatting in mind in my next comments.

Thanks a lot!!

You don’t need to retrain, just re-export without that flag.
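If it helps, re-exporting from the existing checkpoints should look something like the following; my understanding is that when no --train_files/--test_files are passed, DeepSpeech.py skips straight to the export step (double-check against your version):

python3 DeepSpeech.py --checkpoint_dir /root/.local/share/deepspeech/checkpoints --export_dir ./test/export/destination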

Thanks @lissyx .

It worked fine after exporting without the --nouse_seq_length flag.

Thanks!!!

Hi @lissyx,

I am working on speech recognition with a microphone, and I started with an example from the DeepSpeech GitHub repo.

I could see it is trying to recognise the speech, but the accuracy is not good for me.
I am working on Ubuntu 16.04 on a desktop.

Currently it is only able to recognise one word, and only when spoken very loudly and clearly; it fails otherwise.

Could you please suggest what else I should try, or where I can look, to increase its accuracy?

Our expectation is that it should be able to recognise simple sentences like "Welcome to speech recognition". This works perfectly when I try with clean audio files.

Thanks!!!

Looks like you've got some hint yourself. Though, you don't document whether those clean audio files were produced by you or come from some other origin.

Also,

It looks like we have not updated that to 0.5.1; maybe it is worth testing whether it improves, since that model was trained to be more robust to some noise.

Make sure your system is actually able to capture mono 16 kHz audio; resampling might get in the way.
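
One way to check what your setup actually captures (arecord ships with ALSA on Ubuntu; soxi comes with the sox package; device defaults may differ on your machine):

arecord -f S16_LE -r 16000 -c 1 -d 5 test.wav    # record 5 seconds at 16 kHz mono
soxi test.wav    # verify the sample rate and channel count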

It could also just be a side effect of your mic capturing poor-quality sound. Beyond improving the model, there is hardly anything we can easily improve.