How can I get high-accuracy results from my fine-tuned model?

  1. virtualenv -p python3 $HOME/tmp/deepspeech-alpha08/

  2. source /home/dell/tmp/deepspeech-alpha08/bin/activate

  3. git clone https://github.com/mozilla/DeepSpeech.git --branch v0.2.0-alpha.8

  4. pip3 install -r requirements.txt

  5. pip3 show tensorflow

     Name: tensorflow
     Version: 1.6.0

  6. pip3 install deepspeech==0.2.0a8

  7. python3 util/taskcluster.py --branch "v0.2.0-alpha.8" --target new_native_client/

  8. wget -O - https://github.com/mozilla/DeepSpeech/releases/download/v0.1.1/deepspeech-0.1.1-models.tar.gz | tar xvfz -

  9. mkdir fine_tuning_checkpoints

  10. python3 DeepSpeech.py --n_hidden 2048 --initialize_from_frozen_model models/output_graph.pb --checkpoint_dir fine_tuning_checkpoints --epoch 3 --train_files audio_folder/audio_file_train.csv --dev_files audio_folder/audio_file_dev.csv --test_files audio_folder/audio_file_test.csv --learning_rate 0.0001 --decoder_library_path new_native_client/libctc_decoder_with_kenlm.so --alphabet_config_path data/alphabet.txt --lm_binary_path data/lm/lm.binary --lm_trie_path data/lm/trie

This failed with:

File "DeepSpeech.py", line 1604, in train
session.run(init_from_frozen_model_op, feed_dict=feed_dict)
UnboundLocalError: local variable 'feed_dict' referenced before assignment

Solution: applied commit 7a613c5
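For context, the CSVs passed via --train_files/--dev_files/--test_files use DeepSpeech's three-column format: wav_filename, wav_filesize, transcript. A minimal sketch (not DeepSpeech's own tooling) for generating such a file from (wav path, transcript) pairs — note the transcript must only contain characters present in alphabet.txt, hence the lowercasing here:

```python
import csv
import os

def write_deepspeech_csv(rows, csv_path):
    """Write (wav_path, transcript) pairs in DeepSpeech's training CSV format."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav_path, transcript in rows:
            # wav_filesize is the size of the audio file in bytes
            writer.writerow([wav_path, os.path.getsize(wav_path), transcript.lower()])
```

The same helper can produce the train, dev, and test CSVs from three disjoint subsets of your recordings.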

export model:

python3 DeepSpeech.py --n_hidden 2048 --initialize_from_frozen_model models/output_graph.pb --checkpoint_dir fine_tuning_checkpoints --epoch 3 --train_files audio_folder/audio_file_train.csv --dev_files audio_folder/audio_file_dev.csv --test_files audio_folder/audio_file_test.csv --learning_rate 0.0001 --decoder_library_path new_native_client/libctc_decoder_with_kenlm.so --alphabet_config_path data/alphabet.txt --lm_binary_path data/lm/lm.binary --lm_trie_path data/lm/trie --export_dir funetune_export/

RUN model:

deepspeech --model funetune_export/output_graph.pb --alphabet data/alphabet.txt --lm data/lm/lm.binary --trie data/lm/trie --audio audio_folder/audio/1.wav

deepspeech --model funetune_export/output_graph.pb --alphabet data/alphabet.txt --lm data/lm/lm.binary --trie data/lm/trie --audio /home/dell/Music/testing6.wav

Loading model from file funetune_export/output_graph.pb
TensorFlow: v1.6.0-16-gc346f2c
DeepSpeech: v0.2.0-alpha.8-0-gcd47560
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2018-09-12 13:39:42.882493: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.198s.
Loading language model from files data/lm/lm.binary data/lm/trie
Loaded language model in 0.766s.
Running inference.
she can it gods
Inference took 3.134s for 1.385s audio file.

Testing Inference Result: she can it gods
Original Content: credit awards
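One way to quantify how far off a result like this is: word error rate (WER), the word-level edit distance between the reference transcript and the hypothesis, divided by the number of reference words. A quick sketch (not DeepSpeech's own evaluation code):

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("credit awards", "she can it gods"))  # 2.0 — every word wrong
```

Tracking WER over your test CSV as you tune the learning rate and epoch count gives a more reliable signal than eyeballing single transcripts.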

I am fine-tuning the pretrained model binaries with my own audio. The output is mostly common words from the pretrained model; it does not pick up the content specific to my audio.

What is happening here? This tuning process is really difficult to understand.
How can I get good results from my fine-tuned model?

How can it learn to predict the technical terms and words in my audio?

When I first ran the pretrained model, it did not predict my technical terms; that is the only reason I retrained from the frozen model.
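One likely factor: the decoder's language model (the data/lm/lm.binary and data/lm/trie shipped with the release) only knows the vocabulary it was trained on, so it steers decoding toward common English words and away from domain-specific technical terms. Building a custom KenLM model over your own transcripts is one remedy. As a first step, a sketch of collecting the transcript column of a DeepSpeech-format CSV into a plain-text corpus for KenLM (file paths here are hypothetical):

```python
import csv

def csv_to_corpus(csv_path, corpus_path):
    """Dump the transcript column of a DeepSpeech CSV, one sentence per line,
    ready to feed to KenLM's lmplz for training a custom language model."""
    with open(csv_path, newline="") as src, open(corpus_path, "w") as dst:
        for row in csv.DictReader(src):
            dst.write(row["transcript"].strip() + "\n")

# csv_to_corpus("audio_folder/audio_file_train.csv", "corpus.txt")
# Then, outside Python (KenLM tools):
#   lmplz -o 3 < corpus.txt > lm.arpa
#   build_binary lm.arpa lm.binary
```

For a very small corpus, lmplz may need its --discount_fallback option; the trie must then be regenerated with DeepSpeech's generate_trie tool so it matches the new lm.binary and your alphabet.txt.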

Please make an effort to use code formatting properly in your message: in its current state it is hard to read, and some useful information might have been mangled as formatting data.
