WER shoots up when retraining the pretrained model for an additional epoch on LibriSpeech

The script:

# Work on a timestamped copy so the pristine pretrained checkpoint is preserved.
CHECK=~/scratch/checkpoint-`date +%s`
cp -r deepspeech-0.4.1-checkpoint/ $CHECK
# cd ../DeepSpeech
echo $CHECK
# Loop kept for repeated runs; currently a single iteration.
for i in {1..1}
do
    python3 ../DeepSpeech/DeepSpeech.py --n_hidden 2048 --epoch -1 \
            --train_files libri/librivox-train-clean-100.csv \
            --dev_files libri/librivox-dev-clean.csv \
            --test_files libri/librivox-test-clean.csv \
            --checkpoint_dir $CHECK \
            --train_batch_size 24 \
            --dev_batch_size 24 \
            --test_batch_size 48 \
            --validation_step 1 \
            --checkpoint_step 1 \
            --learning_rate 0.0001 \
            --dropout_rate 0.15 \
            --lm_alpha 0.75 \
            --lm_beta 1.85 \
            --export_dir $CHECK/export \
            --alphabet_config_path ~/asr/models/alphabet.txt \
            --lm_binary_path ~/asr/models/lm.binary \
            --lm_trie_path ~/asr/models/trie \
            --beam_width 1024 | tee training-$i.out
done
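
For context on the --epoch flag: in 0.4.1 a negative value is interpreted relative to the epoch counter stored in the restored checkpoint, i.e. it requests that many additional epochs of training. A minimal sketch of that arithmetic, as I read the flag handling (illustrative, not the actual DeepSpeech.py source):

def target_epoch(flag_epoch, checkpoint_epoch):
    # Negative --epoch values mean "train this many additional epochs"
    # past the restored checkpoint; non-negative values are an absolute
    # epoch target (0 therefore means no further training, just testing).
    if flag_epoch < 0:
        return checkpoint_epoch + abs(flag_epoch)
    return flag_epoch

# The log below reports "Training epoch 378" for --epoch -1, which
# implies the restored 0.4.1 checkpoint was at epoch 377:
print(target_epoch(-1, 377))  # 378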

The relevant output:

/home/bderuiter/scratch/checkpoint-1559468775
100% (595 of 595) |######################| Elapsed Time: 0:04:32 Time:  0:04:32
100% (56 of 56) |########################| Elapsed Time: 0:00:19 Time:  0:00:19
100% (54 of 54) |########################| Elapsed Time: 0:01:55 Time:  0:01:55
100% (54 of 54) |########################| Elapsed Time: 0:05:11 Time:  0:05:11
Preprocessing ['libri/librivox-train-clean-100.csv']
Preprocessing done
Preprocessing ['libri/librivox-dev-clean.csv']
Preprocessing done
I STARTING Optimization
I Training epoch 378...
I Training of Epoch 378 - loss: 150.271789
I Validating epoch 378...
I Validation of Epoch 378 - loss: 108.798860
I FINISHED Optimization - training time: 0:04:52
Preprocessing ['libri/librivox-test-clean.csv']
Preprocessing done
Computing acoustic model predictions...
Decoding predictions...
Test - WER: 0.699878, CER: 48.738426, loss: 148.851822

If --epoch is set to 0, the WER is 0.08, which is about what I expect. Why does the WER shoot up to 0.7 when training for just one more epoch?
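
For reference, WER as reported here is the word-level edit distance divided by the number of words in the reference transcript. A minimal illustrative implementation (not DeepSpeech's own code):

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# An empty hypothesis against a two-word reference gives WER 1.0,
# matching the per-sample reports further down:
print(wer("ay me", ""))  # 1.0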

I understand that the pretrained model was trained on more than just LibriSpeech, but the difference is surprisingly large. I'm asking because I'm seeing similar increases in WER when I continue training the model on another dataset: the non-fine-tuned pretrained model scores a WER of 0.11 on that dataset, but after a single epoch of fine-tuning the WER jumps to 0.4.

This is something we also spotted. Try decreasing the learning rate.
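
For example, changing only the learning-rate flag in the script above:

            --learning_rate 0.00001 \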


Setting the learning rate to 0.00001 results in a WER between 0.97 and 0.99 after one epoch; the same holds for 0.000001. To verify that nothing else had changed, I reran the script with the original learning rate of 0.0001, which resulted in a WER of 0.61.

The relevant output for lr = 0.00001:

Test - WER: 0.991358, CER: 104.168596, loss: 418.055206
--------------------------------------------------------------------------------
WER: 1.000000, CER: 5.000000, loss: 17.521641
 - src: "ay me"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 6.000000, loss: 23.195108
 - src: "venice"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 7.000000, loss: 24.702639
 - src: "a story"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 7.000000, loss: 24.753246
 - src: "oh emil"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 9.000000, loss: 25.343424
 - src: "indeed ah"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 9.000000, loss: 31.011301
 - src: "verse two"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 9.000000, loss: 34.202015
 - src: "direction"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 11.000000, loss: 37.468643
 - src: "again again"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 12.000000, loss: 38.504765
 - src: "marie sighed"
 - res: ""
--------------------------------------------------------------------------------
WER: 1.000000, CER: 13.000000, loss: 39.427250
 - src: "hedge a fence"
 - res: ""
--------------------------------------------------------------------------------
I Exporting the model...

The relevant output for lr = 0.000001:

Test - WER: 0.992650, CER: 99.543596, loss: 303.750732
--------------------------------------------------------------------------------
WER: 1.000000, CER: 6.000000, loss: 28.675510
 - src: "a story"
 - res: "i "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 8.000000, loss: 30.219297
 - src: "verse two"
 - res: "i "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 6.000000, loss: 31.987692
 - src: "oh emil"
 - res: "i "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 4.000000, loss: 33.258804
 - src: "ay me"
 - res: "i "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 8.000000, loss: 34.096275
 - src: "direction"
 - res: "i "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 5.000000, loss: 34.566467
 - src: "venice"
 - res: "i "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 7.000000, loss: 37.244366
 - src: "indeed ah"
 - res: "he "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 9.000000, loss: 40.380829
 - src: "poor alice"
 - res: "i "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 11.000000, loss: 44.358227
 - src: "what was that"
 - res: "he "
--------------------------------------------------------------------------------
WER: 1.000000, CER: 12.000000, loss: 44.720150
 - src: "hans stirs not"
 - res: "he "
--------------------------------------------------------------------------------
I Exporting the model...
I Models exported at /home/bderuiter/scratch/checkpoint-1559664930/export

Training for two epochs with the original learning rate also results in a WER of 0.98.