Hey there,
I am trying to run the Common Voice dataset through DeepSpeech to get a WER estimate for the test set using the reference model provided by Mozilla. I am currently getting a WER of ~34%. Here is the command I ran, followed by the relevant output:
python3 DeepSpeech.py --n_hidden 2048 --initialize_from_frozen_model models/output_graph.pb --notrain --test --display_step=1 --epoch 1 --test_files=/data/CV/cv-valid-test.csv --dev_files=/data/CV/cv-valid-dev.csv --train_files=/data/CV/cv-valid-train.csv
I Initializing from frozen model: models/output_graph.pb
I Test of Epoch 0 - WER: 0.344074, loss: 30.61848204136957, mean edit distance: 0.173512
I --------------------------------------------------------------------------------
I WER: 0.090909, loss: 0.131743, mean edit distance: 0.019231
I - src: "it wasn't clear to him how to spend his morning time"
I - res: "it wasnt clear to him how to spend his morning time"
I --------------------------------------------------------------------------------
I WER: 0.125000, loss: 0.132348, mean edit distance: 0.026316
I - src: "you can't talk to her like that though"
I - res: "you cant talk to her like that though"
I --------------------------------------------------------------------------------
I WER: 0.166667, loss: 0.093715, mean edit distance: 0.080000
I - src: "i don't care what you say"
I - res: "i dont care what you say "
I --------------------------------------------------------------------------------
I WER: 0.200000, loss: 0.018371, mean edit distance: 0.034483
I - src: "he doesn't have anything else"
I - res: "he doesnt have anything else"
I --------------------------------------------------------------------------------
However, I've noticed that most of the WER penalties in the samples above come from missing apostrophes in contractions (like don't or can't). I have yet to find an example of a correctly predicted apostrophe, despite the ' character being present in data/alphabet.txt.
Are there any special precautions or flags I should pass to DeepSpeech.py so that apostrophes are either predicted correctly or ignored, in order to decrease my mean WER?
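
To get a rough idea of how much of the WER is down to apostrophes alone, the comparison can be sketched as below. This is not DeepSpeech's own evaluation code; it assumes WER is the word-level edit distance divided by the number of words in the reference, which does reproduce the per-sample values in the log above (1/11, 1/8, 1/6, 1/5). The helper names (edit_distance, wer, strip_apostrophes) are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here: lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(src, res):
    """Word error rate; assumes the reference (src) has at least one word."""
    ref = src.split()  # split() also drops leading/trailing whitespace
    hyp = res.split()
    return edit_distance(ref, hyp) / len(ref)


def strip_apostrophes(text):
    """Hypothetical normalisation: drop apostrophes before scoring."""
    return text.replace("'", "")


# (src, res) pairs copied from the log output above
pairs = [
    ("it wasn't clear to him how to spend his morning time",
     "it wasnt clear to him how to spend his morning time"),
    ("you can't talk to her like that though",
     "you cant talk to her like that though"),
    ("i don't care what you say",
     "i dont care what you say "),
    ("he doesn't have anything else",
     "he doesnt have anything else"),
]

for src, res in pairs:
    raw = wer(src, res)
    normalised = wer(strip_apostrophes(src), strip_apostrophes(res))
    print(f"raw WER: {raw:.6f}  without apostrophes: {normalised:.6f}")
```

For these four samples the WER drops to zero once apostrophes are removed from both sides, so running the same comparison over the full set of test transcripts would show how much of the ~34% is actually caused by this.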