To evaluate the pre-trained model, I currently run:
python evaluate.py --checkpoint_dir deepspeech-0.5.0-checkpoint --test_files ../scripts/tts/vocabulary_mixed/wavs.csv --alphabet_config_path data/alphabet.txt --lm_binary_path data/lm/lm.binary --lm_trie_path data/lm/trie --report_count 1000000 --epochs 0 --test_output_file ../scripts/tts/vocabulary_mixed/wavs_report.csv --one_shot_infer ''
It would be nice if evaluate.py could:
- support --model (a *.pbmm file) instead of --checkpoint_dir. Perhaps this would also speed up inference? (A possible workaround is sketched after this list.)
- include wav_filename in the inference report
- relatedly, why does the flag naming differ between DeepSpeech.py and the deepspeech module? Could they accept both spellings, e.g. --lm|--lm_binary_path?
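In the meantime, here is a minimal sketch of the kind of thing I mean: batch inference straight from the exported .pbmm over a test CSV, printing wav_filename next to each transcript. It assumes the deepspeech==0.5.0 Python package API and a CSV with a wav_filename column (as produced by the import scripts); paths and hyperparameter values are just placeholders mirroring the 0.5.0 client.

```python
# Sketch: batch inference from the exported .pbmm over a test CSV,
# printing wav_filename alongside each transcript.
# Assumes the deepspeech==0.5.0 Python API and 16 kHz mono 16-bit wavs.
import csv
import sys
import wave

import numpy as np
from deepspeech import Model

# Decoder/feature settings as used by the 0.5.0 client (assumed defaults).
N_FEATURES, N_CONTEXT, BEAM_WIDTH = 26, 9, 500
LM_ALPHA, LM_BETA = 0.75, 1.85

# Usage: python batch_infer.py model.pbmm alphabet.txt lm.binary trie test.csv
model_path, alphabet, lm, trie, test_csv = sys.argv[1:6]

ds = Model(model_path, N_FEATURES, N_CONTEXT, alphabet, BEAM_WIDTH)
ds.enableDecoderWithLM(alphabet, lm, trie, LM_ALPHA, LM_BETA)

with open(test_csv) as f:
    for row in csv.DictReader(f):
        with wave.open(row['wav_filename'], 'rb') as w:
            fs = w.getframerate()
            audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
        # The "report" line includes the wav filename, as requested above.
        print(row['wav_filename'], ds.stt(audio, fs), sep=',')
```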
Or perhaps benchmark_nc.py is a better candidate and should accept a test.csv instead of a single wav file?
Perhaps it already does these and I’m missing something?