Hi, I was reading various topics about dropped spaces in the output, and now I want to train a model without a language model.
There’s no way to do it without modifying the code. You can replace the decode_with_lm call with tf.nn.ctc_beam_search_decoder, and then WER reports won’t use the language model when decoding.
Did I get it right that the LM is only used for decoding after training? When I included my own LM and had a typo in the path to the lm.binary file, training finished (all epochs) and I only got the error afterwards. I thought the LM was also included in training somehow. Was that wrong?
Thanks already
That is correct, the LM is only used for the test epoch, it does not influence training.
I can’t find the decode_with_lm call. Is there an easy way to avoid using the language model in the newest version, 0.5.0, as well?
If you don’t specify the language model arguments on the command-line it won’t use the language model.
Actually, it’ll just use the default one in data/lm. I think we don’t have an easy way to disable that; you have to comment out the loading of the Scorer in evaluate.py.
Are you sure?
I trained a non-English model and tested it without passing LM parameters, and it outputs text in my language without any problem. I think this surely means the language model is disabled, because otherwise non-English letters/words would be removed from the results, right?
Please try removing or renaming the data/lm folder and test again!
evaluate.py unconditionally loads an LM/trie for scoring; if you don’t pass the flags, it’ll use the default paths in data/lm. This is pretty easy to test, and you already suggested an effective way to do it.
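If you want to run the rename test without forgetting to restore the folder, something like the following sketch could help. The helper temporarily_hidden and the ".disabled" suffix are made up for this example and are not part of DeepSpeech; it only assumes the default LM files live in data/lm, as described above.

```python
import contextlib
from pathlib import Path


@contextlib.contextmanager
def temporarily_hidden(path, suffix=".disabled"):
    """Rename a file or folder for the duration of the block, then restore it.

    Hypothetical helper: the suffix is arbitrary and only needs to produce
    a name that does not already exist next to the original.
    """
    original = Path(path)
    hidden = original.with_name(original.name + suffix)
    original.rename(hidden)
    try:
        yield hidden
    finally:
        # Restore the original name even if the test run raised an error.
        hidden.rename(original)
```

You would wrap your usual test run in `with temporarily_hidden("data/lm"):`. If the run then fails because the LM/trie files are missing, that suggests the default language model was being loaded all along.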