Setting lm_alpha and lm_beta to 0 is not a suitable way to disable LM scoring. As some have already mentioned here, the clients only enable the LM if you pass the flags. As for the Python code, just pass scorer=None
to the decoder calls.
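To illustrate the role the scorer plays, here is a minimal, self-contained sketch of a beam search with an optional scorer hook. This is a toy decoder, not the ds_ctcdecoder API; the `scorer` callable signature here is invented for the example. Passing scorer=None leaves the ranking purely acoustic:

```python
import math

def beam_search(probs, alphabet, beam_width, scorer=None):
    """Toy beam search over per-timestep character probabilities.

    probs: list of timesteps, each a list of probabilities over `alphabet`.
    scorer: optional callable(prefix) -> log-prob bonus; scorer=None
    disables external (LM-style) scoring entirely.
    """
    beams = [("", 0.0)]  # (prefix, accumulated log-probability)
    for step in probs:
        candidates = []
        for prefix, logp in beams:
            for ch, p in zip(alphabet, step):
                if p <= 0.0:
                    continue  # skip impossible extensions
                new_prefix = prefix + ch
                new_logp = logp + math.log(p)
                if scorer is not None:
                    new_logp += scorer(new_prefix)  # LM-style bonus
                candidates.append((new_prefix, new_logp))
        # Keep only the beam_width best hypotheses for the next timestep.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]
```

With scorer=None the most acoustically likely string wins; plugging in a scorer that rewards certain prefixes can flip the result, which is exactly why alpha/beta = 0 (which still routes beams through the LM machinery) behaves differently from removing the scorer altogether.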
run_singleshot_clean_final_v3.sh data/recorded/po6vfbxbnyduz0k9.wav
with --lm_alpha 0.75 --lm_beta 1.85:
ุกููููููู ููููุชูููุจู ุนููููููู ูููููุฏูููููู
with --lm_alpha 0 --lm_beta 0:
ุกููููููู ุฌูููู ุฒููู ููุนููููููู ูููููุกูููู
with scorer=None:
ุกููููููู ุฌููู ฺูงุฒููู ุจูุนููููููู ฺฺูููููุขุขูููููููู
Those are right-to-left script; in any case they came through as special characters that are not readable here.
I just need to test my model without any language model, without restricting it to any bag of words. The second option (alpha/beta = 0) did not give me satisfying results and does not seem to discard the LM/trie. The third option (scorer=None) was very satisfying, but it is very slow.
Thank you.
Great, you confirmed what I saw in my tests' results.
I am very satisfied with the results of scorer=None. But it takes a long time to finish decoding: instead of 00:30 with the scorer, 12:30 hours were needed with scorer=None on my test data. I am using the default --beam_width value.
Any help in this would be much appreciated.
Thank you @reuben
This is just one result. Having different results depending on the values of alpha and beta is expected. So far, you seem to be implying that over subsequent runs with 0.0 for both values, you get different decodings. That is what I'm curious about.
The LM and trie play a part in the speed of the decoding; that's expected. Try reducing the beam width.
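A rough way to see why a smaller beam width speeds things up: the decoder scores on the order of beam_width × alphabet_size candidate extensions per timestep. A back-of-the-envelope sketch (the function and numbers are illustrative, not taken from the DeepSpeech decoder):

```python
def hypotheses_scored(timesteps, alphabet_size, beam_width):
    """Count hypotheses scored by a beam search that keeps at most
    beam_width prefixes alive after each timestep."""
    kept = 1      # start from the single empty prefix
    scored = 0
    for _ in range(timesteps):
        scored += kept * alphabet_size              # every kept prefix is extended
        kept = min(kept * alphabet_size, beam_width)  # prune to the beam width
    return scored
```

For example, over 3 timesteps with a 30-symbol alphabet, a beam of 100 scores 3,930 hypotheses while a beam of 500 scores 15,930; work grows roughly linearly with the beam width, which is why shrinking it directly cuts decoding time.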
with --lm_alpha 0 --lm_beta 0, using one LM:
ุกููููููู ุฌูููู ุฒููู ููุนููููููู ูููููุกูููู
with --lm_alpha 0 --lm_beta 0, using another LM:
ุนููููููู ุฌูู ูุนููู ุนููููููู ฺฺูููููุขุขูููููููู
The performance decrease is a direct consequence of disabling LM scoring. Without an LM, the model will explore every beam it can create (within the beam width limit), rather than ignoring beams that lead to out-of-vocabulary words (since in that case it has no constrained vocabulary). If at all possible, you should try to create an LM that matches your use case. If that's not doable, you can reduce the beam_width as @lissyx has mentioned, and also use cutoff_prob/cutoff_top_n to trade accuracy for decoding speed. I'd start by setting cutoff_prob=0.99 and seeing what that gets you.