Hello Community.
I have trained an Arabic model and managed to get a WER of around 0.35. My data set is still small (~40 hours), and I'm working on collecting more data and on data augmentation.
Meanwhile, I see some strange output text:
- Two correct words run together with no space between them
- Strange character sequences that do not form any word, just rubbish!
As far as I understand, the language model is used for beam search and defines the output text. Why is the output not restricted to vocabulary from the LM? Is there a switch for that?
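To quantify how often this happens, something like the following could be used. It is only a minimal sketch: the function name `find_oov_words`, the toy vocabulary, and the example transcripts are made up for illustration, and it assumes the transcripts are whitespace-separated and the LM vocabulary is available as a plain set of words.

```python
def find_oov_words(transcript, vocabulary):
    """Return the words in `transcript` that are not in `vocabulary`.

    Hypothetical helper: splits on whitespace only, so two words
    merged without a space show up as a single unknown word.
    """
    return [word for word in transcript.split() if word not in vocabulary]


# Toy data standing in for the real LM vocabulary and model output.
vocabulary = {"hello", "world", "good", "morning"}
transcripts = ["hello world", "goodmorning world", "hello xqzt"]

for text in transcripts:
    # Print each transcript with the words the vocabulary does not contain.
    print(text, "->", find_oov_words(text, vocabulary))
```

Both kinds of bad output described above (merged words and rubbish strings) would be flagged by this check, which makes it easier to see how frequent each case is.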