I am having trouble using KenLM to generate a language model.
I have used the Mozilla Common Voice dataset (Mongolian, 256 MB)
and I have an alphabet.txt in Cyrillic.
Can someone help me generate an LM using KenLM?
Is there anything else needed?
First, it’s good practice not to cross-post your issue in multiple threads.
Have you read the documentation in data/lm?
@lissyx
Yes, I understand.
I have read it and tried it.
Just a few seconds ago I tried building with KenLM on Google Colab,
and successfully generated words.arpa and lm.binary.
Now I just need to read more and generate the trie.
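
For anyone following along, here is a minimal sketch of how the text corpus for KenLM could be prepared from a Common Voice TSV file. It is not the script from data/lm. It assumes the TSV has a `sentence` column, that alphabet.txt lists one character per line with `#` lines as comments, and that the alphabet includes the space character. The function names and file names are made up for illustration.

```python
import csv
import unicodedata


def load_alphabet(lines):
    """Collect allowed characters: one per line, '#' lines are comments (assumed format)."""
    allowed = set()
    for line in lines:
        char = line.rstrip("\n")
        if not char or char.startswith("#"):
            continue
        allowed.add(char)
    return allowed


def clean_sentences(tsv_lines, allowed, column="sentence"):
    """Yield lowercased sentences made only of characters from the alphabet."""
    reader = csv.DictReader(tsv_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        raw = (row.get(column) or "").lower()
        # Replace punctuation with spaces, then collapse repeated whitespace.
        no_punct = "".join(
            " " if unicodedata.category(ch).startswith("P") else ch for ch in raw
        )
        text = " ".join(no_punct.split())
        # Skip sentences that still contain characters outside the alphabet.
        if text and all(ch in allowed for ch in text):
            yield text


def write_corpus(tsv_path, alphabet_path, out_path):
    """Write one cleaned sentence per line, the input format KenLM expects."""
    with open(alphabet_path, encoding="utf-8") as f:
        allowed = load_alphabet(f)
    with open(tsv_path, encoding="utf-8", newline="") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for sentence in clean_sentences(src, allowed):
            dst.write(sentence + "\n")
```

Calling something like `write_corpus("validated.tsv", "alphabet.txt", "vocabulary.txt")` would produce a plain text file that can be passed to KenLM's `lmplz` (for example `lmplz -o 5 --text vocabulary.txt --arpa words.arpa`) and then to `build_binary` to get lm.binary. The right options for those tools are the ones described in data/lm, so treat the command here only as an illustration.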