Hi, I am having some trouble understanding the process of creating the language model. I pass a large text corpus and build a language model that learns the n-gram probabilities. (I am not sure whether --order 5 means that only 5-grams are used, or everything from unigrams up to 5-grams.)
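To make the two readings of the order concrete, here is a minimal Python sketch of what I mean by "from unigrams to 5-grams". It is not KenLM code; ngrams_up_to is a hypothetical helper I wrote only for illustration, and the example sentence is made up.

```python
from collections import Counter


def ngrams_up_to(tokens, max_order):
    """Count every n-gram of length 1..max_order in a list of tokens."""
    counts = Counter()
    for n in range(1, max_order + 1):
        # Slide a window of size n over the token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical toy corpus, only for illustration.
tokens = "the cat sat on the mat".split()
counts = ngrams_up_to(tokens, 5)

# Number of distinct n-grams found for each order.
by_order = Counter(len(gram) for gram in counts)
for order in sorted(by_order):
    print(order, by_order[order])
```

My question is whether the model built with --order 5 keeps all of these orders or only the longest one.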
# Build pruned LM.
lm_path = '/tmp/lm.arpa'
!lmplz --order 5 \
--temp_prefix /tmp/ \
--memory 50% \
--text {data_lower} \
--arpa {lm_path} \
--prune 0 0 0 1
With the ARPA language model built, I then convert it to binary format:
binary_path = '/tmp/lm.binary'
!build_binary -a 255 \
-q 8 \
trie \
{lm_path} \
{binary_path}
Now I have the model in binary format, which makes it faster to load. Finally, I generate the trie:
./generate_trie ../data/alphabet.txt /tmp/lm.binary /tmp/trie
But what is the trie used for? I do not really understand that step. Would DeepSpeech work without it?
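For context, this is my understanding of a trie as a general data structure: a prefix tree that can tell you whether a sequence of characters can still grow into a known word. The sketch below is generic Python, not the format that generate_trie writes, and the class and method names are my own. I am only assuming the decoder uses something along these lines, which is part of what I would like confirmed.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a vocabulary word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Add a word to the trie one character at a time."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def _walk(self, chars):
        """Follow chars from the root; return the final node or None."""
        node = self.root
        for ch in chars:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def has_prefix(self, prefix):
        """True if at least one inserted word starts with prefix."""
        return self._walk(prefix) is not None

    def contains(self, word):
        """True if exactly this word was inserted."""
        node = self._walk(word)
        return node is not None and node.is_word


# Hypothetical vocabulary, only for illustration.
vocab_trie = Trie()
for w in ["speech", "speed", "deep"]:
    vocab_trie.insert(w)

print(vocab_trie.has_prefix("spe"))  # some word starts with "spe"
print(vocab_trie.contains("spe"))    # "spe" itself is not a word
print(vocab_trie.has_prefix("xyz"))  # no word starts with "xyz"
```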
Thanks a lot for your help!