Hi,
Is it possible to use a letter-based language model (e.g. letter 5-grams) with DeepSpeech?
I tried to train a letter-based language model with the KenLM toolkit, but did not succeed. I then trained it with SRILM (which worked), but when I try to create the trie it fails, whether or not I first convert the LM to binary:
root@b9ba16c8d6a1:/DeepSpeech# /DeepSpeech/native_client/kenlm/build/bin/build_binary lm.arpa lm.binary
Reading lm.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
The ARPA file is missing <unk>.  Substituting log10 probability -100.
SUCCESS
/DeepSpeech/native_client/generate_trie alphabet.txt lm.binary trie
Segmentation fault (core dumped)
/DeepSpeech/native_client/generate_trie alphabet.txt lm.arpa trie
Loading the LM will be faster if you build a binary file.
Reading lm.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
The ARPA file is missing <unk>.  Substituting log10 probability -100.
Segmentation fault (core dumped)
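For context on the character-level training step: KenLM's `lmplz` treats whitespace-separated tokens as "words", so a letter n-gram LM needs the corpus rewritten with one character per token before training. Here is a minimal preprocessing sketch; the file names, the `to_char_tokens` helper, and the `<sp>` space marker are my own placeholders, not anything DeepSpeech or KenLM prescribes:

```python
# Sketch: rewrite a text corpus so lmplz sees one character per token.
# "<sp>" is an arbitrary marker for real spaces so that word boundaries
# survive whitespace tokenization; pick whatever symbol your alphabet uses.

def to_char_tokens(line, space_marker="<sp>"):
    """Split a line into space-separated characters, mapping literal
    spaces to an explicit marker."""
    return " ".join(space_marker if ch == " " else ch for ch in line.strip())

print(to_char_tokens("hello world"))
# -> h e l l o <sp> w o r l d

# After converting the corpus this way, training would look roughly like:
#   lmplz -o 5 --discount_fallback < char_corpus.txt > lm.arpa
# (--discount_fallback is often needed for character corpora, where the
# default Kneser-Ney discount estimation can fail -- this may be why the
# plain KenLM attempt did not succeed.)
```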