0.5 model questions

I had some questions about the 0.5 model that weren’t answered in the readme.

  1. How many hours of data were used to train it? What proportion of this was from Common Voice?

  2. Is the CV audio data recent enough to contain the wiki sentences?

  3. Do the regular and lite models have different error rates? Are there any other limitations we should expect from using the lite model compared to the regular one?

Yes. When we landed TFLite with quantization, I documented a ~2.3% increase in WER.

As for other limitations, I don’t think there are any, other than that you need to use the TFLite runtime, which for now we only provide for Android.
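
For anyone who wants to compare the regular and lite models on their own test set, here is a minimal sketch of how WER (word error rate) is commonly computed: the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` is just an illustrative helper, not part of the DeepSpeech API, and the post above does not say whether the ~2.3% figure is an absolute or a relative increase.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper only; the project's own evaluation code may
    normalize text differently before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"):
# 2 errors over 6 reference words, about 0.33
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Running the same test transcripts through both models and comparing the two scores would show the gap on your own data.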