I had some questions about the pre-trained model for 0.4.1.
- How many hours of data in total were used to train the pre-trained model?
- What are the proportions of each speech corpus used? For example, is it mainly LibriSpeech, mainly Common Voice, or an even mix of all of them?
- It says the model is optimized for American English, yet it uses the English Common Voice corpus. Presumably that corpus isn't filtered by accent first, and thus contains all English accents?