Pre-Release Data vs Latest Release Data

na200E · April 1, 2019, 10:45pm

Per the DeepSpeech 0.4.1 release notes (https://github.com/mozilla/DeepSpeech/releases/tag/v0.4.1), the model was trained on a pre-release snapshot of the English Common Voice training corpus. Is there a way to know what was added to the Common Voice dataset between the pre-release snapshot and the latest release?

kdavis · April 2, 2019, 8:02am

It is possible. However, it would require some effort on our part to do so, maybe a day or more of work.

So before we do so, why would you want this info?