Is anyone using Deep Speech in an application, and if so, what results are you getting compared to, say, Google voice recognition on regular audio captured from a smartphone?
There is a blog post proclaiming a 5.6% word error rate (WER), which sounds great, but digging deeper, it appears the result was biased by test data leaking into the training set. Additionally, the latest model (0.3) has a higher WER (11.2% on LibriSpeech clean) due to optimizations (see "Deepspeech accuracy decreasing?").
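For anyone who wants to compare engines on their own smartphone recordings instead of relying on published numbers: WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. Below is a minimal sketch, assuming you already have a reference transcript and each engine's output as plain strings. The function name and the lowercase/whitespace normalization are my own choices for illustration, not part of Deep Speech or any Google API, and published benchmarks may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: simple normalization (lowercase, split on whitespace).
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, made up for illustration only.
    reference = "turn on the kitchen lights"
    engine_outputs = {
        "engine_a": "turn on the kitchen light",
        "engine_b": "turn the kitchen lights on",
    }
    for name, text in engine_outputs.items():
        print(f"{name}: WER = {word_error_rate(reference, text):.1%}")
```

Averaging this over a few dozen of your own recordings, with the same audio fed to each engine, should give a fairer comparison than headline figures measured on LibriSpeech.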
Also, I’ve not been able to find any video demos. So far the only thing close to a demo I have come across is this video: