Hello. I learned a lot from my previous thread here and from some lively discussions on IRC. The full-featured model currently wouldn’t make sense on a small consumer computer like e.g. the Raspberry Pi, but a dramatically reduced model with 20 or 30 words would most likely be feasible. Given that speech-driven technology often runs on very small hardware, I think we should find a way to do at least some limited STT on those devices. The alternative is to use some cloud service for the STT - with all its disadvantages for one’s personal privacy, and the fact that one might not be able to turn the lights on because the Internet is down.
While it may be possible to run a very limited model on a Raspberry Pi, it requires training a “personal” model on the specified words, which can then be used on the hardware.
At this time - this requires downloading the Common Voice database, installing all the tools including TensorFlow, and at least some knowledge of how to get everything running. Frankly - too much for an average developer who wants to integrate DeepSpeech into his or her project.
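To give a sense of what a 20-30 word vocabulary could buy even without a custom model, here is a minimal sketch of a post-processing step that snaps noisy transcript tokens onto a fixed command vocabulary via edit distance. This is purely my own illustration, not part of DeepSpeech; the vocabulary list and the `levenshtein()` and `snap_to_vocabulary()` helpers are hypothetical names.

```python
# Illustrative only: constrain free-form STT output to a small,
# fixed command vocabulary. Neither the word list nor these helpers
# come from DeepSpeech - this just shows the "limited model" idea.

VOCABULARY = ["lights", "on", "off", "kitchen", "living", "room",
              "temperature", "up", "down", "stop"]

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two words."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def snap_to_vocabulary(word: str, max_distance: int = 2):
    """Return the closest vocabulary word, or None if nothing is close."""
    best = min(VOCABULARY, key=lambda v: levenshtein(word, v))
    return best if levenshtein(word, best) <= max_distance else None

# A noisy token is pulled onto the nearest known command word:
print(snap_to_vocabulary("ligts"))   # prints "lights"
```

With a vocabulary this small, even a crude matcher like this runs comfortably on a Raspberry Pi; the hard part that needs server-side tooling is the acoustic model itself, which is exactly what the post below asks for.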
This is why I am calling for an online environment in which users upload their word list and can then download the trained model. But this is nothing we can do without funding - at least for the server space. I am sure that having individual limited models available, making DeepSpeech feasible even on small platforms, would boost the overall project and truly generate technology that can be used by developers in the field.
Mozilla has in the past created dedicated spaces for specific tasks. This is no different. DeepSpeech deserves to be made available on all platforms and for all purposes - including home-brew IoT services that currently need to rely on network streaming services.