A little introduction: I am a student participating in Summer of Code. I am an avid gamer, I love customizing Firefox with userChrome.css, and I code in Java and Haskell.
Here is a cool link for a variety of Firefox mods: https://github.com/Timvde/UserChrome-Tweaks
In my quest for the most minimal desktop interface, though, I hit a roadblock: there are hardly any speech recognition applications for Linux, even though macOS has rich applications like Dragon Dictate, which can even be hacked to control a browser: https://youtu.be/YalmPQEP54g
I think that speech input can be about as fast as a keyboard and as easy to use as a mouse. And there are pretty good open-source models like:
CMU Sphinx: https://github.com/cmusphinx/sphinx4
Kaldi on GitHub: kaldi-asr/kaldi
And of course, Mozilla DeepSpeech on GitHub: mozilla/DeepSpeech
It’s a pretty diverse range of speech-to-text engines. Now, can we make applications take advantage of these successful models? I think yes! Could we build a server, like the Apache web server, that serves transcriptions? An application would send a request to the server, the server would record audio, convert it to text, and send the text back to the application. Cool. Can this be done, Mozilla?
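To make the request/response idea concrete, here is a minimal sketch of that flow using a plain TCP socket. Everything here is an assumption for illustration: the `transcribe()` function is a stand-in that returns a fixed string, where a real server would hand the audio bytes to an engine like DeepSpeech, Kaldi, or Sphinx.

```python
# Sketch of a "transcription server": a client sends audio bytes,
# the server replies with recognized text. transcribe() is a
# placeholder for a real speech-to-text engine call.
import socket
import threading

def transcribe(audio_bytes: bytes) -> str:
    # Hypothetical backend: a real implementation would feed
    # audio_bytes to DeepSpeech/Kaldi/Sphinx and return its output.
    return "hello world"

def handle(conn: socket.socket) -> None:
    with conn:
        audio = conn.recv(65536)                   # read the client's audio payload
        conn.sendall(transcribe(audio).encode())   # reply with the transcription

def serve(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    # Bind to an OS-chosen port and handle one request in the background.
    srv = socket.socket()
    srv.bind((host, port))
    srv.listen()
    threading.Thread(target=lambda: handle(srv.accept()[0]), daemon=True).start()
    return srv

def request_transcription(port: int, audio: bytes) -> str:
    # Client side: send audio, signal end of input, read the text back.
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(audio)
        c.shutdown(socket.SHUT_WR)
        return c.recv(65536).decode()
```

A real service would of course stream audio rather than send one blob, but the shape of the protocol — request in, text out — is the same.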
Also see:
Pretty cool demo of voice control in Elite Dangerous: https://youtu.be/DRVCkUN_Mq8
An introduction to D-Bus, the Linux IPC and signaling system: https://www.freedesktop.org/wiki/IntroductionToDBus/
I can’t wait to finish my minimal Firefox setup with voice control. But the cool part about implementing this over D-Bus is that other applications could talk to it too! Games? Intuitive UIs? The possibilities are endless.
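As a sketch of what "other applications can talk to it" might mean, a D-Bus speech service could publish an interface like the one below. The interface name `org.example.Speech1` and its members are made up for illustration; this is just standard D-Bus introspection XML, not an existing service.

```xml
<!-- Hypothetical D-Bus interface for a transcription service. -->
<node>
  <interface name="org.example.Speech1">
    <!-- Record from the microphone and return the recognized text. -->
    <method name="Listen">
      <arg name="text" type="s" direction="out"/>
    </method>
    <!-- Broadcast each recognized phrase so any app on the bus
         (a browser, a game, a launcher) can react to it. -->
    <signal name="PhraseRecognized">
      <arg name="text" type="s"/>
    </signal>
  </interface>
</node>
```

The signal is the interesting part: one daemon does the recognition, and every listening application gets the phrases for free.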