Open Source Echo

Just a heads-up that Chris Lord from the Vaani team has reimplemented the Amazon Echo using open-source libraries and node.js.

His project is compatible with the skills API and can therefore be used as a drop-in replacement for the Echo. He wrote about his project in detail at http://chrislord.net/index.php/2016/06/01/open-source-speech-recognition/ and the code is available at https://gitlab.com/Cwiiis/ferris

His project works offline (except of course if a skill needs an internet connection).

If Echo skills are so easy to reimplement, why not add support for them to Project Link and benefit from a rich, free ecosystem from day 1?

Any thoughts?

What I really like about Cwiiis’s project is that it shows what we can expect from pocketsphinx for offline voice recognition when we feed it a limited grammar. That is very encouraging.
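
For the curious, pocketsphinx accepts a JSGF grammar, which is how you constrain it to a small command set instead of open-ended dictation. A minimal illustrative grammar (the commands and device names below are made up, not taken from Ferris):

```
#JSGF V1.0;
grammar commands;

// Hypothetical home-automation phrases; shrinking the search space
// like this is what makes offline recognition reliable.
public <command> = <action> the <device>;
<action> = turn on | turn off | dim;
<device> = lights | heater | kettle;
```

pocketsphinx can then be pointed at the grammar file with its -jsgf option.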

However, I’m not sure how that would give us access to the Echo ecosystem directly. People writing skills would still have to register them with Link (how?), and they run these services on a cloud platform of their choice or on AWS Lambda, which is not ideal from a privacy point of view. Even if the skill only acts as an intermediary to present data from e.g. a weather service, I would rather have both the voice recognition and the code talking to the weather service on the foxbox, instead of having just the voice on the box and everything else in the cloud. Am I missing something?

So what about running the skills in foxbox service workers?
We discussed this in the early stages of the project: service workers would run on the box, and the skill code could run in workers to which we post a message whenever we receive a command.
This would force the skills’ source code to be open source and hackable by default. Think of them as add-ons for Project Link. Doing so encourages an ecosystem and a community of developers adding new features.
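
To make that concrete, here is a minimal sketch of the messaging I have in mind, assuming Web Worker-style APIs on the box; the intent name, message shape, weather URL and speak() helper are all hypothetical:

```js
// skill-weather.js — hypothetical skill running in a foxbox worker.
// The box posts a parsed command; the skill replies with text to speak.
self.onmessage = async (event) => {
  const { intent, slots } = event.data;
  if (intent === 'GetWeather') {
    // The skill queries the third-party service from the box itself,
    // so nothing has to transit through a skill author’s cloud.
    const res = await fetch('https://api.example.com/weather?city=' +
                            encodeURIComponent(slots.city));
    const { summary } = await res.json();
    self.postMessage({ speech: `It is ${summary} in ${slots.city}.` });
  } else {
    self.postMessage({ speech: 'Sorry, I cannot help with that.' });
  }
};

// On the foxbox side, once the voice front-end has parsed a command:
// const worker = new Worker('skill-weather.js');
// worker.onmessage = (e) => speak(e.data.speech);
// worker.postMessage({ intent: 'GetWeather', slots: { city: 'London' } });
```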

Then of course we need to solve the registration issue, but I can’t really see any major difficulty there.

So that’s the other major difference between Ferris and Echo that I should probably highlight in the README: Ferris runs skills locally; there’s no cloud involvement. Of course, that means a skill needs to work outside of AWS (or use some kind of AWS shim script), but most of the third-party skills I’ve come across so far don’t actually require AWS at all.
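
To illustrate the shim idea: skills deployed on Lambda export a handler(event, context) function, so a small script on the box can call that handler directly with a request shaped like what the Alexa Skills Kit sends. A sketch; the skill path, intent and slot values below are invented:

```js
// run-skill-locally.js — hypothetical shim that invokes an Alexa-style
// Lambda handler on the box instead of in AWS.
const skill = require('./some-echo-skill'); // path is an assumption

// Minimal request in the shape the Alexa Skills Kit sends to Lambda.
const event = {
  version: '1.0',
  session: { new: true, sessionId: 'local-session',
             application: { applicationId: 'local' } },
  request: {
    type: 'IntentRequest',
    requestId: 'local-request',
    intent: { name: 'GetWeather',
              slots: { City: { name: 'City', value: 'London' } } }
  }
};

// Classic Lambda handlers report their result through the context object.
skill.handler(event, {
  succeed: (r) => console.log(r.response.outputSpeech.text),
  fail: (err) => console.error(err)
});
```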

You’re right that it doesn’t provide access to the Echo ecosystem directly, but it does (possibly) provide a quick way for people with existing Echo skills to deploy them elsewhere with little to no work.

Running the skills as JS workers is the way to go, yes! That means they would be set up by a simple webapp provided by the skill’s author.
We also need a way to trigger them, likely with some “skill adapter” that listens to the voice input and dispatches a notification to the matching skill when it detects one.
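
Such an adapter could start out as little more than a registry mapping invocation phrases to skill workers. A sketch; every name here (registerSkill, speak(), the message shape) is hypothetical:

```js
// Hypothetical “skill adapter” on the foxbox: routes recognized
// utterances to whichever registered skill claims the phrase.
const registry = []; // { phrase, worker } entries

function registerSkill(phrase, workerUrl) {
  const worker = new Worker(workerUrl);
  worker.onmessage = (e) => speak(e.data.speech); // speak() is assumed
  registry.push({ phrase: phrase.toLowerCase(), worker });
}

function onVoiceInput(text) {
  const lower = text.toLowerCase();
  const entry = registry.find((s) => lower.startsWith(s.phrase));
  if (entry) {
    // Hand the rest of the utterance to the skill for intent parsing.
    entry.worker.postMessage({
      utterance: lower.slice(entry.phrase.length).trim()
    });
  } else {
    speak('Sorry, I did not understand that.');
  }
}

// e.g. registerSkill('ask weather', 'skill-weather.js');
//      onVoiceInput('ask weather what it is like in London');
```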