So I have been weighing utility against ideals, and have decided to try making an updated version of Saera that uses wit.ai for parsing (and optionally for speech recognition as well). The upside is that this will decouple speech recognition from parsing, which lets me provide more intelligence and more functionality. The downsides are that (a) it will require an internet connection and (b) it will send your queries to a remote server. I would rather not do this, but the GStreamer/Pocketsphinx setup was broken enough that most of the time it fell back to Google Talk anyway, which is no better.
Features in Saera 2:
  • Flexible grammar - the parsing is done by fuzzy matching rather than exact fit, so even if you say something I've never thought of, it should be able to make a decent guess as to what you mean.
  • Long-term and short-term memory. Short-term memory means that you can refer back to things you've already said recently (e.g. "What is the capital of France?" 'Paris.' "What's the weather like there?"). Long-term memory means that you can tell Saera facts and they will be remembered permanently (unless told to forget them).
  • A proper ability to read out emails; this is something I've found myself really wanting while driving.
  • Faster. I don't want to wait a minute to see if anything is happening, and I bet you don't either.
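
To give a rough idea of what "fuzzy matching rather than exact fit" means, here is a minimal local sketch. It is not how wit.ai works internally and not Saera's actual code: the intent names, example phrases, and the `guess_intent` function are all made up for illustration, and the similarity measure is just Python's standard `difflib`.

```python
import difflib

# Hypothetical intents and example phrasings (assumptions for this sketch,
# not Saera's real grammar).
INTENTS = {
    "weather": ["what is the weather like", "will it rain today"],
    "read_email": ["read my email", "check my inbox"],
    "set_alarm": ["set an alarm", "wake me up"],
}


def guess_intent(utterance, cutoff=0.5):
    """Return (intent, score) for the closest example phrase.

    If nothing is at least `cutoff` similar, the intent is None.
    """
    text = utterance.lower().strip()
    best_intent, best_score = None, 0.0
    for intent, examples in INTENTS.items():
        for example in examples:
            # ratio() is a similarity score between 0.0 and 1.0.
            score = difflib.SequenceMatcher(None, text, example).ratio()
            if score > best_score:
                best_intent, best_score = intent, score
    if best_score < cutoff:
        return None, best_score
    return best_intent, best_score
```

With this, `guess_intent("read my emails")` still lands on the hypothetical `read_email` intent even though that exact sentence is not in the list, while something unrelated falls below the cutoff and gives `None`.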

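The two kinds of memory could be sketched something like this. Everything here is an assumption for illustration (the `Memory` class, its method names, and the JSON file for storage); the real design may end up quite different.

```python
import json
import os
from collections import deque


class Memory:
    """Hypothetical sketch: short-term context plus persistent facts."""

    def __init__(self, path, short_term_size=5):
        self.path = path
        # Short-term: only the last few things mentioned are kept.
        self.recent = deque(maxlen=short_term_size)
        # Long-term: facts loaded from (and saved to) a JSON file.
        self.facts = {}
        if os.path.exists(path):
            with open(path) as f:
                self.facts = json.load(f)

    def mention(self, kind, value):
        """Record something that just came up, e.g. ("place", "Paris")."""
        self.recent.append((kind, value))

    def resolve(self, kind):
        """Return the most recently mentioned thing of this kind, or None."""
        for k, v in reversed(self.recent):
            if k == kind:
                return v
        return None

    def remember(self, key, value):
        """Store a fact permanently."""
        self.facts[key] = value
        self._save()

    def forget(self, key):
        """Drop a stored fact; unknown keys are ignored."""
        self.facts.pop(key, None)
        self._save()

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self.facts, f)
```

So after answering 'Paris.', the code would call `mention("place", "Paris")`, and a later "there" could be resolved with `resolve("place")`. Facts stored with `remember` survive a restart because they are written to disk, until `forget` is called.
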
The development version will make use of Python, GStreamer, Qt, eSpeak, and optionally Pocketsphinx. I'm hoping the reduced dependencies make it easier to port to Sailfish while still retaining compatibility with Maemo 5.

Thoughts?
 

The Following 18 Users Say Thank You to taixzo For This Useful Post: