maemo.org - Talk (https://talk.maemo.org/index.php)
-   Maemo 5 / Fremantle (https://talk.maemo.org/forumdisplay.php?f=40)
-   -   Poll - do you want voice dialing on the N900 (https://talk.maemo.org/showthread.php?t=34462)

joelteixeira 2010-11-02 12:02

Re: Poll - do you want voice dialing on the N900
 
+1 for "I don't care about voice dialing"

Wikiwide 2010-11-02 12:03

Re: Poll - do you want voice dialing on the N900
 
Quote:

Originally Posted by WereCatf (Post 860814)
It's one thing to match pre-recorded samples to microphone input, and an entirely different matter to do complete speech-to-text recognition. Try, for example, Google's own recognition system: even with all the processing power their servers possess, the recognition system can't achieve more than about 70% accuracy, and that is when the speaker speaks very, very clearly. If the speaker doesn't speak all that clearly, if there is any kind of background noise, if the person speaks some dialect, or if (s)he has some sort of accent, the correctness of recognition drops sharply.

Then there's the issue of the N900 being a small device with limited microphone capabilities: there is not enough processing power to do accurate recognition, and the microphone would receive sufficiently clear input only when spoken to from very near.

Google has to recognize any voice of any human. Personal speech-to-text recognition could be trained on pre-recorded samples of one human.

It shouldn't be more difficult to recognize speech than to recognize text, at least in some simpler languages (with strict relations between characters and sounds).

A better microphone could be connected through the jack, couldn't it?
I have at least two plug-in microphones, though I rarely use a microphone at all, on any device.

The most serious issue would be battery life. But for an in-car hands-free experience you can plug the N900 into the car charger.

Imagine: N900 in pocket, wired microphone clipped to the collar, some wired headset on the ear (not blocking outside sounds!), small solar panel on the bag; walking/bicycling in an unknown city. No need to look at the touchscreen to receive a call on the go or to read/send emails or to find out directions to your destination. Though for such a futuristic use case you would have to be in a city with free Wi-Fi, well mapped and sunny.

gregoranderson 2010-11-02 12:08

Re: Poll - do you want voice dialing on the N900
 
Dead simple - "just port" from S60 *chuckles*

http://developer.symbian.org/main/do...v_voiceui.html

WereCatf 2010-11-02 13:06

Re: Poll - do you want voice dialing on the N900
 
Quote:

Originally Posted by Wikiwide (Post 860838)
Google has to recognize any voice of any human. Personal speech-to-text recognition could be trained on pre-recorded samples of one human.

Indeed, it could be trained. But then there is the problem that you'd need a lot of training to reach 80% accuracy or more, and at that point it'd start using quite a lot of memory.

For example, there is the Dragon NaturallySpeaking application for PCs. It can be used to dictate free speech to text in, for example, Word, but even with training it never reaches 100% accuracy. After enough training it reaches sufficient accuracy for most people, I suppose, but it starts taking several hundred megabytes of memory.

As such I doubt it would be feasible on a device as limited as the N900. One way around the memory and performance hits would be to do the recognition on a server and just stream the microphone input there, but that would create some lag between the input and output, and it still probably wouldn't be feasible over 3G.
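
To put a rough number on the streaming idea, here is a small back-of-the-envelope sketch in Python. It only computes the bit rate of uncompressed PCM audio; the uplink figure and the `headroom` factor are placeholder assumptions for illustration, not measurements of any real 3G network, and a real client would most likely compress the audio first.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def fits_uplink(bitrate_kbps, uplink_kbps, headroom=0.5):
    """True if the stream needs at most `headroom` of the assumed uplink.

    `uplink_kbps` and `headroom` are hypothetical inputs; plug in whatever
    your own connection actually delivers.
    """
    return bitrate_kbps <= uplink_kbps * headroom


# Telephone-quality and wideband speech, 16-bit mono.
narrowband = pcm_bitrate_kbps(8000, 16)
wideband = pcm_bitrate_kbps(16000, 16)

if __name__ == "__main__":
    assumed_uplink = 384  # kbit/s, a placeholder value only
    for name, rate in (("8 kHz", narrowband), ("16 kHz", wideband)):
        print(name, rate, "kbit/s, fits:", fits_uplink(rate, assumed_uplink))
```

The raw bit rate is only half of the story: the lag mentioned above depends on round-trip time and server processing, which this sketch does not model.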

Wikiwide 2010-11-03 02:45

Re: Poll - do you want voice dialing on the N900
 
Quote:

Originally Posted by WereCatf (Post 860905)
Indeed, it could be trained. But then there is the problem that you'd need a lot of training to reach 80% accuracy or more, and at that point it'd start using quite a lot of memory.

For example, there is the Dragon NaturallySpeaking application for PCs. It can be used to dictate free speech to text in, for example, Word, but even with training it never reaches 100% accuracy. After enough training it reaches sufficient accuracy for most people, I suppose, but it starts taking several hundred megabytes of memory.

As such I doubt it would be feasible on a device as limited as the N900. One way around the memory and performance hits would be to do the recognition on a server and just stream the microphone input there, but that would create some lag between the input and output, and it still probably wouldn't be feasible over 3G.

Quick reply...
No server, please!

I hope there is a way during training of the program to store the training not as high-quality audio, but as light-weight patterns. Repeat the same word ten/twenty times with different intonations, and let it store the most basic acoustic pattern. And when it sees that a word pronounced by you fits several patterns, let it ask for additional samples of these words to make the patterns more exact and not to stumble again.
And several hundred megabytes isn't a lot for a device with 32 GB, though I would really prefer to use less space.
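
A minimal sketch of what I mean, in Python. It assumes each recording has already been reduced to a short, fixed-length list of numbers (a hypothetical feature vector; real recognizers typically use things like dynamic time warping or hidden Markov models over variable-length audio). The class name `WordTemplates` and the `ambiguity_ratio` threshold are made up for illustration.

```python
import math


def mean_vector(samples):
    """Average several equal-length feature vectors into one template."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class WordTemplates:
    """Stores one light-weight averaged pattern per trained word."""

    def __init__(self, ambiguity_ratio=0.8):
        self.templates = {}
        # A runner-up counts as ambiguous when the best distance is at
        # least this fraction of the runner-up's distance.
        self.ambiguity_ratio = ambiguity_ratio

    def train(self, word, samples):
        """Replace the stored pattern with the mean of the repetitions."""
        self.templates[word] = mean_vector(samples)

    def recognize(self, features):
        """Return (best_word, words_that_need_more_samples)."""
        dists = {w: distance(features, t) for w, t in self.templates.items()}
        ranked = sorted(dists, key=dists.get)
        best = ranked[0]
        ambiguous = [
            w for w in ranked[1:]
            if dists[best] >= self.ambiguity_ratio * dists[w]
        ]
        return best, ambiguous
```

Only the averaged vectors are kept, not the audio, so storage grows with the number of words rather than with the number of repetitions. When `recognize` returns a non-empty second element, the program could ask for extra samples of those words.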

Ideally, there would be one pattern for each letter, and nothing else.

Esperanto has exactly one sound for exactly one letter, if I'm not mistaken.

Esperanto speech recognition might be reasonably more accurate than English even before training, with default patterns for each letter. Though I haven't seen an Esperanto-based simple speech-recognition software yet.
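
To illustrate why a one-letter-one-sound spelling helps, here is a small Python sketch: turning a written Esperanto word into its expected sounds is a plain table lookup. The sound symbols are approximate IPA values as I understand them, and the helper name `to_phonemes` is made up; this says nothing about how hard the acoustic side of recognition would be.

```python
# The 28 letters of the Esperanto alphabet with approximate IPA values.
ESPERANTO_IPA = {
    "a": "a", "b": "b", "c": "ts", "ĉ": "tʃ", "d": "d", "e": "e",
    "f": "f", "g": "g", "ĝ": "dʒ", "h": "h", "ĥ": "x", "i": "i",
    "j": "j", "ĵ": "ʒ", "k": "k", "l": "l", "m": "m", "n": "n",
    "o": "o", "p": "p", "r": "r", "s": "s", "ŝ": "ʃ", "t": "t",
    "u": "u", "ŭ": "w", "v": "v", "z": "z",
}


def to_phonemes(word):
    """Map each letter to its single sound; raises KeyError on other characters."""
    return [ESPERANTO_IPA[letter] for letter in word.lower()]


if __name__ == "__main__":
    print(to_phonemes("saluton"))
    print(to_phonemes("ĉu"))
```

The same function for English would need a pronunciation dictionary instead of a 28-entry table.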

And it would solve one more problem: the program wouldn't react to human conversation unless people spoke Esperanto, and in that case people would most likely remember that the computer speaks Esperanto, too.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Interesting links here:

http://www.freepatentsonline.com/EP1217609.html
http://www.freepatentsonline.com/y2002/0198712.html
http://www.freepatentsonline.com/y2002/0198715.html
http://www.hpl.hp.com/techreports/2001/HPL-2001-182.pdf
HEWLETT PACKARD, year 2001-2002
http://www.bartneck.de/publications/...jRoMan2009.pdf
http://www.bartneck.de/publications/...ila/index.html
http://www.bartneck.de/publications/...aluationROILA/

It seems that Hewlett Packard's CPL turned out to be difficult for humans to learn...

So now yet another artificial language comes from the Netherlands, trying to balance machine recognition and human learning.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I personally would prefer Esperanto: it has better vocabulary and is designed by humans for humans.

Let ROILA be used for machine-to-machine interaction.

rickysio 2010-11-03 03:25

Re: Poll - do you want voice dialing on the N900
 
Why is there no option for "No, I don't need voice dialing"?

droll 2010-11-03 03:27

Re: Poll - do you want voice dialing on the N900
 
how about matching pre-recorded samples to syllables or other basic building blocks of a word? that would make much more sense :)

so "hello" would be made up of 2 samples: one for "hell" and one for "o".
sure, there would be different samples for the same syllable, since "hell" may not always sound the same depending on what comes before or after it. but i'm sure a system could be worked out.
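
a rough sketch of the building-block idea in Python. it works on spelled-out units as a stand-in for recorded samples, which is a big simplification; the function name `split_into_units` and the example unit set are made up for illustration.

```python
def split_into_units(word, units):
    """Return one way to split `word` into known units, or None if impossible."""
    # covered[n] holds a list of units that spells out the first n letters.
    covered = {0: []}
    for end in range(1, len(word) + 1):
        for start in range(end):
            piece = word[start:end]
            if start in covered and piece in units:
                covered[end] = covered[start] + [piece]
                break  # the earliest start gives the longest final unit
    return covered.get(len(word))


if __name__ == "__main__":
    known_units = {"hell", "o", "he"}
    print(split_into_units("hello", known_units))
    print(split_into_units("help", known_units))
```

a real system would compare audio against each unit's samples instead of comparing letters, and would need the context-dependent variants mentioned above.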

slender 2010-11-03 07:04

Re: Poll - do you want voice dialing on the N900
 
@droll
You do realize that words have different intonation? E.g. "hell" and "hello" sound quite different.

I would recommend that you all learn Finnish. It's pretty much pronounced as it's written :)

"Finnish uses just 38 spellings for its 38 sounds. Other European languages have an average of 50 graphemes.

The Finnish orthography exemplifies alphabetic perfection. English spelling does the opposite. It uses 185 spellings for its 43½ sounds, and is the most irregular and hardest-to-master European writing system."
http://englishspellingproblems.blogs...rom-other.html
http://en.wikipedia.org/wiki/Finnish...ge#Orthography

But, well, for some reason I do not believe that learning Finnish is a rational thing to do :)

paulkoan 2010-11-03 07:22

Re: Poll - do you want voice dialing on the N900
 
Quote:

Originally Posted by slender (Post 861793)
I would recommend that you all learn Finnish. It's pretty much pronounced as it's written :)

But, well, for some reason I do not believe that learning Finnish is a rational thing to do :)

I think you should stick to your guns and encourage Finnish as the primary language for maemo, weird noun declensions aside. You can teach us all.

When should I return here for the first lesson? I think a speedy approach would be best so as to ease the transition. Do you think we can get Finnish covered in two weeks?

ZogG 2010-11-03 07:54

Re: Poll - do you want voice dialing on the N900
 
yes - but only for headsets

