Poll: Do you think we should expect voice dialing out of the box on the N900, or can Nokia expect the maem
Posts: 63 | Thanked: 29 times | Joined on May 2010
#151
+1 for "I don't care about voice dialing"
 
Posts: 1,994 | Thanked: 3,342 times | Joined on Jun 2010 @ N900: Battery low. N950: torx 4 re-used once and fine; SIM port torn apart
#152
Originally Posted by WereCatf View Post
It's one thing to match pre-recorded samples to microphone input, and an entirely different matter to do complete speech-to-text recognition. Try, for example, Google's own recognition system: even with all the processing power their servers possess, the recognition system can't achieve more than about 70% accuracy, and that is when the speaker speaks very, very clearly. If the speaker doesn't speak all that clearly, if there is any kind of background noise, if the person speaks some dialect, or if (s)he has some sort of accent, the correctness of recognition drops sharply.

Then there's the issue of the N900 being a small device with limited microphone capabilities: there is not enough processing power to do accurate recognition, and the microphone would receive sufficiently clear input only when the speaker is very near to it.
Google has to recognize any voice of any human. Personal speech-to-text recognition could be trained on pre-recorded samples of one human.

It shouldn't be more difficult to recognize speech than to recognize text, at least in some simpler languages (with strict relations between characters and sounds).

A better microphone could be connected through the jack, couldn't it?
I have at least two plug-in microphones, though I rarely use a microphone at all, on any device.

The most serious issue would be battery life. But for an in-car hands-free experience you can plug the N900 into the car charger.

Imagine: N900 in pocket, wired microphone clipped to the collar, some wired headset on the ear (not blocking outside sounds!), small solar panel on the bag; walking/bicycling in an unknown city. No need to look at the touchscreen to receive a call on the go or to read/send emails or to find out directions to your destination. Though for such a futuristic use case you would have to be in a city with free Wi-Fi, well mapped and sunny.
 
Posts: 244 | Thanked: 354 times | Joined on Jul 2010 @ Scotland
#153
Dead simple - "just port" from S60 *chuckles*

http://developer.symbian.org/main/do...v_voiceui.html
 

The Following User Says Thank You to gregoranderson For This Useful Post:
WereCatf's Avatar
Posts: 255 | Thanked: 160 times | Joined on Oct 2010 @ Finland
#154
Originally Posted by Wikiwide View Post
Google has to recognize any voice of any human. Personal speech-to-text recognition could be trained on pre-recorded samples of one human.
Indeed, it could be trained. But then there is the problem that you'd need a lot of training to reach 80% accuracy or more, and at that point it'd start using quite a lot of memory.

For example, there is the Dragon NaturallySpeaking application for PCs. It can be used to dictate free speech to text in, for example, Word, but even with training it never reaches 100% accuracy. After enough training it reaches sufficient accuracy for most people, I suppose, but it starts taking several hundred megabytes of memory.

As such I doubt it would be feasible on a device as limited as the N900. One way to get around the memory and performance hits would be to do the recognition on a server and just stream the microphone input there, but that'd create some lag between the input and output, and it still probably wouldn't be feasible over 3G.
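
To put rough numbers on the streaming idea, here is a back-of-the-envelope sketch. It assumes uncompressed PCM audio, and the uplink figure is a made-up placeholder rather than a measured 3G rate; a speech codec would change the picture considerably. The function names are hypothetical.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bitrate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def fits_uplink(bitrate_kbps, uplink_kbps):
    """True if the stream fits into the given uplink capacity."""
    return bitrate_kbps <= uplink_kbps


# Assumption: 16 kHz, 16-bit, mono microphone input.
raw_kbps = pcm_bitrate_kbps(16_000, 16)

# Assumption: placeholder uplink capacity, not a real measurement.
ASSUMED_UPLINK_KBPS = 64

print(raw_kbps, fits_uplink(raw_kbps, ASSUMED_UPLINK_KBPS))
```

Under these assumptions the raw stream does not fit, so compressing the audio (or sending extracted features instead) would be needed before the lag question even comes up.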
 

The Following User Says Thank You to WereCatf For This Useful Post:
Posts: 1,994 | Thanked: 3,342 times | Joined on Jun 2010 @ N900: Battery low. N950: torx 4 re-used once and fine; SIM port torn apart
#155
Originally Posted by WereCatf View Post
Indeed, it could be trained. But then there is the problem that you'd need a lot of training to reach 80% accuracy or more, and at that point it'd start using quite a lot of memory.

For example, there is the Dragon NaturallySpeaking application for PCs. It can be used to dictate free speech to text in, for example, Word, but even with training it never reaches 100% accuracy. After enough training it reaches sufficient accuracy for most people, I suppose, but it starts taking several hundred megabytes of memory.

As such I doubt it would be feasible on a device as limited as the N900. One way to get around the memory and performance hits would be to do the recognition on a server and just stream the microphone input there, but that'd create some lag between the input and output, and it still probably wouldn't be feasible over 3G.
Quick reply...
No server, please!

I hope there is a way during training of the program to store the training not as high-quality audio, but as light-weight patterns. Repeat the same word ten/twenty times with different intonations, and let it store the most basic acoustic pattern. And when it sees that a word pronounced by you fits several patterns, let it ask for additional samples of these words to make the patterns more exact and not to stumble again.
And several hundred megabytes isn't a lot for a device with 32 GB, though I would really prefer to use less space.
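
A minimal sketch of that training idea, assuming each recording has already been reduced to a short fixed-length feature vector (a big simplification; the two-number vectors and word names below are invented for illustration). It keeps only an averaged pattern per word and reports which other words are close enough to need extra samples.

```python
import math


def mean_pattern(samples):
    """Average several equal-length feature vectors into one compact pattern."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognise(features, patterns, margin=0.2):
    """Return (best_word, ambiguous_words).

    ambiguous_words lists every other word whose pattern is almost as
    close as the best one, i.e. words worth asking more samples for.
    """
    ranked = sorted(patterns, key=lambda w: distance(features, patterns[w]))
    best = ranked[0]
    best_d = distance(features, patterns[best])
    ambiguous = [w for w in ranked[1:]
                 if distance(features, patterns[w]) - best_d < margin]
    return best, ambiguous


# Hypothetical training data: two repetitions per word.
patterns = {
    "call": mean_pattern([[1.0, 0.0], [0.8, 0.2]]),
    "ball": mean_pattern([[0.9, 0.3], [1.1, 0.1]]),
    "stop": mean_pattern([[0.0, 1.0], [0.2, 0.8]]),
}

print(recognise([0.92, 0.12], patterns))
```

Here "call" wins but "ball" is flagged as too close, which is the point where the program would ask for more repetitions of both words. Storage is one small vector per word instead of the audio itself.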

Ideally, there would be one pattern for each letter, and nothing else.

Esperanto has exactly one sound for exactly one letter, if I'm not mistaken.

Esperanto speech recognition might be reasonably more accurate than English even before training, with default patterns for each letter. Though I haven't seen an Esperanto-based simple speech-recognition software yet.

And it would solve one more problem: the program wouldn't react to human conversation unless people spoke Esperanto, and in that case they would most likely remember that the computer speaks Esperanto, too.
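
The one-letter-one-sound idea can be sketched as a plain lookup table. The sound labels are approximate IPA and the function name is hypothetical; this only shows that the text-to-sound step needs no exception rules, not how the acoustic matching would work.

```python
# Approximate IPA for the 28 letters of the Esperanto alphabet.
ESPERANTO_SOUNDS = {
    "a": "a", "b": "b", "c": "ts", "ĉ": "tʃ", "d": "d", "e": "e", "f": "f",
    "g": "g", "ĝ": "dʒ", "h": "h", "ĥ": "x", "i": "i", "j": "j", "ĵ": "ʒ",
    "k": "k", "l": "l", "m": "m", "n": "n", "o": "o", "p": "p", "r": "r",
    "s": "s", "ŝ": "ʃ", "t": "t", "u": "u", "ŭ": "w", "v": "v", "z": "z",
}


def to_sounds(word):
    """Map a word to its sound sequence, one sound per letter."""
    return [ESPERANTO_SOUNDS[letter] for letter in word.lower()]


print(to_sounds("ĉambro"))
```

With one default pattern stored per entry in the table, any word in the dictionary could in principle be matched without recording it first.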
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Interesting links here:

http://www.freepatentsonline.com/EP1217609.html
http://www.freepatentsonline.com/y2002/0198712.html
http://www.freepatentsonline.com/y2002/0198715.html
http://www.hpl.hp.com/techreports/2001/HPL-2001-182.pdf
HEWLETT PACKARD, year 2001-2002
http://www.bartneck.de/publications/...jRoMan2009.pdf
http://www.bartneck.de/publications/...ila/index.html
http://www.bartneck.de/publications/...aluationROILA/

It seems that Hewlett Packard's CPL turned out to be difficult for humans to learn...

So now yet another artificial language comes from the Netherlands, trying to balance machine recognition and human learning.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I personally would prefer Esperanto: it has better vocabulary and is designed by humans for humans.

Let ROILA be used for machine-to-machine interaction.
 
Posts: 395 | Thanked: 165 times | Joined on May 2010 @ TMO
#156
Why is there no option for "No, I don't need voice dialing"?
 
Posts: 958 | Thanked: 483 times | Joined on May 2010
#157
how about matching pre-recorded samples to syllables or other basic building blocks of a word? that would make much more sense

so hello would be made up of 2 samples. one for hell and one for o.
sure there would be different samples for the same syllable since hell may not always sound the same depending on what comes after or before it. but i'm sure a system could be worked out.
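
a rough sketch of how that could be counted, assuming a hand-made word list split into units (the lexicon and the "#" end-of-word marker are made up for illustration). each unit is tagged with what comes after it, so hell before o gets its own sample, separate from hell on its own.

```python
# Hypothetical lexicon: each word is a sequence of syllable-like units.
LEXICON = {
    "hello": ("hell", "o"),
    "hell": ("hell",),
    "yellow": ("yell", "o"),
}


def context_units(units):
    """Pair each unit with its right-hand neighbour ("#" marks end of word)."""
    padded = list(units) + ["#"]
    return [(unit, padded[i + 1]) for i, unit in enumerate(units)]


def samples_needed(lexicon):
    """Distinct context-dependent samples the user would have to record."""
    needed = set()
    for units in lexicon.values():
        needed.update(context_units(units))
    return needed


print(sorted(samples_needed(LEXICON)))
```

three words need four samples here, and the final o is shared between hello and yellow. the number of samples grows with the number of distinct unit/neighbour pairs, not with the number of words.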
 

The Following User Says Thank You to droll For This Useful Post:
Posts: 2,829 | Thanked: 1,459 times | Joined on Dec 2009 @ Finland
#158
@droll
You do realize that words have different intonation, and that e.g.
Hell
Hello
sound quite different.

I would recommend that you all learn Finnish. It's pretty much spelled as it's written

"Finnish uses just 38 spellings for its 38 sounds. Other European languages have an average of 50 graphemes.

The Finnish orthography exemplifies alphabetic perfection. English spelling does the opposite. It uses 185 spellings for its 43½ sounds, and is the most irregular and hardest-to-master European writing system."
http://englishspellingproblems.blogs...rom-other.html
http://en.wikipedia.org/wiki/Finnish...ge#Orthography

But, well, for some reason I do not believe that learning Finnish is a rational thing to do
 

The Following User Says Thank You to slender For This Useful Post:
Posts: 422 | Thanked: 244 times | Joined on Feb 2008
#159
Originally Posted by slender View Post
I would recommend that you all learn Finnish. It's pretty much spelled as it's written

But, well, for some reason I do not believe that learning Finnish is a rational thing to do
I think you should stick to your guns and encourage Finnish as the primary language for maemo, weird noun declensions aside. You can teach us all.

When should I return here for the first lesson? I think a speedy approach would be best so as to ease the transition. Do you think we can get Finnish covered in two weeks?
 

The Following 2 Users Say Thank You to paulkoan For This Useful Post:
ZogG's Avatar
Posts: 1,389 | Thanked: 1,857 times | Joined on Feb 2010 @ Israel
#160
yes - but only for headsets
 