2012-06-21
, 17:10
|
Posts: 3,328 |
Thanked: 4,476 times |
Joined on May 2011
@ Poland
|
#252
|
One thing that I hope for is runtime model/dictionary switching, to allow full dictation when sending e.g. a text. Hopefully I can use some GStreamer trickery (like using a filesink at the end of the pipeline) and retry recognition with a larger dictionary/model if the score is low. I hope there is some way to extract the score; if not, I may have to dive into the plugin code and find a way to extract it.
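The retry idea could look roughly like the sketch below. It is not taken from Saera: `recognize_with_fallback`, the `decoders` list and the threshold are hypothetical, and each decoder is assumed to be a callable that takes the path of the recorded audio and returns a `(hypothesis, score)` pair, with higher scores meaning more confidence.

```python
def recognize_with_fallback(audio_path, decoders, threshold):
    """Try decoders in order (smallest model first).

    Returns the first (hypothesis, score) whose score reaches the
    threshold. If none does, returns the best-scoring attempt.
    All names here are hypothetical, not part of any plugin API.
    """
    best = None
    for decode in decoders:
        hyp, score = decode(audio_path)
        if best is None or score > best[1]:
            best = (hyp, score)
        if score >= threshold:
            # Good enough: skip the larger, slower models.
            break
    return best
```

Ordering the decoders from small to large keeps the common case fast and only pays for the big model when the small one is unsure.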
2012-06-21
, 17:14
|
Posts: 1,523 |
Thanked: 1,997 times |
Joined on Jul 2011
@ not your mom's FOSS basement
|
#253
|
That German translation is absolutely correct except for two minor spelling errors. It should be "Wusstest Du, dass das N900 Mac OSX verwenden kann?" instead of "Wußtest Du, das das N900 Mac OSX verwenden kann?", but this does not influence espeak's pronunciation.
The Following 3 Users Say Thank You to don_falcone For This Useful Post: | ||
2012-06-21
, 17:18
|
Posts: 959 |
Thanked: 3,427 times |
Joined on Apr 2012
|
#254
|
Hi taixzo
Originally posted by taixzo:
I am not sure whether I made myself clear: the patched plugin already delivers the score. To stay compatible, it sends an additional message (called 'result_score') right before it sends the 'result' message, so after receiving 'result_score' the 'result' message can be ignored. If you want, I can send you that plugin.
My app always aims to keep the current dictionary/language model as small as possible (for better recognition accuracy) and switches the dict/lm according to its internal context. For example, imagine a small command set that allows launching a new app. After the voice-control app has recognized the command 'launch', it switches its context to 'I have to launch something' and loads a new, appropriate dictionary that contains only the names of the programs that might be launched, and so on.
Perhaps this may also be an approach for your saera.
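A minimal sketch of that context switching, under stated assumptions: the file names, the `CONTEXTS` and `TRANSITIONS` tables and the `ContextSwitcher` class are made up for illustration, and `load_models` stands in for whatever actually hands a dictionary and language model to the recognizer.

```python
# Hypothetical dictionary / language-model files per context.
CONTEXTS = {
    "commands": {"dict": "commands.dic", "lm": "commands.lm"},
    "launch": {"dict": "apps.dic", "lm": "apps.lm"},
}

# (current context, recognized first word) -> next context.
TRANSITIONS = {("commands", "launch"): "launch"}


class ContextSwitcher(object):
    def __init__(self, load_models, start="commands"):
        # load_models(dict_path, lm_path) is an assumed callback that
        # reconfigures the recognizer; it is not a real plugin call.
        self.load_models = load_models
        self.root = start
        self.context = None
        self.switch(start)

    def switch(self, context):
        models = CONTEXTS[context]
        self.load_models(models["dict"], models["lm"])
        self.context = context

    def on_result(self, hyp):
        """Pick the next context from the first recognized word."""
        words = hyp.lower().split()
        first = words[0] if words else ""
        nxt = TRANSITIONS.get((self.context, first))
        if nxt is not None:
            self.switch(nxt)
        elif self.context != self.root:
            # A sub-context handled its answer; return to the commands.
            self.switch(self.root)
        return self.context
```

Keeping the transitions in a table means a new command only needs a new entry and a matching pair of model files.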
The Following User Says Thank You to taixzo For This Useful Post: | ||
2012-06-21
, 17:21
|
Posts: 959 |
Thanked: 3,427 times |
Joined on Apr 2012
|
#255
|
Thanks. And you are somewhat right on the second part ('das das' is wrong, there should be a sharp 's' as the first one, i.e. 'daß das').
But i'm afraid i have to clarify that i'm an evangelist of the true (literally old school) German language rules, and therefore i strongly detest the silly, unwanted (by most of the population) spelling reforms forced upon us by the Government in 1996 - so, e.g., no double 's' etc. for me, but 'ß'.
The Following User Says Thank You to taixzo For This Useful Post: | ||
2012-06-21
, 17:25
|
Posts: 7 |
Thanked: 17 times |
Joined on Jun 2012
|
#256
|
i strongly detest the silly, unwanted (by most of the population) spelling reforms
2012-06-21
, 17:51
|
Posts: 1,523 |
Thanked: 1,997 times |
Joined on Jul 2011
@ not your mom's FOSS basement
|
#257
|
2012-06-21
, 17:52
|
Posts: 7 |
Thanked: 17 times |
Joined on Jun 2012
|
#258
|
    import gst  # PyGST 0.10 bindings (Python 2)


    class Saera:
        def __init__(self):
            # Becomes True once the patched plugin has sent 'result_score'
            self.result_score = False

        def init_gst(self):
            """Initialize the speech components"""
            self.pipeline = gst.parse_launch('pulsesrc ! audioconvert ! audioresample '
                                             + '! vader name=vad auto-threshold=true '
                                             + '! pocketsphinx name=asr ! fakesink')
            asr = self.pipeline.get_by_name('asr')
            asr.connect('partial_result', self.asr_partial_result)
            asr.connect('result', self.asr_result)
            asr.connect('result_score', self.asr_result_score)
            asr.set_property('configured', True)

        def asr_result_score(self, asr, text, score):
            """Forward result signals on the bus to the main thread."""
            struct = gst.Structure('result_score')
            struct.set_value('hyp', text)
            struct.set_value('score', score)
            asr.post_message(gst.message_new_application(asr, struct))

        def application_message(self, bus, msg):
            """Receive application messages from the bus."""
            msgtype = msg.structure.get_name()
            if msgtype == 'partial_result':
                self.partial_result(msg.structure['hyp'], msg.structure['uttid'])
            elif msgtype == 'result_score':
                self.result_score = True
                self.final_result_score(msg.structure['hyp'], msg.structure['score'])
                # self.pipeline.set_state(gst.STATE_PAUSED)
            elif msgtype == 'result' and not self.result_score:
                # Unpatched plugin: only the plain 'result' message arrives
                self.final_result(msg.structure['hyp'], msg.structure['uttid'])

        def final_result_score(self, hyp, score):
            """Insert the final result."""
            # All this stuff appears as one single action
            print "Final Result: ", hyp, " score: ", score
            # Only act on hypotheses whose score is above the cutoff
            if int(score) > -18500000:
                self.run_saera(None, "speech-event", hyp)
The Following 8 Users Say Thank You to myra For This Useful Post: | ||
2012-06-23
, 20:44
|
|
Posts: 5,028 |
Thanked: 8,613 times |
Joined on Mar 2011
|
#259
|
2012-06-24
, 01:17
|
Posts: 959 |
Thanked: 3,427 times |
Joined on Apr 2012
|
#260
|
taixzo, it's absolutely wonderful and amazing what this project has evolved into in just a few days' time. I'm absolutely sure that You're one of the favorites for the Coding Competition with Saera, so please don't forget to apply there.
As for the issues with upload permissions, we're working hard with our technical contact to fix them. Of course, I could also lend You my garage account with upload permissions, but I think it's quite pointless - we need the whole procedure working smoothly.
As for the program itself - again, what amazes me most is that it's not simply a "speech recognition" program, but a first attempt to bring AI to our N900. Sure, I also don't expect it to pass the Turing test soon (not that the Turing test is a good measurement of intelligence, anyway). I really hope that being a basic - and developed - AI won't disappear from the scope of Saera for the sake of functionality.
BTW, what's the current status of the possibility to change her name? For reasons too long to explain here, it would be very useful in my use case.
Also, I'm quite new to this whole digitized speech thing - even if I do, let's say, a Polish corpus.txt, how do I ensure that text written in correct Polish will be pronounced correctly?
I know You're a busy guy here, so I'm not asking for a (re)written tutorial - maybe some link to documentation? Or are the things already present in the package everything I need?
/Estel
espeak_cmdline = "espeak -vCC+f2"
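The `CC` in that line looks like a stand-in for a language code. Assuming that reading, a Polish voice would be selected with `pl` (espeak's `-v` option takes a voice name, and `+f2` picks one of its female variants). The helper below is a hypothetical sketch of filling in the code, not something from the Saera package.

```python
import shlex


def espeak_command(language, variant="f2",
                   template="espeak -v{lang}+{variant}"):
    """Build the espeak argument list for a given language code.

    The template mirrors the espeak_cmdline string above, with CC
    replaced by a placeholder; the function name is hypothetical.
    """
    return shlex.split(template.format(lang=language, variant=variant))


def speak_args(text, language):
    """Argument list that would make espeak say `text`."""
    return espeak_command(language) + [text]
```

For example, `speak_args("Dzień dobry", "pl")` yields an argument list that could be passed to `subprocess.call` on a system with espeak installed.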
Tags |
saera, speech-to-text |
|
Last edited by myra; 2012-06-21 at 17:13.