Poll: What advanced text entry method(s) would you like to see on Sailfish?
Posts: 27 | Thanked: 35 times | Joined on Jan 2016 @ Sweden
#271
@ferlanero: what OS did you use to compile the files?
 
Posts: 105 | Thanked: 205 times | Joined on Dec 2015 @ Spain
#272
Originally Posted by spidernik84
@ferlanero: what OS did you use to compile the files?
Arch Linux, but I still can't write words containing the letter "ñ", such as España.
 
Feathers McGraw's Avatar
Posts: 654 | Thanked: 2,368 times | Joined on Jul 2014 @ UK
#273
Originally Posted by ferlanero
ArchLinux
 
Posts: 86 | Thanked: 362 times | Joined on Dec 2007 @ Paris / France
#274
Originally Posted by ferlanero
I can't write words with letter "ñ" like España
Make sure you use the real "N" key and not "Ñ" (this is a known limitation that has been discussed earlier, and I have some ideas on how to fix it).

If it still does not work, send me your language files and logs (you have to enable logging in the application; the logs can be found in ~/.local/share/okboard). And do not expect a quick answer (I have not even started working on the weeks-old transparency issue).


BTW, last week I worked on the engine that processes swipes. It is not much more accurate now, but it no longer assigns bad scores to expected results. Next week I will start re-implementing the prediction engine (the part that handles the language model and learning; at the moment it is completely broken), and after that I will go back to regular bug fixing.
 

The Following 14 Users Say Thank You to eber42 For This Useful Post:
Posts: 105 | Thanked: 205 times | Joined on Dec 2015 @ Spain
#275
Originally Posted by eber42
Make sure you use the real "N" key and not "Ñ" (this is a known limitation that has been discussed earlier, and I have some ideas on how to fix it).

If it still does not work, send me your language files and logs (you have to enable logging in the application; the logs can be found in ~/.local/share/okboard). And do not expect a quick answer (I have not even started working on the weeks-old transparency issue).


BTW, last week I worked on the engine that processes swipes. It is not much more accurate now, but it no longer assigns bad scores to expected results. Next week I will start re-implementing the prediction engine (the part that handles the language model and learning; at the moment it is completely broken), and after that I will go back to regular bug fixing.
Hi eber42! Thank you very much for your answer. Yes, if I swipe over N instead of Ñ for words containing that letter, it works perfectly. So I'll mention this on the OpenRepos page for the Spanish OKBoard language files. Thank you for all your efforts! It would still be nice if swiping over Ñ worked. If you need testers, please ask me.

Thank you very much!
 
Posts: 10 | Thanked: 12 times | Joined on Jun 2013
#276
Originally Posted by ferlanero
The steps to do that are these:

- You need a Linux environment (I'm using Arch Linux, but Ubuntu or another distribution works too)

- You need to download the tarball first: http://git.tuxfamily.org/okboard/okb...master.tar.bz2 and uncompress it in your home directory

- You need the dictionaries. I took mine from https://github.com/titoBouzout/Dictionaries but the file needs to be adjusted, so I attach the already processed file (see Spanish.dic.txt.zip on this post)

- You need the corpora files for your language (e.g. Spanish)
http://corpora2.informatik.uni-leipzig.de/download.html
http://www.cs.upc.edu/~nlp/wikicorpus/
http://opus.lingfil.uu.se/OpenSubtitles2016.php
http://www.lllf.uam.es/ESP/Corlec.html
https://tatoeba.org/spa/downloads

- You need the "aspell-es" package (in the case of Spanish) installed from your distro's repos.

- You need the "lbzip2" package installed on your system too.

- You need "rsync" installed on your system.

- You need Qt 5 installed on your system.

- Now you need to create a folder somewhere and put the dictionary inside (e.g. /home/username/okboard/langs)

- If you have several corpora files, concatenate them:

Code:
cat file1 file2 file3 file4 file5 > corpus-es.txt
- Open a terminal window

- And set the two environment variables:

Code:
export CORPUS_DIR=/home/username/okboard/langs
Code:
export WORK_DIR=/home/username/okboard/langs
- You can see those variables with

Code:
echo $VARIABLE_NAME
if you're curious

- You need to compress the file (Spanish.dic.txt) that you put in /home/username/okboard/langs earlier:

Code:
bzip2 Spanish.dic.txt
- The file should now be named corpus-$LANG.txt.bz2; in our case, corpus-es.txt.bz2 for Spanish.

- There should be a single file inside.

- The next step is to change, in the same terminal window, to the directory that holds the okboard source code; in our case "/home/username/okb-engine-master/".

Code:
cd /home/username/okb-engine-master/
- In the 'db' folder you must first create a lang-es.cf file. You can copy it from another .cf file in the same folder (e.g. copy lang-en.cf and rename the copy to lang-es.cf)

- Leave only ASCII characters in those files:

Code:
lbzip2 -d < corpus.txt.bz2 | clean_corpus.py | lbzip2 > new_corpus.txt.bz2
- Execute
Code:
db/build.sh es
("es" in case of Spanish)

- After this, the script creates the dictionaries for OKBoard, leaving the following files:

add-words-fr.txt
es-predict.dict
lang-fr.cf
clusters-es.log
es-test.txt.bz2
lang-nl.cf
clusters-es.txt
es.tre
predict-es.db
corpus-es.txt.bz2
grams-es-full.csv.bz2
predict-es.ng
db.version
grams-es-learn.csv.bz2
predict-es.rpt.bz2
es-full.dict
grams-es-test.csv.bz2
predict-es.txt.bz2
es-full.tre
lang-en.cf
words-es.txt
es-learn.txt.bz2
lang-es.cf

- So, now we have the Spanish dictionary created.

After this, I don't know what to do with these files, so any help is welcome.
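
The preparation steps above can be sketched as two small shell helpers. This is only an illustration under assumptions, not part of OKBoard: the function names `missing_tools` and `build_corpus` and the `COMPRESSOR` override are made up for this sketch, and the Qt 5 check is left out because the steps do not say which Qt binary is needed.

```shell
#!/bin/sh
# Print every command from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Concatenate plain-text corpus files and compress them to
# $CORPUS_DIR/corpus-<lang>.txt.bz2, the file name used in the steps above.
# COMPRESSOR is a hypothetical override for this sketch (default: lbzip2).
build_corpus() {
    lang=$1
    shift
    out="$CORPUS_DIR/corpus-$lang.txt.bz2"
    cat "$@" | ${COMPRESSOR:-lbzip2} > "$out" && printf '%s\n' "$out"
}

# Example run (paths are placeholders):
#   missing_tools lbzip2 rsync aspell
#   export CORPUS_DIR=/home/username/okboard/langs
#   build_corpus es file1 file2 file3
```

`missing_tools` prints nothing when every listed tool is installed, and `build_corpus` prints the path of the file it wrote. Piping through the compressor mirrors the `lbzip2 -d < ... | ... | lbzip2 > ...` style used in the steps.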

-----------------------------------


I'm trying to add Finnish support to OKBoard, but I have to ask you guys for some tips. I'm not experienced in stuff like this. Anyway, here is my current checklist:

1) I have a Linux distribution to use

2) I've downloaded the OKBoard tarball

3) Dictionaries... There's no dictionary file for Finnish at the link provided.

4) Corpora file. I first tried to use http://www.corpora.heliohost.org/download.html but the file (the 2016 version) has a CRC error, so I ended up getting the Finnish version from here instead: http://opus.lingfil.uu.se/OpenSubtitles2016.php

5) I think Finnish spellchecking doesn't use aspell, but the Malaga-based Voikko: http://voikko.puimula.org/ and, if I have understood correctly, Voikko is used by ispell, for example. But how to get that Finnish dictionary file is somewhat unclear to me.

After all this is done I could try to move forward, but there still seems to be a lot of work. Also, what do you think: would it be good to include some additional sources too, such as more official ones ( http://kielitoimistonsanakirja.fi / http://kaino.kotus.fi/sanat/nykysuomi/ )? And with multiple sources, how can duplicates be removed easily?
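
On the duplicates question, a minimal sketch: assuming every source has already been reduced to a plain word list with one word per line, `sort -u` merges the lists and keeps a single copy of each entry. The function name `merge_wordlists` and the file names in the example are hypothetical.

```shell
#!/bin/sh
# Merge word lists (one word per line) and drop duplicate entries.
# LC_ALL=C sorts by byte value, so the result does not depend on the locale.
merge_wordlists() {
    LC_ALL=C sort -u -- "$@"
}

# Example (file names are placeholders):
#   merge_wordlists kotus.txt opensubtitles-words.txt > words-fi.txt
```

The comparison is exact, so "Talo" and "talo" count as two different entries; normalise case first if that is not wanted.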
 

The Following 2 Users Say Thank You to uggeli For This Useful Post:
Posts: 27 | Thanked: 35 times | Joined on Jan 2016 @ Sweden
#277
Originally Posted by uggeli
5) I think Finnish spellchecking doesn't use aspell, but Malaga based Voikko: http://voikko.puimula.org/ and if I'm not misunderstood, voikko is used by ispell for example. But how to get that finnish dictionary file is somehow unclear to me.
After a quick look, I could not find any packaged Finnish dict either.
I did a search and came up with this file: https://packetstormsecurity.com/file...innish.gz.html

My knowledge of Finnish is limited to "Tervetuloa", "Kiitos", "Rautatientori" and some mixed insults that I'd rather not write, so I'm not the best person to judge if that file is free of spelling mistakes.
I think it's not the best dict, since it seems to contain even occasional English and Italian words, but it's a good start.

The okboard readme explains how to name the file and where to put it to bypass the aspell dict.

Good luck again!
 
Posts: 635 | Thanked: 1,535 times | Joined on Feb 2014 @ Germany
#278
Why don't you use aspell-fi?

Finnish corpora files are available at: http://corpora2.informatik.uni-leipzig.de/download.html
 
Posts: 10 | Thanked: 12 times | Joined on Jun 2013
#279
Well, I'm now test-driving Fedora and still have to get used to it. So I first just looked in the "software center" and there was no aspell-fi. I then checked via the command line and it seems to be there, but... If I'm not wrong, all spelling efforts have gone into Voikko for years now. I don't know of any distro that uses anything other than Voikko as the default for Finnish. But then again, as I said, this whole thing is so new to me that it is more than possible that someone else will have to do this in the end. Until then I can give it a try and study things when I have some time.
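
To see what an installed aspell dictionary such as aspell-fi actually contains, aspell can print its main word list with `dump master`. The wrapper below is a sketch; the function name `aspell_words` is made up, and this is not necessarily how db/build.sh uses aspell internally.

```shell
#!/bin/sh
# Print the main word list of an installed aspell dictionary, one entry per line.
# Returns 1 with a message on stderr when aspell is not on PATH.
aspell_words() {
    if ! command -v aspell >/dev/null 2>&1; then
        echo "aspell not found" >&2
        return 1
    fi
    aspell --lang="$1" dump master
}

# Example: count the entries in the Finnish dictionary
#   aspell_words fi | wc -l
```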
 

The Following 2 Users Say Thank You to uggeli For This Useful Post:
Posts: 25 | Thanked: 7 times | Joined on Nov 2012
#280
Is OKBoard getting Cyrillic support anytime soon? I am interested in using English along with Russian.
 