#178
Originally Posted by ferlanero
And another error.

When I activate OKBoard, the program automatically deletes the files (`$LANG.tre`, `predict-$LANG.ng`, `predict-$LANG.db`), in my case (`es.tre`, `predict-es.ng`, `predict-es.db`), from `~/.local/share/okboard/`

So maybe the README.md needs more support...
Yes, that instruction should be removed. For the language resource creation steps above, though, I have already submitted a patch that makes the process smoother to follow. Recreating major parts of the documentation in individual posts here is not the way to do it.
We have now found your main problem. The question could be formulated like this: "How do I create a text corpus of my language of choice?"
Answer: You collect texts in Spanish according to any of the tutorials on the subject.
Alternative answer: You download texts of the types you highlighted: Wikipedia dumps, blog dumps, IRC log dumps, SMS conversation dumps, etc. You then paste all the texts together, remove all non-ASCII/non-Latin-1 characters (e.g. with iconv, Python, Perl or another command-line tool) and follow the processing instructions in the README.md file.
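To make the "paste together and strip" step concrete, here is a minimal Python sketch. It is only an illustration, not part of OKBoard: the function names and file names are made up for this example, and it assumes the downloaded dumps are plain UTF-8 text and that writing the result as UTF-8 is acceptable. Check the README.md for what the processing scripts actually expect.

```python
import sys
from pathlib import Path


def to_latin1_text(text: str) -> str:
    """Drop every character that has no Latin-1 (ISO-8859-1) encoding."""
    # Encoding with errors="ignore" silently skips unencodable characters;
    # decoding again gives back a normal Python string.
    return text.encode("latin-1", errors="ignore").decode("latin-1")


def build_corpus(input_paths, output_path):
    """Concatenate text files into one corpus file, keeping only Latin-1 characters."""
    with open(output_path, "w", encoding="utf-8") as out:
        for path in input_paths:
            # Assumption: inputs are UTF-8. Undecodable bytes become U+FFFD,
            # which is not Latin-1 and is therefore dropped in the next step.
            raw = Path(path).read_text(encoding="utf-8", errors="replace")
            out.write(to_latin1_text(raw))
            out.write("\n")


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python build_corpus.py corpus-es.txt dump1.txt dump2.txt
    build_corpus(sys.argv[2:], sys.argv[1])
```

Spanish letters such as ñ, á and ¿ are part of Latin-1 and survive this filter, while things like CJK text or emoji are removed.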

If you run into further problems, formulate a minimal example of where you get stuck. Don't speculate; just put it in the form of a simple question here. There might already be a term for what you want to achieve. Hold off on writing a new tutorial unless you find a totally unexplored area where a simple web search gives you nothing to link to.
 

The Following 2 Users Say Thank You to ljo For This Useful Post: