Posts: 102 | Thanked: 187 times | Joined on Jan 2010
#237
Originally Posted by eber42
1) But the case of words with two different capitalizations is not handled very well.
...
2) So my short-term suggestion is to add an option either to provide a (smaller) dictionary instead of using aspell's, or to trust the input corpus to be flawless.

What do you think?
1) That is definitely true. I saw the same thing with the Spanish dictionary when I did the full corpus.

2) Yes, providing an alternative dictionary is a good idea. Maybe just keep the dict if it's there, and instruct the user to build clean otherwise? Assuming the input is flawless is not too bad either, since people still write a lot of stuff that is not covered by the aspell dictionary.
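
To make 2) a bit more concrete, here is a rough Python sketch of one way the option could behave: if a dictionary file is there, only words found in it are kept, and otherwise the corpus is trusted as-is. The names `load_wordlist` and `filter_corpus_words` and the one-word-per-line file format are assumptions for illustration only, not how the tool currently works.

```python
import os
from collections import Counter


def load_wordlist(path):
    """Read a word list (assumed format: one word per line), skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def filter_corpus_words(corpus_words, dict_path=None):
    """Count corpus words; keep only dictionary words when a dictionary file exists.

    Hypothetical helper: dict_path is an optional user-supplied dictionary.
    """
    counts = Counter(corpus_words)
    if dict_path and os.path.isfile(dict_path):
        allowed = load_wordlist(dict_path)
        # Case-sensitive match, so "Hola" and "hola" are treated as separate entries.
        return Counter({w: n for w, n in counts.items() if w in allowed})
    # No dictionary available: trust the input corpus to be flawless.
    return counts
```

Note that the match is case-sensitive on purpose, so the capitalization problem from 1) is left untouched here and would still need its own handling.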