Posts: 1,096 | Thanked: 760 times | Joined on Dec 2008
#7
Originally Posted by dwould
so i was playing today and I compiled tesseract for the n900. i also compiled ImageMagick so i could convert from jpg to tif for tesseract.

provided your picture is black text on a white background it seems to work pretty well.

i'm wondering how hard it would be to write a 'sharing' plugin for the photo manager, which will just call a script to process the image through convert and tesseract and spit out the text.

if i get the time I'll play more. might be cool if someone with a setup for packaging would consider uploading tesseract and ImageMagick to extras-devel....
I was gonna do the same thing, but did not get around to it yet.
I also use unpaper when automatically processing scans on server.

it is a small program but helps a good bit with tesseract ocr

i tie together imagemagick -> unpaper -> tesseract with pretty great results, and insert the raw text output into a db as metadata for searching docs on our server.

works pretty well.
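for anyone who wants to try the same chain, here is a rough Python sketch of it. it is not my actual server script, just an illustration under some assumptions: the `convert`, `unpaper` and `tesseract` binaries are on the PATH, unpaper is fed a PGM file, the cleaned page is converted back to tif for tesseract (as in the quoted post), and the db is sqlite. the function names, file names and the `docs` table are made up for the example.

```python
import sqlite3
import subprocess
from pathlib import Path
from typing import List


def build_commands(src: Path, workdir: Path) -> List[List[str]]:
    """Return the argv list for each stage; nothing is executed here."""
    gray = workdir / (src.stem + ".pgm")         # grayscale input for unpaper
    clean = workdir / (src.stem + "-clean.pgm")  # unpaper output
    tif = workdir / (src.stem + "-clean.tif")    # what tesseract will read
    out_base = workdir / src.stem                # tesseract appends ".txt" itself
    return [
        ["convert", str(src), "-colorspace", "Gray", str(gray)],
        ["unpaper", str(gray), str(clean)],
        ["convert", str(clean), str(tif)],
        ["tesseract", str(tif), str(out_base)],
    ]


def run_pipeline(src: Path, workdir: Path) -> str:
    """Run every stage in order and return the recognized text."""
    workdir.mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(src, workdir):
        subprocess.run(cmd, check=True)  # stop at the first failing stage
    return (workdir / (src.stem + ".txt")).read_text()


def store_text(conn: sqlite3.Connection, name: str, text: str) -> None:
    """Save the raw OCR text as searchable metadata for one document."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (name TEXT PRIMARY KEY, ocr_text TEXT)"
    )
    conn.execute("INSERT OR REPLACE INTO docs VALUES (?, ?)", (name, text))
    conn.commit()


def search(conn: sqlite3.Connection, term: str) -> List[str]:
    """Return the names of documents whose OCR text contains the term."""
    rows = conn.execute(
        "SELECT name FROM docs WHERE ocr_text LIKE ? ORDER BY name",
        ("%" + term + "%",),
    )
    return [row[0] for row in rows]
```

usage would be something like `store_text(conn, "scan.jpg", run_pipeline(Path("scan.jpg"), Path("work")))`. use a fresh work directory per run, since leftover output files from an earlier run can make a stage refuse to overwrite them.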
 

The Following User Says Thank You to quipper8 For This Useful Post: