Optical Character Recognition with Tesseract
Installing Tesseract from the Stretch repo pulls in a LOT of extra packages. Here is a slightly older version (3.00) that does an acceptable job.
1. Download and unpack the tesseract-combo-stretch package from here. Use unzipper as the extraction tool.
2. Install the core package for your architecture: tesseract_3.00_i386.deb (32-bit) or tesseract_3.00_amd64.deb (64-bit).
3. Install the English language data package tesseract-lang-eng_3.00.deb.
4. For non-English languages, do the following:
a. Go here and locate the file xxx.traineddata.gz, where xxx is your 3-letter language code: deu, fra, ita, kor, rus, spa, ukr, etc.
b. Download the file and extract it.
c. Copy the file xxx.traineddata to /usr/share/tessdata
d. For any other languages, you will need to upgrade to Tesseract v3.04.
5. Install pic2txt_1.3.deb. Its dependency is peasyscale.
6. Look for pic2txt in the Graphics menu. It accepts JPEG, PNG and TIFF images as inputs.
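If you would rather script steps 4b and 4c than do them by hand, here is a minimal Python sketch. The function name install_traineddata is made up for illustration, and it assumes the download is an ordinary gzip file and that you have write permission on /usr/share/tessdata.

```python
import gzip
import shutil
from pathlib import Path

# Assumption: language data goes here, as in step 4c.
TESSDATA_DIR = Path("/usr/share/tessdata")


def install_traineddata(gz_path, tessdata_dir=TESSDATA_DIR):
    """Decompress xxx.traineddata.gz into tessdata_dir and return the new path.

    Hypothetical helper, not part of Tesseract or Pic2txt.
    """
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path.name}")
    # .stem drops only the final ".gz", leaving e.g. "deu.traineddata".
    target = Path(tessdata_dir) / gz_path.stem
    with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

For example, install_traineddata("deu.traineddata.gz") would write /usr/share/tessdata/deu.traineddata, given the assumptions above.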
Update: Pic2txt v1.4 is attached below. It has an optional batch mode. Instead of selecting a single file, choose a folder of images (Copy path > Paste). The preferred format is TIFF, which is now one of the output choices in PeasyScan.
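For anyone curious what a batch run over a folder amounts to, here is a rough Python sketch that calls the tesseract command line directly. This is NOT how Pic2txt is implemented (I have not reproduced its code); the function names are made up, and it assumes the usual "tesseract imagename outputbase -l lang" form, where tesseract appends .txt to the output base itself. It also assumes your Tesseract build can read each image format.

```python
import subprocess
from pathlib import Path

# The input formats mentioned in step 6.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def find_images(folder):
    """Return the image files in folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_command(image, lang="eng"):
    """Build a tesseract call that writes <image name>.txt next to the image."""
    output_base = Path(image).with_suffix("")  # tesseract adds .txt itself
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_folder(folder, lang="eng"):
    """Run tesseract once per image; stops at the first failure."""
    for image in find_images(folder):
        subprocess.run(build_command(image, lang), check=True)
```

Note that two images with the same base name (say page1.png and page1.tif) would write to the same page1.txt, so keep one format per folder.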
----------------------------