Install Tesseract OCR on Raspberry Pi

November 21, 2020

Tesseract is an optical character recognition (OCR) engine, originally developed at Hewlett-Packard and later sponsored by Google. It recognizes text in images and supports more than 100 languages. Tesseract is open source, released under the Apache License 2.0, and can be used directly from the command line or from many programming languages through wrappers.

This tutorial shows how to install Tesseract OCR on Raspberry Pi.

Connect to the Raspberry Pi via SSH and run the following commands to install Tesseract OCR:

sudo apt update
sudo apt install -y tesseract-ocr

After installation, we can check the Tesseract OCR version:

tesseract --version
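In a script, it can help to confirm that the binary is actually on the PATH before calling it. The following is a minimal sketch, not part of the official tooling; the function name check_tesseract is made up for this example.

```shell
# Hypothetical helper: report whether the tesseract binary is available.
check_tesseract() {
    if command -v tesseract >/dev/null 2>&1; then
        # Print only the first line of the version banner.
        # 2>&1 covers builds that write the banner to stderr.
        tesseract --version 2>&1 | head -n 1
    else
        echo "tesseract not found; install it with: sudo apt install -y tesseract-ocr" >&2
        return 1
    fi
}
```

Calling check_tesseract prints the version line when Tesseract is installed and returns a non-zero status otherwise, so it can guard later steps, for example: check_tesseract && tesseract test.png result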

Now we can test Tesseract OCR. First, download a sample image from the Internet using the wget tool:

wget https://raw.githubusercontent.com/madmaze/pytesseract/master/tests/data/test.png

Run the tesseract command to recognize the text in the image. The first argument is the name of the image. The second argument is the base name of the output file that will hold the recognized text. We don't need to provide a file extension because tesseract appends .txt automatically.

tesseract test.png result
cat result.txt
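The same two-argument form extends naturally to several images. The sketch below assumes a directory of .png files and writes a .txt file next to each one; the function name ocr_dir is hypothetical and only illustrates the pattern.

```shell
# Hypothetical helper: OCR every PNG in a directory,
# writing NAME.txt next to each NAME.png.
ocr_dir() {
    dir="${1:-.}"
    for img in "$dir"/*.png; do
        # Skip the unexpanded pattern when the directory has no PNG files.
        [ -e "$img" ] || continue
        # Pass the path without its extension; tesseract appends .txt itself.
        tesseract "$img" "${img%.png}"
    done
}
```

For example, ocr_dir . processes the current directory, so the downloaded test.png would produce test.txt.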

To write the results to standard output instead of a file, pass stdout as the output name:

tesseract test.png stdout
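Writing to standard output makes it possible to pipe the recognized text into other tools. As a rough illustration, the sketch below checks whether an image contains a given word; the function name image_contains is an assumption made for this example.

```shell
# Hypothetical helper: succeed if the OCR text of an image contains a word.
image_contains() {
    # 2>/dev/null hides any diagnostics tesseract prints to stderr.
    # grep flags: -q quiet, -i ignore case, -F treat the word as a fixed string.
    tesseract "$1" stdout 2>/dev/null | grep -qiF -- "$2"
}
```

It could be used as: image_contains test.png "text" && echo "found"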

If we want to completely remove every package whose name starts with tesseract, along with its configuration files, we can run the following command. The pattern is quoted so that the shell passes it to apt unchanged:

sudo apt purge -y 'tesseract.*'
