Tesseract is an optical character recognition (OCR) engine, originally developed at Hewlett-Packard and later developed with Google's sponsorship. It recognizes text in images and supports more than 100 languages. Tesseract is an open-source project available under the Apache License 2.0. It can be used directly from the command line or from many programming languages through wrappers.
This tutorial shows how to install Tesseract OCR on a Raspberry Pi.
Connect to the Raspberry Pi via SSH and run the following commands to install Tesseract OCR:
sudo apt update
sudo apt install -y tesseract-ocr
After installation, we can check the Tesseract OCR version:
tesseract --version
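Before running OCR, it can help to confirm that the binary is on the PATH and to see which language data is installed. The following is a minimal sketch for a POSIX shell. The function name check_tesseract is a hypothetical helper, not part of Tesseract; --list-langs is a standard Tesseract option.

```shell
# Hypothetical helper (not part of Tesseract): report whether tesseract
# is installed and, if so, print its version line and installed languages.
check_tesseract() {
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract not found; install it with: sudo apt install -y tesseract-ocr" >&2
        return 1
    fi
    # Older releases print the version to stderr, so merge both streams.
    tesseract --version 2>&1 | head -n 1
    tesseract --list-langs 2>&1
}

check_tesseract || true
```

Additional languages are usually shipped as separate packages. The exact package names are an assumption here and are best confirmed with apt search tesseract-ocr.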
Now we can test Tesseract OCR. First, download a test image from the Internet using the wget tool:
wget https://raw.githubusercontent.com/madmaze/pytesseract/master/tests/data/test.png
Run the tesseract command to recognize the text in the image. The first argument is the name of the image. The second argument is the base name of the output file that will hold the recognized text. We don't need to provide a file extension because .txt is appended automatically.
tesseract test.png result
cat result.txt
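The same two-argument form extends naturally to several images. The sketch below loops over the PNG files in a directory and writes one text file per image. The function name ocr_dir and the OCR_BIN variable are hypothetical names introduced for this example; OCR_BIN defaults to tesseract and exists only so the loop can be tried with a stand-in command.

```shell
# Hypothetical helper: OCR every PNG in a directory, writing NAME.txt
# next to NAME.png. OCR_BIN is an assumed override hook for this sketch.
ocr_dir() {
    dir=${1:-.}
    for img in "$dir"/*.png; do
        [ -e "$img" ] || continue      # skip when the glob matched nothing
        base=${img%.png}               # output base name; .txt is appended
        "${OCR_BIN:-tesseract}" "$img" "$base" || echo "failed: $img" >&2
    done
}
```

For example, running ocr_dir . in the directory that contains test.png should produce test.txt alongside it.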
The result can be written to standard output instead of a file by passing stdout as the output name:
tesseract test.png stdout
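Writing to standard output makes it easy to pipe the recognized text into other tools. As a small sketch, the function below checks whether an image contains a given word. The name image_contains and the OCR_BIN variable are hypothetical and introduced only for this example; OCR_BIN defaults to tesseract.

```shell
# Hypothetical helper: succeed if the recognized text of the image ($1)
# contains the word ($2), ignoring case. OCR_BIN is an assumed override hook.
image_contains() {
    # Diagnostics go to stderr, so discard them and keep only the text.
    "${OCR_BIN:-tesseract}" "$1" stdout 2>/dev/null | grep -qi -- "$2"
}
```

It could be used as image_contains test.png someword && echo found, where someword is a placeholder; whether it matches depends on the image and on recognition quality.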
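To completely remove every package whose name starts with tesseract, along with its configuration files, we can run the following command. The pattern is quoted so that the shell passes it to apt unchanged instead of expanding it against files in the current directory:
sudo apt purge -y 'tesseract.*'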