How to Convert Image to Text Using Python?

Image-to-text conversion is a very useful process for extracting text from images. Normally, this is an easy task for human beings because we can understand words no matter where they are written. However, we are also slow and prone to making errors.

That’s why we seek to automate mundane tasks: machines can do the same task over and over with the same precision. The troublesome part is that computers cannot natively recognize letters and words in an image because they see images in a fundamentally different way than humans.

That was the case until optical character recognition (OCR) technology evolved. Now, with the power of AI (specifically machine learning), it is possible for computers to read and extract characters in an image.

Today, we are going to learn how you can write a simple program to do the same procedure in Python 3.

Extracting Text from an Image with Python 3

Python 3 is one of the most widely used languages available today. It is extremely popular in the AI industry because of its massive community and its pre-existing libraries and functions that aid in the development of AI software.

Today, we are going to use the Pytesseract and Pillow libraries to create our image-to-text converter. Pytesseract is a Python wrapper for the Tesseract OCR engine that has functions for extracting text from an image, while Pillow is a library that allows us to open images in our Python programs.

We are also going to use Google Colab to run our code, as it is one of the easiest platforms to run Python on and it does not use your own computer’s resources.

Now, let’s begin.

1. Installing Dependencies

In the program we are going to write, there are two dependencies: Tesseract and Pillow. Here’s how you can install them.

Tesseract needs to be installed on the machine before you can use it in a Python program. In Google Colab, that is easy to do since it installs on a server rather than on your computer. The server in question is Linux-based, so you need to write a Linux terminal command to install it.

Google Colab doesn’t require you to open a separate terminal tab. You can simply write terminal commands in a code block, prefixed with an exclamation mark, and run them. To install Tesseract, write the following code in a block:

!sudo apt install tesseract-ocr

Then run the code block for the installation to take place. This can take a while, so don’t worry if it does not finish immediately.

Now, you need to install Pytesseract, the Python wrapper for Tesseract. You can use pip, the Python package installer, for this. Simply write the following command in another code block:

!pip install pytesseract

And that’s it, you are done with the installation of dependencies. But where is the installation of Pillow, you might ask? Pillow is a dependency of Pytesseract, so pip installs it automatically along with it.
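Before moving on, it can help to confirm that the Tesseract binary is actually reachable. The following is a minimal sketch that only uses Python's standard library; the helper name find_tesseract is made up for this example and is not part of Pytesseract.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    # shutil.which searches the directories listed in the PATH environment variable.
    return shutil.which(binary_name)


path = find_tesseract()
if path is None:
    print("Tesseract was not found; try running the apt install step again.")
else:
    print(f"Tesseract found at {path}")
```

If the check reports that Tesseract is missing, Pytesseract will not be able to run OCR either, because it calls this executable behind the scenes.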

2. Writing the Program

Now, we need certain libraries for the program to work. First of all, we need to import the Pytesseract library. After that, we also need to import two other modules:

Shutil: A module for high-level file operations, such as copying and moving files or entire directories.
OS: A module that provides functions for interacting with the operating system, like file manipulation, directory navigation, and environment variables.

We need Shutil and OS to be able to upload images from our device to the Colab notebook.

So, here’s what our program looks like right now:

import pytesseract
import shutil
import os

try:
    from PIL import Image
except ImportError:
    import Image

Simply put, we are asking the program to import the “Image” module from the Pillow library. If that import fails with an ImportError, the program falls back to a plain “import Image” instead of stopping.

3. Uploading the Files

Now, we need to upload our image to the program. To do that, we will use the “files” module from google.colab and its upload function.

Here’s what it looks like:

from google.colab import files

uploaded = files.upload()

Once you run this, you will see a button in the code block output which you can use to upload an image.
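The upload function returns a dictionary that maps each uploaded file name to its contents as bytes, so the image name does not have to be typed by hand later. Here is a minimal sketch of reading the name back; the helper first_uploaded_name and the stand-in dictionary are assumptions made for illustration, so the snippet also runs outside Colab.

```python
def first_uploaded_name(uploaded):
    """Return the name of the first uploaded file.

    `uploaded` is assumed to be the dict returned by files.upload(),
    mapping each file name to its contents as bytes.
    """
    if not uploaded:
        raise ValueError("No file was uploaded.")
    # Dictionaries keep insertion order, so this is the first file name.
    return next(iter(uploaded))


# In Colab you would write:
#   uploaded = files.upload()
#   image_name = first_uploaded_name(uploaded)
# A stand-in dictionary is used here so the sketch runs anywhere.
example_upload = {"sample pic.png": b"fake image bytes"}
print(first_uploaded_name(example_upload))
```

The returned name can then be passed to Image.open() in place of a hard-coded file name.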

4. Text Extraction

Text extraction is a simple matter. We use the Pytesseract “image_to_string” function to extract the text from our uploaded image. The function takes an image as its argument, which we get by calling the “Image.open()” function. This second function takes your image’s file name as a parameter and opens it in the runtime.

You need to store the result of the extraction in a variable, which in this case we have named “extractedInformation”.

Here’s what the final product looks like:

extractedInformation = pytesseract.image_to_string(Image.open('sample pic.png'))

The “sample pic.png” should be replaced with the name of the image you uploaded.

5. Printing the Results

To see the results of the extraction, you just need to call the print function and pass the extractedInformation variable as an argument to it.

print(extractedInformation)
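The raw string that comes back from OCR often contains stray blank lines and extra spaces. As an optional step, a small cleanup can be applied before printing. This is only a sketch: the function clean_ocr_text and the sample string are made up for illustration and are not part of Pytesseract.

```python
def clean_ocr_text(text):
    """Collapse runs of whitespace inside each line and drop empty lines."""
    # " ".join(line.split()) squeezes repeated spaces and tabs into single spaces.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


# A made-up example of messy OCR output.
raw = "Hello   world\n\n  foo \n\x0c"
print(clean_ocr_text(raw))
```

In the program above, you would call clean_ocr_text(extractedInformation) and print the returned string instead.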

Since we used the following image for our example:

[Image: the sample input image]

We got this result:

[Image: the extracted text output]

As you can see, since the words are so jumbled in the original, we have a messy output. However, the results become better with a cleaner image.
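One common way to give the OCR engine a cleaner image is to preprocess it with Pillow first. The sketch below converts the image to grayscale, stretches the contrast, and turns it into pure black and white. The helper name preprocess_for_ocr and the threshold of 160 are assumptions chosen for illustration; the best settings depend on your image.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(image, threshold=160):
    """Return a black-and-white copy of the image that may be easier to read for OCR."""
    gray = ImageOps.grayscale(image)    # drop color information
    gray = ImageOps.autocontrast(gray)  # stretch the range between darkest and lightest pixel
    # Pixels brighter than the threshold become white, the rest become black.
    return gray.point(lambda p: 255 if p > threshold else 0)


# A tiny synthetic image (dark left half, light right half) so the sketch runs without a file.
demo = Image.new("RGB", (4, 2), (220, 220, 220))
demo.paste((30, 30, 30), (0, 0, 2, 2))
cleaned = preprocess_for_ocr(demo)
print(cleaned.mode, cleaned.getpixel((0, 0)), cleaned.getpixel((3, 0)))
```

With a real upload, you would pass preprocess_for_ocr(Image.open('sample pic.png')) to pytesseract.image_to_string instead of the unprocessed image.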

Alternate Method

If doing all of that does not sound good to you, then you can opt for a simpler approach instead. Just go online and use a Python-powered image-to-text converter.

They are super easy to use. All you have to do is the following:

- Go online
- Search the term image to text
- Select a tool from the search results
- Upload your image in the conspicuous input field
- Hit the confirm button

And voila! Your text has been extracted successfully.

Conclusion

Image-to-text conversion in Python is quite easy due to the availability of pre-built libraries and functions. However, these libraries also require a lot of tweaking to get the desired result because there aren’t any one-size-fits-all solutions.

To circumvent that, you can use an online image-to-text converter instead, which is far easier to use. If you are here just to learn, then by all means experiment with Python as much as you can, because you will become better with experience.