Optical Character Reading

Origin

I always get annoyed when someone hands me an image and wants me to type out its contents, even though my typing speed is around 70-80 WPM. My first thought was to upload the image to an online converter. That works to an extent, but it takes time, and many websites only allow a limited number of submissions.

Objective

Take in a file and print out the text it contains. Initially only one file type is supported, with more to follow. The task itself is trivial and code for it is most likely already out there, so the main focus is supporting more file types and building this into an application.
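As a rough sketch of how support for more file types might be organised, the snippet below sorts an input path into a category by its extension. The function name `classify_file`, the extension set, and the "pdf" category are all assumptions made for illustration; none of them come from the original repository.

```python
from pathlib import Path

# Hypothetical list of image extensions; adjust to whatever the OCR step can read.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def classify_file(path):
    """Return "image", "pdf", or "unsupported" based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix == ".pdf":
        return "pdf"
    return "unsupported"
```

Each category could then be routed to its own reader, with images going to the function described in Step 2.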

Disclaimer

I do not claim this code as my own; it is based entirely on another repository.

Step 1 - Setup

First, install the required packages.

$ pip install numpy
$ pip install Pillow
$ pip install opencv-python
$ pip install pytesseract
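
Note that pytesseract is only a wrapper: the Tesseract OCR engine itself has to be installed separately and be reachable on the PATH. A minimal sketch of a check for that, using only the standard library (the helper name `find_tesseract` is made up for this example):

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH; install it before using pytesseract.")
    else:
        print(f"Found Tesseract at {path}")
```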

Step 2 - The code

I feel the code is self-explanatory given the number of comments and the small size of the program.

Image to Text Function

import cv2
import numpy as np
import pytesseract
from PIL import Image


def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)

    # Write the image after removing noise
    cv2.imwrite(src_path + "removed_noise.png", img)

    # Apply threshold to get an image with only black and white
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)

    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    return result

Main Function

src_path = './'
print('--- Recognizing text from image ---')
filename = "2.png"
img2txt = get_string(src_path + filename)
print(img2txt)
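
The file name above is hard-coded. As a first step toward an application, the image path could instead be taken from the command line. The following is a sketch under assumptions, not part of the original repository: `parse_args` and `main` are hypothetical names, and the recognizer is passed in as an argument so the snippet runs on its own.

```python
import argparse
import sys
from pathlib import Path


def parse_args(argv=None):
    """Parse the command line: one required image path."""
    parser = argparse.ArgumentParser(description="Print the text found in an image.")
    parser.add_argument("image", type=Path, help="path to the image to read")
    return parser.parse_args(argv)


def main(recognize, argv=None):
    """Run `recognize` (for example get_string) on the image named on the command line."""
    args = parse_args(argv)
    if not args.image.is_file():
        print(f"No such file: {args.image}", file=sys.stderr)
        return 1
    print('--- Recognizing text from image ---')
    print(recognize(str(args.image)))
    return 0
```

In the script itself this would be called as `sys.exit(main(get_string))`. Since `get_string` writes its intermediate images under `src_path`, that variable still needs to be defined.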

Conclusion

This is a very simple program, as basic as a bubble sort or a binary tree implementation. Code like this is not going to change, so why write everything from scratch when someone else has already written a perfectly good program?
