I have the following clear binary image, and I want to read the digits using tesseract. My problem is that tesseract is reading only the first digit (5)! How can I make tesseract read the full sequence?
    import cv2
    import pytesseract

    gray = cv2.imread('input.jpg', 0)
    text = pytesseract.image_to_string(gray, config='outputbase digits')
    print(text)
Output: 5 <0x0c>
Answer
You will need to preprocess the image before passing it to pytesseract for text extraction. One thing that comes to mind is using binary_fill_holes to fill the area enclosed by the edges. Here is an example of what you can do.
    from skimage import io, util, feature
    from scipy import ndimage as ndi
    import matplotlib.pyplot as plt
    import pytesseract
    import numpy as np

    # Import image
    img = io.imread('jbAsM.jpg', as_gray=True)

    # Preprocessing
    imginv = util.invert(img)  # invert image

    # Loop and fill holes iteratively
    for i in range(2):
        edges = feature.canny(imginv)          # find edges
        imginv = ndi.binary_fill_holes(edges)  # fill holes in edges

    fill_inv = util.invert(imginv)  # invert again
    plt.imshow(fill_inv, cmap='gray')

    # Image to text
    text = pytesseract.image_to_string(fill_inv, config='outputbase digits')
    print('Extracted Text ->', text)
Extracted Text -> 5113
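To make the hole-filling step less of a black box, here is a rough pure-Python sketch of the idea as I understand it: flood-fill the background starting from the image border, and treat any background pixel that was never reached as a hole. The function name `fill_holes` and the toy `outline` grid are made up for illustration. This is not how scipy implements it internally, only the concept on a tiny 0/1 grid.

```python
from collections import deque


def fill_holes(grid):
    """Return a copy of a 0/1 grid with enclosed background regions set to 1."""
    rows, cols = len(grid), len(grid[0])
    reached = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the flood fill with every background pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and grid[r][c] == 0:
                reached[r][c] = True
                queue.append((r, c))

    # Spread through 4-connected background pixels.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and not reached[nr][nc]:
                reached[nr][nc] = True
                queue.append((nr, nc))

    # Background that was never reached is enclosed by foreground: fill it.
    return [
        [1 if grid[r][c] == 1 or not reached[r][c] else 0 for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical toy input: a closed square outline with an empty centre.
outline = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
filled = fill_holes(outline)
```

This is also why the edges have to form closed contours: if the outline of a digit has a gap, the flood fill leaks inside and nothing gets filled.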
EDIT: No idea why pytesseract is predicting the last digit as 3 (weird!!)
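Something that might be worth trying for misreads like this (untested on this image, so treat it as a sketch): tell Tesseract that the image is a single line of text with `--psm 7` and restrict the output to digits with `tessedit_char_whitelist`. Both are standard Tesseract options, but the helper names `build_digit_config` and `read_digits` are my own, and as far as I know the whitelist is ignored by the LSTM engine in Tesseract 4.0, so results depend on your version.

```python
def build_digit_config(psm=7, whitelist="0123456789"):
    """Build a Tesseract config string for a single line of digits.

    psm=7 asks Tesseract to treat the image as one text line; the
    whitelist limits the characters it is allowed to output.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_digits(image, config=None):
    """Run pytesseract on an image (PIL image or NumPy array)."""
    # Imported lazily so build_digit_config() works without pytesseract.
    import pytesseract

    text = pytesseract.image_to_string(image, config=config or build_digit_config())
    # strip() drops the trailing newline and form feed Tesseract appends.
    return text.strip()
```

You would call it as `read_digits(fill_inv)` after preprocessing. It will not necessarily fix the wrong last digit, but it removes two common sources of confusion (page layout guessing and non-digit characters).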
You will have to find your own preprocessing pipeline that suits the other images. I would recommend looking at image segmentation and edge filling methods.