Improving image deskew using Python and OpenCV

The code I’ve produced to detect and correct skew is giving me inconsistent results. I’m currently working on a project that uses OCR text extraction on images (via Python and OpenCV), so removing skew is key to getting accurate results. My code uses cv2.minAreaRect to detect skew.

The images I’m using are all identical (and will be in the future), so I’m unsure what is causing these inconsistencies. I’ve included two sets of before and after images (including the skew value from cv2.minAreaRect) where I applied my code: one showing successful removal of skew, and one showing that skew was not removed (it looks like even more skew was added).

Image 1 Before (-87.88721466064453)

Image 1 After (successful deskew)

Image 2 Before (-5.766754150390625)

Image 2 After (unsuccessful deskew)

My code is below. Note: I’ve worked with many more images than those I’ve included here. The detected skew thus far has always been in the ranges [-10, 0) or (-90, -80], so I attempted to account for this in my code.

    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray = cv2.bitwise_not(img_gray)
    
    thresh = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    coords = np.column_stack(np.where(thresh > 0))
    angle = cv2.minAreaRect(coords)[-1] 
      
    if (angle < 0 and angle >= -10):
        angle = -angle #this was intended to undo skew for values in [-10, 0) by simply rotating using the opposite sign
    else:
        angle = (90 + angle)/2  
     
    (h, w) = img.shape[:2]
    center = (w // 2, h // 2)
    
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    deskewed = cv2.warpAffine(img, M, (w, h), flags = cv2.INTER_CUBIC, borderMode = cv2.BORDER_REPLICATE)

I’ve looked through various posts and articles to find an adequate solution, but have been unsuccessful. This post was the most helpful in understanding the skew values, but even then I couldn’t get very far.

Answer

A very good text deskew tool can be found in Python Wand, which uses ImageMagick. It is based upon the Radon transform.

Form 1:

Form 2:

    from wand.image import Image
    from wand.display import display

    with Image(filename='form1.png') as img:
        img.deskew(0.4*img.quantum_range)
        img.save(filename='form1_deskew.png')
        display(img)

    with Image(filename='form2.png') as img:
        img.deskew(0.4*img.quantum_range)
        img.save(filename='form2_deskew.png')
        display(img)

Form 1 deskewed:

Form 2 deskewed:
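As a side note on the OpenCV approach in the question: the two ranges reported there, [-10, 0) and (-90, -80], are probably the same kind of small tilt measured against different sides of the bounding rectangle, so they can be folded into a single range instead of being handled by two separate branches. The sketch below is only an illustration of that idea. The helper name fold_rect_angle is hypothetical (it is not part of OpenCV), and the sign that must finally be passed to cv2.getRotationMatrix2D is an assumption to verify on your own images, since it depends on the (row, column) versus (x, y) ordering of the points given to cv2.minAreaRect and on the OpenCV version.

```python
def fold_rect_angle(angle):
    """Fold a rectangle angle in degrees into the range (-45, 45].

    Hypothetical helper, not an OpenCV function. A rectangle looks the
    same after a 90 degree turn, so -87.9 and +2.1 describe the same tilt.
    """
    folded = angle % 90.0      # Python's % yields a value in [0, 90)
    if folded > 45.0:
        folded -= 90.0         # express large values as a small negative tilt
    return folded


# The two values reported in the question:
print(fold_rect_angle(-87.88721466064453))  # about 2.11
print(fold_rect_angle(-5.766754150390625))  # about -5.77
```

With the angle folded this way, a single rotation by the folded angle (or by its negative, depending on the coordinate ordering mentioned above) replaces the if/else in the question; trying both signs on one image with known skew is a quick way to settle which one applies.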
