How to detect and find checkboxes in a form using Python OpenCV?

Question

I have several images on which I need to perform OMR (optical mark recognition) by detecting checkboxes with computer vision. I'm using findContours to draw contours only around the checkboxes in a scanned document, but the algorithm extracts every contour of the text as well.

Accepted Answer

1. Obtain binary image. Load the image, grayscale, Gaussian blur, and Otsu’s threshold to obtain a binary black/white image.
2. Remove small noise particles. Find contours and filter using contour area filtering to remove noise.
3. Repair checkbox horizontal and vertical walls. This step is optional, but in the case where the checkboxes may be damaged, we repair the walls for easier detection. The idea is to create a rectangular kernel and then perform morphological operations.
4. Detect checkboxes. From here we find contours, obtain the bounding rectangle coordinates, and filter using shape approximation + aspect ratio. The idea is that a checkbox is essentially a square, so its contour dimensions should be within a range.

Input image -> Binary image
Detected checkboxes highlighted in green
Checkboxes: 52

Another input image -> Binary image
Detected checkboxes highlighted in green
Checkboxes: 2

Code

```python
import cv2

# Load image, convert to grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours and filter using contour area filtering to remove noise
cnts, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
AREA_THRESHOLD = 10
for c in cnts:
    area = cv2.contourArea(c)
    if area < AREA_THRESHOLD:
        cv2.drawContours(thresh, [c], -1, 0, -1)

# Repair checkbox horizontal and vertical walls
repair_kernel1 = cv2.getStructuringElement(cv2.MORPH_RECT, (5,1))
repair = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, repair_kernel1, iterations=1)
repair_kernel2 = cv2.getStructuringElement(cv2.MORPH_RECT, (1,5))
repair = cv2.morphologyEx(repair, cv2.MORPH_CLOSE, repair_kernel2, iterations=1)

# Detect checkboxes using shape approximation and aspect ratio filtering
checkbox_contours = []
cnts, _ = cv2.findContours(repair, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.035 * peri, True)
    x,y,w,h = cv2.boundingRect(approx)
    aspect_ratio = w / float(h)
    if len(approx) == 4 and (aspect_ratio >= 0.8 and aspect_ratio <= 1.2):
        cv2.rectangle(original, (x, y), (x + w, y + h), (36,255,12), 3)
        checkbox_contours.append(c)

print('Checkboxes:', len(checkbox_contours))
cv2.imshow('thresh', thresh)
cv2.imshow('repair', repair)
cv2.imshow('original', original)
cv2.waitKey()
```
