I have, for example, the following image:
Is there any way of identifying the embedded image between the text?
I know you can use OpenCV's cv.matchTemplate, but that only works if you already know what the embedded image is. I have spent a lot of time searching for how to identify an image within an image in Python without knowing in advance what the embedded image is, but I have not been able to find anything. Please let me know what I can do.
Answer
Here’s my attempt. This is how it works:
- To detect text, measure how much each pixel differs from its neighbor, using the absolute difference between each pixel and the one before it.
- The previous step only detects the edges of text. Expand the detected regions with a Gaussian blur so they cover whole characters.
- Threshold the blurred result and zero out everything classified as text.
- Crop away the remaining whitespace.
It uses NumPy, OpenCV, and SciPy, plus Matplotlib to display the result.
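To make the first three steps concrete, here is a minimal sketch on a toy array. The array layout, the sigma of 3 (chosen because the toy image is small), and the cast to float before blurring are assumptions for illustration, not values taken from the real image.

```python
import numpy as np
import scipy.ndimage

# Toy inverted grayscale image (assumption): 0 is background.
img = np.zeros((60, 120), dtype=np.uint8)
img[5:25, 10:110:2] = 255   # "text": thin strokes on every other column
img[35:55, 40:80] = 200     # "picture": a flat, smooth block

# Step 1: absolute difference between each pixel and the one before it.
flat = img.flatten().astype('int16')
diff = np.abs(flat - np.roll(flat, 1)).reshape(img.shape)

# Step 2: blur so the detected edges spread over the whole text area.
blurred = scipy.ndimage.gaussian_filter(diff.astype(float), sigma=3)

# Step 3: threshold into a boolean text mask.
is_text = blurred > 10

# Compare the middle of the "text" with the middle of the "picture".
print(is_text[15, 60], is_text[45, 60])
```

The densely striped region is flagged as text, while the interior of the flat block is not. The border of the block is an edge too, so a thin band around it should also be flagged, which is one reason the crop can end up slightly inside the true picture.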
Full code:
```python
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
import scipy.ndimage

img_orig = cv.imread('image_extract.png')


def find_text(gray, gaussian_size_px=10, text_threshold=10):
    # int16 so the subtraction below cannot wrap around as it would in uint8
    flat = gray.flatten().astype('int16')
    # difference each pixel against the pixel before it (np.roll shifts forward by one)
    differenced_image = np.abs(flat - np.roll(flat, 1)).reshape(gray.shape)
    # blur so the detected edges spread over whole characters
    differenced_image = scipy.ndimage.gaussian_filter(differenced_image, sigma=gaussian_size_px)
    is_text = differenced_image > text_threshold
    return is_text


def remove_text(img, minpool_size=3):
    is_text = find_text(img)
    # zero out every pixel classified as text
    image_only = np.where(is_text, 0, img)
    # filter out small bright pixels with convolved minimum
    image_only = scipy.ndimage.minimum_filter(image_only, size=minpool_size)
    return image_only


def find_subimage(img_orig):
    gray = cv.cvtColor(img_orig, cv.COLOR_BGR2GRAY)
    # invert colors so white = 0 and black = 255
    gray = np.max(gray) - gray

    image_only = remove_text(gray)

    # bounding box of everything that survived text removal
    coords = cv.findNonZero(image_only)
    x, y, w, h = cv.boundingRect(coords)
    cropped = img_orig[y:y+h, x:x+w]
    return cropped


cropped = find_subimage(img_orig)
# OpenCV loads images as BGR, but matplotlib expects RGB
plt.imshow(cv.cvtColor(cropped, cv.COLOR_BGR2RGB))
plt.show()
```
Output: