I have a directory with a single image of a baseball in it, 1.jpg. I use cv2 to read in the image. I then define a path and write the image back into the same directory as 2.jpg, so 1.jpg and 2.jpg should be identical. Then for each image I calculate a "difference" hash of length 256 using the function get_hash and print out the hash for each image. They are almost identical but differ by at least 1 bit, and I cannot figure out why. I thought it could be due to JPG compression when the image was copied, so I also ran the code using PNG format for both images and still got different hash values. Any insight would be appreciated. The code is shown below.
    import math
    import cv2

    def get_hash(fpath, hash_length):
        dim = int(math.sqrt(hash_length))  # with hash_length=256, dim=16
        r_str = ''
        img = cv2.imread(fpath, 0)  # read image as a grayscale image
        img = cv2.resize(img, (dim, dim), interpolation=cv2.INTER_NEAREST)
        img = img.flatten()  # now a vector of 256 pixel values
        list2 = list(img)
        # compare each pixel with the next one: 255 comparisons for 256 pixels
        for col in range(0, len(list2) - 1):
            if list2[col] > list2[col + 1]:
                value = '1'
            else:
                value = '0'
            r_str = r_str + value
        return r_str

    def match(value1, value2, distance):
        # returns True if the number of mismatches in the hashes is no greater than distance
        # with distance=0 returns True only if hashes are identical
        mismatch_count = 0
        for i in range(0, len(value1)):
            if value1[i] != value2[i]:
                mismatch_count += 1
        if mismatch_count > distance:
            return False
        else:
            return True

    path_to_image = r'C:Tempballsdup31.jpg'
    img = cv2.imread(path_to_image)
    path_to_write_image = r'C:Tempballsdup32.jpg'
    cv2.imwrite(path_to_write_image, img)  # write the identical image to directory with file name 2.jpg
    hash_length = 256
    h1 = get_hash(path_to_image, hash_length)
    h2 = get_hash(path_to_write_image, hash_length)
    print(h1)
    print(h2)
    distance = 0  # both hashes must match identically
    m = match(h1, h2, distance)
    print(m)  # should be True since the images are identical, but returns False
              # because there is a single bit difference in the two hashes
The 256-length hash is too long to put here, but here is the region in which the two hash values differ by 1 bit (6th bit from the end):
hash for 1.jpg: 00000000000000000000011000000000000010001001000000110000000010000010001
hash for 2.jpg: 00000000100000000000011000000000000010001001000000110000000010000110001
Answer
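[JPG]
The saved image 2.jpg is different from the original image 1.jpg. JPEG is a lossy format, so cv2.imwrite re-compresses the decoded pixels instead of copying the file, and some pixel values change slightly.
You can compare the images online.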
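To see why a tiny pixel change matters, here is a minimal sketch of the same neighbour comparison that get_hash performs, run on a plain list instead of an image. The pixel values are made up for illustration, and difference_hash and hamming are hypothetical helper names, not part of cv2.

```python
def difference_hash(pixels):
    """Return '1' where a pixel is brighter than its right-hand neighbour, else '0'."""
    return ''.join('1' if a > b else '0' for a, b in zip(pixels, pixels[1:]))


def hamming(h1, h2):
    """Count the positions at which two equal-length bit strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(h1, h2))


# Hypothetical gray levels: the second list differs from the first by one level
# in a single pixel, the kind of change a lossy re-save can introduce.
original = [120, 120, 87, 90, 200, 200]
resaved = [120, 120, 87, 90, 201, 200]

h1 = difference_hash(original)
h2 = difference_hash(resaved)
print(h1, h2, hamming(h1, h2))
```

Neighbouring pixels with equal values are the fragile spots: a change of one gray level turns an "equal" comparison into a "greater than", which flips that bit. Allowing a small distance in match, rather than 0, would tolerate this.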
[BMP]
I tried re-saving the image as BMP. The two images are then exactly equal, and their hash values are also equal.
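If the goal is only to get an identical second file, one option is to copy the bytes instead of decoding and re-encoding the image, which works the same for any format. This is a sketch using the standard library. copy_exact and file_digest are hypothetical helper names.

```python
import hashlib
import shutil


def file_digest(path):
    """SHA-256 of the raw file bytes (not of the decoded pixels)."""
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()


def copy_exact(src, dst):
    """Copy src to dst byte for byte and report whether the two files match."""
    shutil.copyfile(src, dst)
    return file_digest(src) == file_digest(dst)
```

Calling copy_exact(path_to_image, path_to_write_image) in place of the imread/imwrite pair should leave get_hash with the same input for both files.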
[PNG]
When converted to PNG, the images are equal, but I found that their bit depths are different.
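One way to check this is to read the bit depth and colour type straight from each PNG header. The sketch below uses only the standard library and reads the IHDR chunk, which the PNG format places right after the 8-byte signature. png_info is a hypothetical helper name. Note that cv2.imread with its default flag returns an 8-bit, 3-channel BGR array, so a file written back from it can have a different colour type or bit depth than the source.

```python
import struct

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'
COLOR_TYPES = {0: 'grayscale', 2: 'RGB', 3: 'palette',
               4: 'grayscale+alpha', 6: 'RGBA'}


def png_info(path):
    """Return (width, height, bit_depth, color_type_name) from a PNG's IHDR chunk."""
    with open(path, 'rb') as f:
        # 8-byte signature + 4-byte length + 4-byte chunk type + 10 bytes of IHDR data
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b'IHDR':
        raise ValueError('not a PNG file: %s' % path)
    width, height, bit_depth, color_type = struct.unpack('>IIBB', header[16:26])
    return width, height, bit_depth, COLOR_TYPES.get(color_type, 'unknown')
```

Running png_info on both files shows whether the original and the re-saved copy store their pixels the same way. The bit depth here is per sample, so an 8-bit RGB file corresponds to 24 bits per pixel.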