Ignore image name while getting hash

I’m writing a program that takes an image as input, checks it against the images in a database, and outputs the image with the same hash.

However, when using hash("imagepath"), two copies of the same image give different hashes, even when the only difference is the image’s file name, which makes me believe the name is the issue.

Is there an easy way to ignore the name of the image? (The images are PNG files.)
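
For context: Python's built-in hash() applied to a path string hashes the string itself, never the file it points to, so a renamed copy always gets a different value. A minimal sketch of hashing the file's bytes instead, using the standard hashlib module, might look like the following. The helper name file_sha256 and the two file names are made up for illustration, and this only matches files that are byte-for-byte identical:

```python
import hashlib
import os
import tempfile


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's bytes (the path is not hashed)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large images do not have to fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo: two hypothetical files holding identical bytes under different names.
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "first_name.png")
    path_b = os.path.join(tmp, "second_name.png")
    for p in (path_a, path_b):
        with open(p, "wb") as f:
            f.write(b"stand-in image bytes")
    same_digest = file_sha256(path_a) == file_sha256(path_b)
    print(same_digest)
```

A PNG that was re-saved or re-encoded would hash differently even if it looks the same, so this is only a fit when the database holds exact copies.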

Answer

How I solved it: I ended up not using hashing at all. Instead, I pieced together some code that computes the image's average pixel value and then looks for an image with the same average. The averages are stored in a list, so the lookup returns an index, which is then used to find the image's name.

import requests
from PIL import Image  # Pillow; this import was missing from the original snippet

# Database of possible image average pixels
clone_imgs = [88.0465, 46.2568, 102.6426, ...]

image = "<image url>"  # Placeholder: put the image URL here
img_data = requests.get(image).content
# Download the image as "image.png" (replace "image.png" with the path where you want to save it)
with open("image.png", "wb") as handler:
    handler.write(img_data)
img = Image.open("image.png")  # Open the image for reading
# Shrink to 100x100 and convert to grayscale so every image is reduced the same way
img = img.resize((100, 100), Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
img = img.convert("L")
img_pixel_data = list(img.getdata())
img_avg_pixel = sum(img_pixel_data) / len(img_pixel_data)  # Get the average pixel value

# Find the same pixel value in the database (raises ValueError if there is no exact match)
clone_img_index = clone_imgs.index(img_avg_pixel)

This worked for me, but it has a few downsides:

  1. The images need to be 100% identical in color (a single pixel being off can break the match).
  2. A huge number of different images can share the same average pixel value. My database only contained 800 images, so it still worked (however, I had to go from shrinking to 10×10 to shrinking to 100×100 so that different images would not end up with the same average).
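
One way to soften the first downside would be to look for the nearest stored average within a small tolerance instead of requiring an exact match. This is only a sketch and not part of the original solution: the helper closest_index and the tolerance of 0.5 are assumptions, and the list holds just the three values shown above.

```python
def closest_index(averages, target, tolerance=0.5):
    """Return the index of the stored average nearest to target,
    or None if no stored value lies within the tolerance."""
    if not averages:
        return None
    best = min(range(len(averages)), key=lambda i: abs(averages[i] - target))
    return best if abs(averages[best] - target) <= tolerance else None


clone_imgs = [88.0465, 46.2568, 102.6426]  # the three values visible in the answer
match = closest_index(clone_imgs, 46.3)  # slightly off from 46.2568
miss = closest_index(clone_imgs, 60.0)   # nothing within 0.5 of this value
print(match, miss)
```

A looser tolerance makes the second downside worse, since more unrelated images fall within range of each other, so the value would need tuning against the actual database.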