I'm writing a program that takes an image as input, checks it against the images in a database, and outputs the image with the same hash.
However, when I use hash("imagepath"),
two copies of the same image give different hashes, even when the only difference is the file name, which makes me believe the name is the issue.
Is there an easy way to ignore the image's name? (The images are PNGs.)
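For context, the built-in `hash("imagepath")` hashes the characters of the path string and never opens the file, so two different names give two different hashes regardless of the pixels. String hashes are also salted per Python process by default, so they are not stable between runs. A minimal sketch of hashing the file's bytes instead, assuming the copies are byte-for-byte identical (`file_digest` and `chunk_size` are hypothetical names, not from the original post):

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file's bytes.

    Only the contents are hashed, so renaming the file does not change
    the result. (file_digest is a hypothetical helper for illustration.)
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large images are not loaded into memory at once
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files then match when `file_digest("a.png") == file_digest("b.png")`. This only catches exact copies: a PNG that was re-saved with different metadata or compression settings has different bytes and therefore a different digest, even if it looks the same.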
Answer
How I solved it: I ended up not using hashing. Instead, I pieced together some code that computes the image's average pixel value and then finds the database image with the same average. The averages are stored in a list, so the index of the match can then be used to look up the image's name.
import requests
from PIL import Image  # Pillow; this import was missing from the original snippet

# Database of possible image average pixel values
clone_imgs = [88.0465, 46.2568, 102.6426 ...]

image = <image url>
img_data = requests.get(image).content
# Download the image as "image.png" (replace "image.png" with the path where you want to save it)
with open('image.png', 'wb') as handler:
    handler.write(img_data)

img = Image.open(r"image.png")  # Open the image for reading
# Shrink to 100x100, then convert to grayscale.
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
img = img.resize((100, 100), Image.LANCZOS)
img = img.convert("L")
img_pixel_data = list(img.getdata())  # was spawn.getdata(), an undefined name
img_avg_pixel = sum(img_pixel_data) / len(img_pixel_data)  # Get the average pixel value
clone_img_index = clone_imgs.index(img_avg_pixel)  # Find the same average in the database
This worked for me, but it has a few downsides:
- The images need to be 100% identical in color (a single pixel being off can ruin the match, because the lookup requires an exactly equal average)
- Many different images can share the same average pixel value. My database only contained 800 images, so it still worked (however, I had to go from resizing to 10×10 up to 100×100 to avoid ending up with duplicate averages)
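The first downside comes from `list.index`, which only matches an exactly equal float. One possible way to soften it, sketched here as an assumption and not as part of the original solution, is to pick the stored average nearest to the computed one and accept it only within a tolerance. The name `closest_index` and the default `tolerance=0.5` are hypothetical and would need tuning against a real database:

```python
def closest_index(averages, target, tolerance=0.5):
    """Return the index of the stored average nearest to target.

    Returns None if the list is empty or the nearest value is further
    away than tolerance. (Hypothetical helper; the tolerance is a guess.)
    """
    if not averages:
        return None
    # Index of the entry with the smallest absolute difference to target
    best = min(range(len(averages)), key=lambda i: abs(averages[i] - target))
    if abs(averages[best] - target) <= tolerance:
        return best
    return None


# Example with the three sample averages shown in the answer
sample_averages = [88.0465, 46.2568, 102.6426]
match = closest_index(sample_averages, 46.3)     # index 1, within tolerance
no_match = closest_index(sample_averages, 70.0)  # None, nothing close enough
```

A wider tolerance forgives small pixel differences but makes the second downside worse, since more unrelated images fall inside the accepted range.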