I have a script that downloads images from a website, but it saves them with names like images1, images2, images3, images4, and so on.
I need to save the images with their original names: if an image is named 43343.jpg, I need to save it as 43343.jpg. I use BeautifulSoup and requests for this.
Sorry for my English; it is not my first language.
```python
from bs4 import *
import requests
import os

def folder_create(images):
    try:
        folder_name = input("Enter Folder Name:- ")
        os.mkdir(folder_name)

    except:
        print("Folder Exist with that name!")
        folder_create()

    download_images(images, folder_name)

def download_images(images, folder_name):
    count = 0

    print(f"Total {len(images)} Image Found!")

    if len(images) != 0:
        for i, image in enumerate(images):

            try:

                image_link = image["data-srcset"]

            except:
                try:

                    image_link = image["data-src"]
                except:
                    try:

                        image_link = image["data-fallback-src"]
                    except:
                        try:

                            image_link = image["src"]

                        except:
                            pass

            try:
                r = requests.get(image_link).content
                try:

                    r = str(r, 'utf-8')

                except UnicodeDecodeError:

                    with open(f"{folder_name}/images{i+1}.jpg", "wb+") as f:
                        f.write(r)

                    count += 1
            except:
                pass

        if count == len(images):
            print("All Images Downloaded!")

        else:
            print(f"Total {count} Images Downloaded Out of {len(images)}")

def main(url):

    r = requests.get(url)

    soup = BeautifulSoup(r.text, 'html.parser')

    images = soup.findAll('img')

    folder_create(images)

url = input("Enter URL:- ")

main(url)
```
Answer
Try replacing this part of your code
```python
with open(f"{folder_name}/images{i+1}.jpg", "wb+") as f:
    f.write(r)
```
with this:
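```python
# The last URL segment normally already includes the extension (e.g. 43343.jpg),
# so ".jpg" is not appended again.
image_name = image_link.split('/')[-1]
with open(f"{folder_name}/{image_name}", "wb+") as f:
    f.write(r)
```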
This takes the last element of `image_link` (everything after the final slash), which should be the file name of the image, provided the rest of your code is correct and I have understood how it works.
Also, I don’t know what your `image_link` variable looks like, but you may need to replace the `/` with `\` if your path uses backslashes.
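As a slightly more defensive sketch of the same idea, the file name can be taken from the URL's path component rather than from the raw string, so that a query string such as `?width=200` does not end up in the name. The helper name `filename_from_url`, its `default` parameter, and the example URL are hypothetical and not part of the original code. The handling of `srcset`-style values (several comma-separated candidates) is an assumption about what `data-srcset` may contain on your site.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(image_link, default="image.jpg"):
    """Return the file name at the end of a URL path (hypothetical helper)."""
    # Assumption: srcset-style values look like "a.jpg 1x, b.jpg 2x";
    # keep only the URL of the first candidate.
    first = image_link.split(",")[0].strip().split(" ")[0]
    # urlparse separates the path from any query string or fragment.
    path = urlparse(first).path
    # URL paths always use forward slashes, so posixpath is the right tool here.
    name = posixpath.basename(unquote(path))
    # Fall back to a default when the URL ends with a slash or is empty.
    return name or default


print(filename_from_url("https://example.com/img/43343.jpg?width=200"))
```

In the download loop, the result could then be joined with the folder, for example `os.path.join(folder_name, filename_from_url(image_link))`. Two images with the same file name would still overwrite each other, which this sketch does not address.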
On a final note, consider putting a bit more effort into your Stack Overflow questions and into your code quality in general. The sheer number of empty lines makes the code really tedious to read, and all the nesting (going deeper and deeper with try-except clauses) certainly doesn’t help either.
Aside from that, there are much better ways to achieve your intended functionality without multiple bare except clauses. In fact, you should never use a bare except statement (`except:` without an exception class name after it). But that’s a story for another time.
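One possible way to avoid the nested bare `except` clauses, offered only as a minimal sketch: loop over the candidate attribute names and use `.get()`, which returns `None` for a missing key instead of raising. The names `CANDIDATE_ATTRS` and `find_image_link` are hypothetical. The demo uses plain dictionaries as stand-ins so that it runs without any third-party package; BeautifulSoup `Tag` objects expose the same `.get(name)` method, so the function should work on the `img` tags from the question as well.

```python
# Attribute names taken from the question's code, in the same priority order.
CANDIDATE_ATTRS = ("data-srcset", "data-src", "data-fallback-src", "src")


def find_image_link(image):
    """Return the first non-empty candidate attribute, or None if none is set.

    `image` can be any object with a dict-like .get() method, such as a
    BeautifulSoup Tag (hypothetical helper, not part of the original code).
    """
    for attr in CANDIDATE_ATTRS:
        value = image.get(attr)  # None when the attribute is missing
        if value:
            return value
    return None


# Plain dicts standing in for <img> tags.
tags = [
    {"data-src": "https://example.com/a/43343.jpg"},
    {"alt": "no source at all"},
]
for tag in tags:
    print(find_image_link(tag))
```

A caller can then skip any image for which the function returns `None`, rather than silently reusing the `image_link` value left over from the previous loop iteration.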