I need to download the image of every coin listed on CoinGecko, so I wrote the following code:
import requests
from bs4 import BeautifulSoup
from os.path import basename

def getdata(url):
    r = requests.get(url)
    return r.text

htmldata = getdata("https://www.coingecko.com/en")
soup = BeautifulSoup(htmldata, 'html.parser')
for item1 in soup.select('.coin-icon img'):
    link = item1.get('data-src').replace('thumb', 'thumb_2x')
    with open(basename(link), "wb") as f:
        f.write(requests.get(link).content)
However, I need each image saved under the ticker of its coin from that CoinGecko list (rename bitcoin.png?1547033579 to BTC.png, ethereum.png?1595348880 to ETH.png, and so forth). There are over 7000 images to rename, and many of them have quite unique names, so slicing the file name does not work here.
What is the way to do it?
Answer
Browsing the HTML of the page, I found that the img tag you are selecting has an alt attribute with the ticker in parentheses at the end of the string.
<div class="coin-icon mr-2 center flex-column">
  <img class=""
       alt="bitcoin (BTC)"
       data-src="https://assets.coingecko.com/coins/images/1/thumb/bitcoin.png?1547033579"
       data-srcset="https://assets.coingecko.com/coins/images/1/thumb_2x/bitcoin.png?1547033579 2x"
       src="https://assets.coingecko.com/coins/images/1/thumb/bitcoin.png?1547033579"
       srcset="https://assets.coingecko.com/coins/images/1/thumb_2x/bitcoin.png?1547033579 2x">
</div>
So we can use that to get the correct name like so:
import requests
from bs4 import BeautifulSoup

def getdata(url):
    r = requests.get(url)
    return r.text

htmldata = getdata("https://www.coingecko.com/en")
soup = BeautifulSoup(htmldata, 'html.parser')
for item1 in soup.select('.coin-icon img'):
    link = item1.get('data-src').replace('thumb', 'thumb_2x')
    # alt looks like "bitcoin (BTC)"; keep the text between "(" and the final ")"
    raw_name = item1.get('alt')
    name = raw_name[raw_name.find('(') + 1:-1]
    # Append the extension so the file is saved as e.g. BTC.png
    with open(name + ".png", "wb") as f:
        f.write(requests.get(link).content)
We are extracting the value between the parentheses using string slicing: find('(') + 1 is the index just after the opening parenthesis, and the -1 end index drops the closing one.
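The slice assumes every alt value ends with a single "(TICKER)" group. If some coins have extra parentheses in their name, or no alt text at all, a slightly more defensive version may help. The following is only a sketch: the helper names ticker_from_alt and extension_from_url are made up for illustration, and replacing unusual characters with "_" is my own assumption about what makes a safe file name.

```python
import re
from os.path import splitext
from urllib.parse import urlsplit


def ticker_from_alt(alt_text):
    """Return the text of the last '(...)' group at the end of alt_text, or None."""
    # Anchor at the end so earlier parentheses in the coin name are skipped.
    match = re.search(r"\(([^()]+)\)\s*$", alt_text or "")
    if match is None:
        return None
    # Assumption: replace anything outside a conservative set with "_".
    return re.sub(r"[^A-Za-z0-9._-]", "_", match.group(1).strip())


def extension_from_url(url, default=".png"):
    """Return the file extension of the URL path, ignoring the ?query part."""
    ext = splitext(urlsplit(url).path)[1]
    return ext or default


if __name__ == "__main__":
    alt = "bitcoin (BTC)"
    url = "https://assets.coingecko.com/coins/images/1/thumb_2x/bitcoin.png?1547033579"
    print(ticker_from_alt(alt) + extension_from_url(url))
```

In the loop above you would then skip any image for which ticker_from_alt returns None, and build the file name from the two helpers instead of hard-coding ".png".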