requests.get() does not seem to return the expected bytes for Wikipedia image URLs, such as https://upload.wikimedia.org/wikipedia/commons/0/05/20100726_Kalamitsi_Beach_Ionian_Sea_Lefkada_island_Greece.jpg:
import wikipedia
import requests

page = wikipedia.page("beach")
first_image_link = page.images[0]
req = requests.get(first_image_link)
req.content

b'<!DOCTYPE html>\n<html lang="en">\n<meta charset="utf-8">\n<title>Wikimedia Error</title>\n<style>\n*...
Answer
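Many websites reject requests whose User-Agent header does not look like it comes from a browser, and Wikimedia is one of them. By default, requests sends a User-Agent of the form python-requests/<version>, so Wikimedia answers with an HTML error page instead of the image. Setting the header explicitly fixes this: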
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36'}
res = requests.get('https://upload.wikimedia.org/wikipedia/commons/0/05/20100726_Kalamitsi_Beach_Ionian_Sea_Lefkada_island_Greece.jpg', headers=headers)
res.content
This returns the expected image bytes.
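As a rough sketch of the same fix, the helper below sends an explicit User-Agent and checks the status code and Content-Type before returning the bytes, so that an HTML error page like the one in the question is not mistaken for image data. The names fetch_image and check_image_response, the timeout value, and the USER_AGENT string are hypothetical and only for illustration. The identifying User-Agent is an assumption based on Wikimedia's User-Agent policy, which asks clients to identify themselves; the browser string from the answer can be passed instead.

```python
# Hypothetical identifying User-Agent (an assumption, not from the original answer).
# Replace the tool name and contact details with your own.
USER_AGENT = "beach-image-fetcher/0.1 (https://example.org/contact; me@example.org)"


def check_image_response(status_code, content_type):
    """Raise ValueError unless the status and Content-Type look like an image."""
    if status_code != 200:
        raise ValueError(f"unexpected HTTP status {status_code}")
    if not content_type.lower().startswith("image/"):
        raise ValueError(f"expected an image, got Content-Type {content_type!r}")


def fetch_image(url, user_agent=USER_AGENT, timeout=10):
    """Download url with an explicit User-Agent and return the raw bytes."""
    # Imported here so check_image_response can be used without requests installed.
    import requests

    res = requests.get(url, headers={"User-Agent": user_agent}, timeout=timeout)
    # An error page comes back as text/html, often with a non-200 status.
    check_image_response(res.status_code, res.headers.get("Content-Type", ""))
    return res.content
```

Calling fetch_image(first_image_link) with the link from the question should then either return image bytes or raise a ValueError that says what came back instead.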