I’m trying to scrape a website, but it gives me an error.
I’m using the following code:
import urllib.request
from bs4 import BeautifulSoup

get = urllib.request.urlopen("https://www.website.com/")
html = get.read()

soup = BeautifulSoup(html)

print(soup)
And I’m getting the following error:
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 70924-70950: character maps to <undefined>
What can I do to fix this?
Answer
I fixed it by adding .encode("utf-8") to soup.
That means that print(soup) becomes print(soup.encode("utf-8")).
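For anyone wondering why this works: the traceback points at cp1252.py, so the error seems to come from print() trying to write characters that the Windows console encoding (cp1252) cannot represent. Encoding to UTF-8 first hands print() a bytes object, which it shows as a b'...' literal rather than readable text. If you would rather keep readable output, one alternative is to replace only the characters the console cannot show. Below is a minimal sketch of that idea. The sample string and the helper names to_console_safe and fails_to_encode are made up for illustration and are not part of the original code.

```python
# Stand-in for the scraped page text (an assumption for this sketch):
# "é" exists in cp1252, the snowman character (U+2603) does not.
sample = "caf\u00e9 \u2603"


def fails_to_encode(text, encoding="cp1252"):
    """Return True if `text` contains characters `encoding` cannot represent."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False


def to_console_safe(text, encoding="cp1252"):
    """Hypothetical helper: swap unencodable characters for "?" instead of raising."""
    return text.encode(encoding, errors="replace").decode(encoding)


# Encoding to UTF-8 always succeeds, but the result is bytes, not str.
as_bytes = sample.encode("utf-8")

print(fails_to_encode(sample))          # the situation from the traceback
print(ascii(to_console_safe(sample)))   # ascii() keeps this demo printable anywhere
print(type(as_bytes).__name__)
```

With the original code, that would be something like print(to_console_safe(str(soup))). The trade-off is that any character outside cp1252 is lost and shown as "?", so if you need the exact page content, writing it to a file opened with encoding="utf-8" is probably the safer route.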