I’m trying to scrape a website, but it gives me an error.
I’m using the following code:
    import urllib.request
    from bs4 import BeautifulSoup

    get = urllib.request.urlopen("https://www.website.com/")
    html = get.read()
    soup = BeautifulSoup(html)
    print(soup)
And I’m getting the following error:
      File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    UnicodeEncodeError: 'charmap' codec can't encode characters in position 70924-70950: character maps to <undefined>
What can I do to fix this?
Answer
I fixed it by adding .encode("utf-8") to soup. That means that print(soup) becomes print(soup.encode("utf-8")).
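For anyone who wants to see why this works, here is a minimal sketch that reproduces the problem without touching the network. The sample string and the names text, as_bytes, buffer and writer are made up for illustration; they are not from the original code. It assumes the failure comes from printing characters that the cp1252 codec (named in the traceback) cannot represent.

```python
import io

# Hypothetical sample text: U+2603 (snowman) has no mapping in cp1252.
text = "<p>snowman: \u2603</p>"

# Reproduce the failure: encoding to cp1252 raises UnicodeEncodeError,
# which is what print() runs into when the output stream uses cp1252.
try:
    text.encode("cp1252")
    failed = False
except UnicodeEncodeError:
    failed = True

# The fix from the answer: encode to UTF-8 explicitly.
# The result is a bytes object, not a str.
as_bytes = text.encode("utf-8")

# A possible alternative: write through a text stream that is set to UTF-8.
# BytesIO stands in for a real file or console here.
buffer = io.BytesIO()
writer = io.TextIOWrapper(buffer, encoding="utf-8")
writer.write(text)
writer.flush()

print(failed)
print(buffer.getvalue() == as_bytes)
```

Note that in Python 3, print(soup.encode("utf-8")) prints the repr of a bytes object (starting with b'), so the output is not the readable page text. If you want readable output, writing to a file opened with encoding="utf-8" is probably the safer route.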