Curl works but urllib doesn’t [closed]

Whenever I fetch this page with curl, I get the entire webpage. However, when I use urllib or even the mechanize library in Python, I get a 403 error. Any reason why?

Answer

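The server is probably rejecting the default User-Agent that urllib2 sends (it identifies itself as Python-urllib), while curl sends its own. Try setting a browser-like User-Agent header: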

    import urllib2
    from BeautifulSoup import BeautifulSoup

    site = "http://www.economist.com/blogs/schumpeter/2014/04/alstom-block"
    # Send a browser-like User-Agent instead of urllib2's default one
    header = {'User-Agent': 'Mozilla/5.0'}
    req = urllib2.Request(site, headers=header)
    page = urllib2.urlopen(req)
    soup = BeautifulSoup(page)
    print soup

Output:

    <!DOCTYPE html>
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr" xmlns:og="http://ogp.me/ns#" xmlns:fb="https://www.facebook.com/2008/fbml">
    <head>
....
...
..
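
If you are on Python 3, urllib2 no longer exists (it was merged into urllib.request), so the same idea might look roughly like the sketch below. The helper names build_request and fetch are made up for illustration and are not part of the answer above; the 10-second timeout and the UTF-8 fallback are assumptions too. It only shows how the header is attached, and there is no guarantee the site still responds the same way.

```python
import urllib.request

SITE = "http://www.economist.com/blogs/schumpeter/2014/04/alstom-block"


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that carries a browser-like User-Agent header.

    Hypothetical helper: the name and default value are illustrative.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Download the page and return its body decoded as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling fetch(SITE) should then return the HTML as a string, which you can hand to BeautifulSoup as before. If it still raises urllib.error.HTTPError with a 403, the site may be checking more than the User-Agent.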