Python urllib2 and Unicode

Question

I would like to collect information from the results returned by a search engine. However, the query part only works when I write plain text; when I pass Unicode instead, it gives an error.

Accepted Answer

Encode the Unicode data to UTF-8, then URL-encode:

    from urllib import urlencode
    import urllib2

    params = {'where': 'nexearch', 'query': a.encode('utf8')}
    params = urlencode(params)
    url = "http://search.naver.com/search.naver?" + params
    response = urllib2.urlopen(url)

Demo:

    >>> from urllib import urlencode
    >>> a = u"바둑"
    >>> params = {'where': 'nexearch', 'query': a.encode('utf8')}
    >>> params = urlencode(params)
    >>> params
    'query=%EB%B0%94%EB%91%91&where=nexearch'
    >>> url = "http://search.naver.com/search.naver?" + params
    >>> url
    'http://search.naver.com/search.naver?query=%EB%B0%94%EB%91%91&where=nexearch'

Using urllib.urlencode() to build the parameters is easier, but you can also just escape the query value with urllib.quote_plus():

    from urllib import quote_plus

    encoded_a = quote_plus(a.encode('utf8'))
    url = "http://search.naver.com/search.naver?where=nexearch&query=%s" % encoded_a
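The answer above targets Python 2. For readers on Python 3, where urlencode() and quote_plus() live in urllib.parse and urllib2 became urllib.request, a rough equivalent might look like the sketch below. The helper name build_search_url and the BASE_URL constant are hypothetical, introduced only for illustration; the sketch also assumes the default UTF-8 encoding that urllib.parse applies to str values, so no explicit .encode('utf8') call is needed.

```python
from urllib.parse import urlencode, quote_plus

# Same endpoint as in the answer; the constant name is an assumption.
BASE_URL = "http://search.naver.com/search.naver"


def build_search_url(query, where="nexearch"):
    """Build the search URL from a str query (hypothetical helper).

    urlencode() percent-encodes str values as UTF-8 by default in Python 3.
    """
    params = urlencode({"where": where, "query": query})
    return BASE_URL + "?" + params


url = build_search_url("바둑")

# Alternative: escape only the query value, mirroring the quote_plus() variant.
encoded_query = quote_plus("바둑")
manual_url = BASE_URL + "?where=nexearch&query=%s" % encoded_query
```

The resulting URL could then be opened with urllib.request.urlopen(url), in the same way the answer uses urllib2.urlopen(url).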

Answer