I am looking for a native way to parse an HTTP request in Python 3.
This question shows a way to do it in Python 2, but it relies on modules that are now deprecated, and I need a Python 3 solution.
I would mainly like to figure out which resource is requested and parse the headers from a simple request, e.g.:
GET /index.html HTTP/1.1
Host: localhost
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Can someone show me a basic way to parse this request?
Answer
You could use the email.message.Message class from the email module in the standard library.
Below is a Python 3 example of parsing HTTP headers, adapted from the answer to the question you linked.
Suppose you wanted to create a dictionary containing all of your header fields:
import email
import pprint
from io import StringIO

request_string = (
    'GET / HTTP/1.1\r\n'
    'Host: localhost\r\n'
    'Connection: keep-alive\r\n'
    'Cache-Control: max-age=0\r\n'
    'Upgrade-Insecure-Requests: 1\r\n'
    'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\r\n'
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n'
    'Accept-Encoding: gzip, deflate, sdch\r\n'
    'Accept-Language: en-US,en;q=0.8'
)

# pop the first line (the request line) so we only process headers
_, headers = request_string.split('\r\n', 1)

# construct a message from the header block
message = email.message_from_file(StringIO(headers))

# construct a dictionary containing the headers
headers = dict(message.items())

# pretty-print the dictionary of headers
pprint.pprint(headers, width=160)
If you ran this at a Python prompt, the result would look like:
{'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
 'Accept-Encoding': 'gzip, deflate, sdch',
 'Accept-Language': 'en-US,en;q=0.8',
 'Cache-Control': 'max-age=0',
 'Connection': 'keep-alive',
 'Host': 'localhost',
 'Upgrade-Insecure-Requests': '1',
 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'}
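The question also asks which resource is requested, and that information lives in the request line rather than in the headers. Here is a minimal sketch of one way to keep it, assuming a well-formed request whose first line has exactly three space-separated parts. The names ParsedRequest and parse_request are hypothetical helpers made up for this illustration; they are not part of the standard library.

```python
import email
from typing import NamedTuple


class ParsedRequest(NamedTuple):
    """Hypothetical container for the pieces of a parsed request."""
    method: str
    target: str
    version: str
    headers: dict


def parse_request(raw: str) -> ParsedRequest:
    """Split a raw HTTP request into its request line and headers.

    Assumes a well-formed request; a malformed request line raises ValueError.
    """
    # The head ends at the first blank line; anything after it is the body.
    head, _, _body = raw.partition('\r\n\r\n')
    # The first line is the request line, the rest is the header block.
    request_line, _, header_block = head.partition('\r\n')
    method, target, version = request_line.split(' ', 2)
    # Reuse the email parser for the header block, as in the answer above.
    message = email.message_from_string(header_block)
    return ParsedRequest(method, target, version, dict(message.items()))


request = parse_request(
    'GET /index.html HTTP/1.1\r\n'
    'Host: localhost\r\n'
    'Connection: keep-alive\r\n'
    '\r\n'
)
print(request.method, request.target, request.version)
print(request.headers['Host'])
```

Note that dict(message.items()) keeps only one value per header name. If a request may repeat a header, message.get_all(name) on the Message object returns every value instead.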