A GET request returns the following output (checked the response with Chrome DevTools):
HTML output
<style> .xdebug-error { display:none; } </style> <link rel="stylesheet" href="assets/css/xyz.css"> some numbers <script> $(function(){ $('#a_gen div').html( '<a href="https://example.com/uid" target="_blank">click to make an account</a><div id="gen_warn">Do Not Leave This Page Until Account Is Made!</div>' ); }); </script>
Output via response.content
When I print response.content to the console or to a file, I get something like this:
b'x833x01x00xe4Rxa7shxd8x15Px80x0cn x95xa6x9axdexd0xe8xa4x9aZxb6xdcx81xdbx07xad2Ixbb5x7fnxb38xb0xb4x15h[xe0x05xdcx02x0b,sxd7x8f|x95:xd7x90<6xb7xb7=?xf8xa60xa8~x19xa0x85Vx05<x8f{xbftxc4nxe1D1xb6xd9x1ex98xfdx94xeaxfbx10xf8x82xeexfbx02x05xf5xeex07x9eZxd5}?5x88xcaR[x94Zb]jxb4xebb[ rxb9NHxb4xe7x07xc3Ox07x89hxc6xcdr~x13xd1H&x9fK{_\xxb5x80!xc3xf9xc8x15tx11x04xf3xb9x07x04xf8x1dA`!xa5xa1xd2xbdx0cxfexf5pxa8xfaxf9xf9Yxa5x0exbbx83xe1xb0Fx96xe9T(xb7x1c&X\Xp5x9cxefxa8xdf&xf5zxb3xd6nf=x10xe4*xfbx88xa5x98x8cxb1xbcxc0xf2x027x82xe5E]x82xe5x85xceYxe0xb0x100Yxa8a`?<xacgx98xccxaex07gxba~x!x97xb7xc6x0fYxeacxf1x85}x9exc6Xxaex93Yxc7nx98xdcxd3axccx061x99x99*xf5!x0bxc3xe9x97x1cx99a2xbbxcaxbf}xf0x01x8d5xfex01hxe7hx9exea}xa3xa9xeax19xc2i}@x154xa5x8e~G6xx80xb2x8dtxeex80xbeyxe8!Kx98xa4xb2Y:x7fx83x16xb0xd7Oxd5cxa9xc1x8cxa3x03x0fxd0x0exd4x0fxf8,xa0uR-@x0f,p(xe2>x85xd6>xdaxabx06$sx85n"xfa_xe8&xa2=xc1xd7=xe7=x18x18x03'
Output via response.text
With response.text I got this (as depicted in image):
Original Code
All variables are already defined:
    s = requests.Session()
    r = s.get(url, headers=headers)
    print(r.text)
    if r.status_code == 200:
        print("Generated Successfully")
        with open("Alt.txt", 'a') as f:
            f.write(str(r.text) + '\n')
    else:
        print("BAD Request " + str(r.status_code))
    s.cookies.clear()
How can the plain-text response be written to a text file or to the console?
Answer
To interpret the response of an arbitrary GET request, you should always inspect response.headers first.
The header with key Content-Type tells you the MIME type of the response, like text/html or application/json, and its encoding, like UTF-8.
In your case, response.headers['Content-Type'] would probably be "text/html; charset=UTF-8".
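As a rough illustration of reading that header, the sketch below splits a Content-Type value into its MIME type and charset using only the standard library. The helper name parse_content_type is made up for this example; with requests you would pass it something like response.headers.get('Content-Type', '').

```python
from email.message import Message


def parse_content_type(header_value):
    """Return (mime_type, charset) for a Content-Type header value.

    charset is None when the header does not declare one.
    parse_content_type is a hypothetical helper, not part of requests.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


mime_type, charset = parse_content_type("text/html; charset=UTF-8")
print(mime_type, charset)
```

If the charset comes back as None, the server did not declare an encoding, so any decoding has to rely on a guess or a default.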
So you know that you need to decode the response from UTF-8, as Parvat. R commented, with r.content.decode('utf-8').
Here we can
- either rely on response.encoding, which requests uses to decode the body into response.text,
- or use response.content to get the raw bytes (e.g. b'x833x01') and decode them ourselves.
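The second option can be sketched roughly as follows. This is only an illustration: body_as_text is a hypothetical helper, the UTF-8 fallback is an assumption for the case where no encoding was detected, and the SimpleNamespace object merely stands in for a real requests response so that the snippet runs without a network call.

```python
from types import SimpleNamespace


def body_as_text(response, fallback="utf-8"):
    """Decode response.content with response.encoding, or with a fallback.

    Works with any object exposing .content (bytes) and .encoding
    (str or None), such as a requests.Response.
    """
    encoding = response.encoding or fallback
    # errors="replace" keeps going on undecodable bytes instead of raising
    return response.content.decode(encoding, errors="replace")


# Stand-in for a real response (assumption, for demonstration only)
fake_response = SimpleNamespace(content="héllo".encode("utf-8"), encoding=None)
print(body_as_text(fake_response))
```

If the decoded text still looks like noise, the bytes were most likely not plain text in the declared encoding to begin with.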
Since you say the response is text/html (as seen in the browser), you could simply decode the response body to text and append it to the text file:
    s = requests.Session()
    r = s.get(url, headers=headers)
    print(r.text)
    if r.status_code == 200:
        print("Generated Successfully")
        # detect the encoding and decode the raw bytes accordingly
        print("Response encoding", r.encoding)
        # r.encoding can be None if no charset was declared, so fall back to UTF-8
        body_text = r.content.decode(r.encoding or 'utf-8')
        with open("Alt.txt", 'a') as f:
            f.write(body_text + '\n')  # append body as string to file
    else:
        print("BAD Request " + str(r.status_code))
    s.cookies.clear()
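One more detail worth considering when appending to the file: open() uses a platform-dependent default encoding unless you pass one, so non-ASCII characters in the body may be written differently on different machines. A minimal sketch, assuming you want the file stored as UTF-8 (append_text is a hypothetical helper name):

```python
def append_text(path, text, encoding="utf-8"):
    """Append text plus a trailing newline to path, using an explicit encoding."""
    with open(path, "a", encoding=encoding) as f:
        f.write(text + "\n")
```

In the code above it would be called as append_text("Alt.txt", body_text).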
See also: python requests.get() returns improperly decoded text instead of UTF-8?