I maintain an API client for the TRIAS API (German Link) to retrieve local public transport information for various states / cities in Germany. Recently one of the TRIAS servers (Baden-Württemberg) started responding with an error message to requests.
When I try to send the request via curl
, the server responds just fine:
$ curl -vH "Content-Type: text/xml; charset=utf-8" -d@trias-req.xml http://www.efa-bw.de/trias * Trying 94.186.213.206:80... * Connected to www.efa-bw.de (94.186.213.206) port 80 (#0) > POST /trias HTTP/1.1 > Host: www.efa-bw.de > User-Agent: curl/7.83.1 > Accept: */* > Content-Type: text/xml; charset=utf-8 > Content-Length: 652 > * Mark bundle as not supporting multiuse < HTTP/1.1 200 OK < Date: Tue, 31 May 2022 09:35:35 GMT < Server: EFAController/10.4.25.9/BW-WW33 < Access-Control-Allow-Origin: * < Access-Control-Allow-Headers: Authorization, Content-Type < Access-Control-Allow-Methods: GET < Access-Control-Expose-Headers: Content-Security-Policy, Location < Access-Control-Max-Age: 600 < Content-Type: text/xml < Expires: Thu, 01 Jan 1970 00:00:00 GMT < Accept-Ranges: none < Content-Length: 1520 < Last-Modified: Tue, 31 May 2022 09:35:35 GMT < Set-Cookie: ServerID=bw-ww33;Path=/ < <?xml version="1.0" encoding="UTF-8"?> <trias:Trias xmlns:siri="http://www.siri.org.uk/siri" xmlns:trias="http://www.vdv.de/trias" xmlns:acsb="http://www.ifopt.org.uk/acsb" xmlns:ifopt="http://www.ifopt.org.uk/ifopt" xmlns:datex2="http://datex2.eu/schema/1_0/1_0" version="1.1"><trias:ServiceDelivery><siri:ResponseTimestamp>2022-05-31T09:35:35Z</siri:ResponseTimestamp><siri:ProducerRef>de:nvbw</siri:ProducerRef><siri:Status>true</siri:Status><trias:Language>de</trias:Language><trias:CalcTime>30</trias:CalcTime><trias:DeliveryPayload><trias:LocationInformationResponse><trias:Location><trias:Location><trias:Address><trias:AddressCode>streetID:1500001248::8222000:-1:T 1:Mannheim:T 1::T 1: 68161:ANY:DIVA_STREET:942862:5641376:MRCV:B_W:0</trias:AddressCode><trias:AddressName><trias:Text>Mannheim, T 1</trias:Text><trias:Language>de</trias:Language></trias:AddressName><trias:PostalCode> 68161</trias:PostalCode><trias:LocalityName>Mannheim</trias:LocalityName><trias:LocalityRef>8222000:-1</trias:LocalityRef><trias:StreetName>T 1</trias:StreetName></trias:A* Connection #0 to host www.efa-bw.de left intact ddress><trias:LocationName><trias:Text>Mannheim</trias:Text><trias:Language>de</trias:Language></trias:LocationName><trias:GeoPosition><trias:Longitude>8.46987</trias:Longitude><trias:Latitude>49.49121</trias:Latitude></trias:GeoPosition></trias:Location><trias:Complete>true</trias:Complete><trias:Probability>0.763999999</trias:Probability></trias:Location></trias:LocationInformationResponse></trias:DeliveryPayload></trias:ServiceDelivery></trias:Trias>
However, it fails with requests.post()
, while it works with urllib.request.urlopen()
:
$ python efabw.py 502 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>502 Proxy Error</title> </head><body> <h1>Proxy Error</h1> <p>The proxy server received an invalid response from an upstream server.<br /> The proxy server could not handle the request<p>Reason: <strong>Error reading from remote server</strong></p></p> </body></html> 653 200 b'<?xml version="1.0" encoding="UTF-8"?>n<trias:Trias xmlns:siri="http://www.siri.org.uk/siri" xmlns:trias="http://www.vdv.de/trias" xmlns:acsb="http://www.ifopt.org.uk/acsb" xmlns:ifopt="http://www.ifopt.org.uk/ifopt" xmlns:datex2="http://datex2.eu/schema/1_0/1_0" version="1.1"><trias:ServiceDelivery><siri:ResponseTimestamp>2022-05-31T09:37:02Z</siri:ResponseTimestamp><siri:ProducerRef>de:nvbw</siri:ProducerRef><siri:Status>true</siri:Status><trias:Language>de</trias:Language><trias:CalcTime>30</trias:CalcTime><trias:DeliveryPayload><trias:LocationInformationResponse><trias:Location><trias:Location><trias:Address><trias:AddressCode>streetID:1500001248::8222000:-1:T 1:Mannheim:T 1::T 1: 68161:ANY:DIVA_STREET:942862:5641376:MRCV:B_W:0</trias:AddressCode><trias:AddressName><trias:Text>Mannheim, T 1</trias:Text><trias:Language>de</trias:Language></trias:AddressName><trias:PostalCode> 68161</trias:PostalCode><trias:LocalityName>Mannheim</trias:LocalityName><trias:LocalityRef>8222000:-1</trias:LocalityRef><trias:StreetName>T 1</trias:StreetName></trias:Address><trias:LocationName><trias:Text>Mannheim</trias:Text><trias:Language>de</trias:Language></trias:LocationName><trias:GeoPosition><trias:Longitude>8.46987</trias:Longitude><trias:Latitude>49.49121</trias:Latitude></trias:GeoPosition></trias:Location><trias:Complete>true</trias:Complete><trias:Probability>0.763999999</trias:Probability></trias:Location></trias:LocationInformationResponse></trias:DeliveryPayload></trias:ServiceDelivery></trias:Trias>'
The respective code is:
#! /usr/bin/env python3 from urllib.request import Request, urlopen from requests import post URL = 'http://www.efa-bw.de/trias' HEADERS = {'Content-Type': 'text/xml'} def main(): with open('trias-req.xml', 'rb') as file: xml = file.read() response = post(URL, data=xml, headers=HEADERS) print(response.status_code, response.text, len(xml)) request = Request(URL, data=xml, headers=HEADERS) with urlopen(request) as response: print(response.status, response.read()) if __name__ == '__main__': main()
Why does the request fail with requests.post()
only?
What can I do to debug this further?
Other API servers respond fine to the request with requests.post()
Advertisement
Answer
It turned out to be the user agent. After inspecting the headers with tcpdump
, I found, that the request fails with user agent python-requests/2.27.1
but succeeds with Python-urllib/3.10
and curl/7.83.1
:
#! /usr/bin/env python3 from urllib.request import Request, urlopen from requests import post URL = 'http://www.efa-bw.de/trias' HEADERS_REQUESTS = {'Content-Type': 'text/xml', 'User-Agent': 'Python-urllib/3.10'} HEADERS_URLLIB = {'Content-Type': 'text/xml'} def main(): with open('trias-req.xml', 'rb') as file: xml = file.read() response = post(URL, data=xml, headers=HEADERS_REQUESTS) print(response.status_code, response.text, len(xml)) request = Request(URL, data=xml, headers=HEADERS_URLLIB) with urlopen(request) as response: print(response.status, response.read()) if __name__ == '__main__': main()
$ python efabw.py 200 <?xml version="1.0" encoding="UTF-8"?> <trias:Trias xmlns:siri="http://www.siri.org.uk/siri" xmlns:trias="http://www.vdv.de/trias" xmlns:acsb="http://www.ifopt.org.uk/acsb" xmlns:ifopt="http://www.ifopt.org.uk/ifopt" xmlns:datex2="http://datex2.eu/schema/1_0/1_0" version="1.1"><trias:ServiceDelivery><siri:ResponseTimestamp>2022-05-31T10:09:06Z</siri:ResponseTimestamp><siri:ProducerRef>de:nvbw</siri:ProducerRef><siri:Status>true</siri:Status><trias:Language>de</trias:Language><trias:CalcTime>45</trias:CalcTime><trias:DeliveryPayload><trias:LocationInformationResponse><trias:Location><trias:Location><trias:Address><trias:AddressCode>streetID:1500001248::8222000:-1:T 1:Mannheim:T 1::T 1: 68161:ANY:DIVA_STREET:942862:5641376:MRCV:B_W:0</trias:AddressCode><trias:AddressName><trias:Text>Mannheim, T 1</trias:Text><trias:Language>de</trias:Language></trias:AddressName><trias:PostalCode> 68161</trias:PostalCode><trias:LocalityName>Mannheim</trias:LocalityName><trias:LocalityRef>8222000:-1</trias:LocalityRef><trias:StreetName>T 1</trias:StreetName></trias:Address><trias:LocationName><trias:Text>Mannheim</trias:Text><trias:Language>de</trias:Language></trias:LocationName><trias:GeoPosition><trias:Longitude>8.46987</trias:Longitude><trias:Latitude>49.49121</trias:Latitude></trias:GeoPosition></trias:Location><trias:Complete>true</trias:Complete><trias:Probability>0.763999999</trias:Probability></trias:Location></trias:LocationInformationResponse></trias:DeliveryPayload></trias:ServiceDelivery></trias:Trias> 653 200 b'<?xml version="1.0" encoding="UTF-8"?>n<trias:Trias xmlns:siri="http://www.siri.org.uk/siri" xmlns:trias="http://www.vdv.de/trias" xmlns:acsb="http://www.ifopt.org.uk/acsb" xmlns:ifopt="http://www.ifopt.org.uk/ifopt" xmlns:datex2="http://datex2.eu/schema/1_0/1_0" version="1.1"><trias:ServiceDelivery><siri:ResponseTimestamp>2022-05-31T10:09:06Z</siri:ResponseTimestamp><siri:ProducerRef>de:nvbw</siri:ProducerRef><siri:Status>true</siri:Status><trias:Language>de</trias:Language><trias:CalcTime>19</trias:CalcTime><trias:DeliveryPayload><trias:LocationInformationResponse><trias:Location><trias:Location><trias:Address><trias:AddressCode>streetID:1500001248::8222000:-1:T 1:Mannheim:T 1::T 1: 68161:ANY:DIVA_STREET:942862:5641376:MRCV:B_W:0</trias:AddressCode><trias:AddressName><trias:Text>Mannheim, T 1</trias:Text><trias:Language>de</trias:Language></trias:AddressName><trias:PostalCode> 68161</trias:PostalCode><trias:LocalityName>Mannheim</trias:LocalityName><trias:LocalityRef>8222000:-1</trias:LocalityRef><trias:StreetName>T 1</trias:StreetName></trias:Address><trias:LocationName><trias:Text>Mannheim</trias:Text><trias:Language>de</trias:Language></trias:LocationName><trias:GeoPosition><trias:Longitude>8.46987</trias:Longitude><trias:Latitude>49.49121</trias:Latitude></trias:GeoPosition></trias:Location><trias:Complete>true</trias:Complete><trias:Probability>0.763999999</trias:Probability></trias:Location></trias:LocationInformationResponse></trias:DeliveryPayload></trias:ServiceDelivery></trias:Trias>'