Even after implementing the enable_download_headless(driver, path) function that was suggested in a related thread, the file is still downloaded incorrectly. The non-headless version always downloads the file from the site correctly, but the headless version downloads a file called “chargeinfo.xhtml”, which is the last path segment of the download page's URL, “https://www.xxxxx.de/xxx/chargeinfo.xhtml”. Interestingly, when I call the suggested enable_download_headless(driver, path) in non-headless mode, it downloads “chargeinfo.xhtml” as well.
Also, a screenshot taken before clicking the download button shows the same page layout as in non-headless mode.
Any help is highly appreciated.
Here is my driver setup:
def excerpt():
    ## declare driver and allow
    options = webdriver.ChromeOptions()
    ## declaring headless
    options.add_argument("--headless")
    user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36'
    options.add_argument(f'user-agent={user_agent}')
    options.add_argument('--ignore-certificate-errors')
    options.add_argument('--allow-running-insecure-content')
    options.add_argument("--window-size=1920,1080")
    driver_path = "path/to/chromedriver"
    driver = webdriver.Chrome(driver_path, options=options)
    #### cause the non headless version to also download "chargeinfo.xhtml"
    enable_download_headless(driver, "/Download/Path/")
    driver.get("https://www.xxxxx.de/xxx/chargeinfo.xhtml")
    time.sleep(10)
    driver.find_element('xpath', "//span[@class='ui-button-text ui-c' and contains(text(), 'Download')]").click()

def enable_download_headless(browser, download_dir):
    browser.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
    params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
    browser.execute("send_command", params)
Answer
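If anyone is having a similar problem: for me, the only way to get this running was to read the response body of the request directly instead of relying on the browser's download. I clicked the download button with Selenium and then fetched the response as follows (note that driver.requests is not part of plain Selenium; it is available, for example, when the driver is created through the selenium-wire package):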
for request in driver.requests:
    if request.response:
        if request.url == "https://www.xxxxx.de/xxx/chargeinfo.xhtml":
            print(
                request.url,
                request.response.status_code,
                request.response.body
            )
            with open('out.pdf', 'wb') as f:
                f.write(request.response.body)
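Since the page itself is loaded from the same URL as the download, more than one captured request can match, and the loop above rewrites out.pdf for every match, so the last one wins. Here is a minimal sketch that makes this explicit. The helper name save_last_response_body is hypothetical (it is not part of Selenium or selenium-wire), and it only assumes request objects with the url, response and response.body attributes used above:

```python
from pathlib import Path


def save_last_response_body(requests, url, out_path):
    """Write the body of the last captured response for `url` to `out_path`.

    Hypothetical helper: `requests` is any iterable of objects exposing
    `.url` and `.response` (with `.body` as bytes), e.g. driver.requests.
    Returns the number of bytes written, or None if nothing matched.
    """
    body = None
    for request in requests:
        # Requests that never received a response carry response == None.
        if request.response and request.url == url:
            # A later match overrides an earlier one.
            body = request.response.body
    if body is None:
        return None
    Path(out_path).write_bytes(body)
    return len(body)
```

With the setup from the answer, this would be called after the click as save_last_response_body(driver.requests, "https://www.xxxxx.de/xxx/chargeinfo.xhtml", "out.pdf"). A None return value means no matching response was captured, which can happen if the loop runs before the download request has finished.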