
Selenium can’t download correct file in headless mode

Even after implementing the enable_download_headless(driver, path) function suggested in a related thread, the wrong file is downloaded. The non-headless version always downloads the correct file from the site, but the headless version downloads a file named “chargeinfo.xhtml”, which is the last path segment of the download page URL “https://www.xxxxx.de/xxx/chargeinfo.xhtml”. Interestingly, when I call enable_download_headless(driver, path) in non-headless mode, it downloads “chargeinfo.xhtml” as well.

Also, a screenshot taken before clicking the download button shows the same page layout as in non-headless mode.

Any help is highly appreciated.

Here is my driver setup:

import time

from selenium import webdriver


def excerpt():
    # Configure Chrome to run headless
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    user_agent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36'
    options.add_argument(f'user-agent={user_agent}')
    options.add_argument('--ignore-certificate-errors')
    options.add_argument('--allow-running-insecure-content')
    options.add_argument("--window-size=1920,1080")
    driver_path = "path/to/chromedriver"
    driver = webdriver.Chrome(driver_path, options=options)

    # Calling this causes the non-headless version to download "chargeinfo.xhtml" as well
    enable_download_headless(driver, "/Download/Path/")

    driver.get("https://www.xxxxx.de/xxx/chargeinfo.xhtml")
    time.sleep(10)  # give the page time to finish loading
    driver.find_element('xpath', "//span[@class='ui-button-text ui-c' and contains(text(), 'Download')]").click()


def enable_download_headless(browser, download_dir):
    # Register Chrome's send_command endpoint, then allow downloads into download_dir
    browser.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
    params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
    browser.execute("send_command", params)


Answer

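If anyone is having a similar problem: the only way I got this working was to read the response body of the download request directly instead of relying on the browser's download. Note that driver.requests is not part of plain Selenium; it is available when the driver is created with the Selenium Wire package. I clicked the download button with Selenium and then fetched the response as follows: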

for request in driver.requests:
    # Skip requests that have not received a response yet
    if request.response:
        if request.url == "https://www.xxxxx.de/xxx/chargeinfo.xhtml":
            print(
                request.url,
                request.response.status_code,
                request.response.body
            )

            # Write the raw response body to disk as the downloaded file
            with open('out.pdf', 'wb') as f:
                f.write(request.response.body)
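
The loop above writes out.pdf for every response that matches the URL, so if the page itself is also requested under that URL, the last match wins. A slightly more defensive variant might look like the sketch below. It assumes the expected file is a PDF (and therefore starts with the bytes %PDF), and the helper names pick_download_body and save_download are made up for illustration; they are not part of Selenium or Selenium Wire. The helpers only rely on each request having a url and a response with a body, as in the loop above.

```python
PDF_MAGIC = b"%PDF"  # assumption: the expected download is a PDF file


def pick_download_body(requests, url, magic=PDF_MAGIC):
    """Return the body of the last response for `url` that starts with `magic`.

    `requests` is any iterable of objects with `.url` and `.response`
    (for example `driver.requests`). Returns None if nothing matches.
    """
    match = None
    for request in requests:
        response = request.response
        # Skip other URLs and requests that never received a response
        if response is None or request.url != url:
            continue
        body = response.body or b""
        # Ignore HTML pages served under the same URL
        if body.startswith(magic):
            match = body
    return match


def save_download(requests, url, target_path):
    """Write the matching response body to `target_path`; return True on success."""
    body = pick_download_body(requests, url)
    if body is None:
        return False
    with open(target_path, "wb") as f:
        f.write(body)
    return True
```

After clicking the download button, this could be called as save_download(driver.requests, "https://www.xxxxx.de/xxx/chargeinfo.xhtml", "out.pdf"). A return value of False would suggest that no PDF response had been captured yet, for example because the request was still in flight.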
User contributions licensed under: CC BY-SA