
Web Scraping with table that can be changed

I have successfully put together a script that extracts some information from a table on this website: https://www.nordpoolgroup.com/en/Market-data1/Power-system-data/Production1/Wind-Power-Prognosis/SE/Hourly/?view=table

Now I want to do this for all dates in 2021. I suppose I have to use the input with id="data-end-date" and trigger some kind of button click programmatically, but I don’t understand how this can be done in principle, and I have not managed to find any similar questions.

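from io import StringIO
import time

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)  # optional: keep the browser open after the script ends
webdriver_service = Service("./chromedriver")  # your chromedriver path
driver = webdriver.Chrome(service=webdriver_service, options=options)

data = []
driver.get('https://www.nordpoolgroup.com/en/Market-data1/Power-system-data/Production1/Wind-Power-Prognosis/SE/Hourly/?view=table')
time.sleep(3)

# Wait until the "pure-button" element is clickable, then click it
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@class="pure-button"]'))).click()
time.sleep(1)

soup = BeautifulSoup(driver.page_source, "html.parser")

# Parse all tables in the rendered page and take the second one
df1 = pd.read_html(StringIO(str(soup)))[1]
# Drop the columns that are not needed (the date column name is hard-coded here)
df1.drop(columns=['22-11-2022', 'SE'], inplace=True)
# Drop rows 24-28 so that only the first 24 rows remain
df1.drop(range(24, 29), axis=0, inplace=True)
print(df1)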

Thank you.


Answer

You would need to control the date picker and loop over all the dates. An alternative solution is to open the browser's dev tools and analyze the traffic from your client to the server.

There you can see that each change in the date picker fires a GET request to the server, and a JSON response with all the data comes back. Luckily, the GET request has no special requirements and even works when opened directly in the browser:

https://www.nordpoolgroup.com/api/marketdata/page/576?currency=,EUR,EUR,EUR&endDate=15-11-2022

The endDate URL parameter lets you pass the date you want, in DD-MM-YYYY format.

The response is JSON that includes the whole table. You just need to loop over all the dates in 2021 and parse the JSON data for each one.
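The loop over dates could be sketched roughly as follows, using only the standard library. The endpoint and the currency/endDate parameters are taken from the URL above; the helper names (dates_in_year, build_url, fetch_day, fetch_year) and the one-second pause are my own assumptions, not part of any Nord Pool API. The layout of the returned JSON is not shown here, so you will need to inspect one response yourself to see where the table values live and whether each response covers a single day or several.

```python
import json
import time
from datetime import date, timedelta
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint observed in the browser's dev tools (see the URL above)
BASE_URL = "https://www.nordpoolgroup.com/api/marketdata/page/576"


def dates_in_year(year):
    """Yield every calendar day of `year` as a date object."""
    day = date(year, 1, 1)
    while day.year == year:
        yield day
        day += timedelta(days=1)


def build_url(day):
    """Build the request URL for one day, with endDate in DD-MM-YYYY format."""
    params = {"currency": ",EUR,EUR,EUR", "endDate": day.strftime("%d-%m-%Y")}
    # safe="," keeps the commas unescaped, matching the URL seen in the browser
    return BASE_URL + "?" + urlencode(params, safe=",")


def fetch_day(day):
    """Request one day and return the decoded JSON (structure not assumed here)."""
    with urlopen(build_url(day), timeout=30) as response:
        return json.load(response)


def fetch_year(year, pause=1.0):
    """Fetch every day of `year`; returns a dict mapping 'DD-MM-YYYY' to JSON."""
    results = {}
    for day in dates_in_year(year):
        results[day.strftime("%d-%m-%Y")] = fetch_day(day)
        time.sleep(pause)  # hypothetical politeness delay between requests
    return results
```

Calling fetch_year(2021) would then issue one request per day of 2021. It is probably worth trying fetch_day(date(2021, 1, 1)) first and printing the result to see how the table is laid out before running the full loop.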

User contributions licensed under: CC BY-SA