
Tag: web-scraping

Pandas’ read_html not reading HTML tables

I am trying to see whether I can use only pandas’ read_html function to scrape HTML tables from the following website: https://www.baseball-reference.com/teams/ATL/2021.shtml. I can meet my needs using Selenium/BeautifulSoup, but I want to see whether I can scrape this site’s tables with pd.read_html alone. Currently, pd.read_html returns the first two tables, but is not able to access tables

For loop with a lot of different URLs

I’m a total novice in Python. After many YouTube videos and tutorials, I’m trying to scrape basketball starting lineups from Flashscore. Here’s an example of a link: https://www.flashscore.it/partita/6PN3pAhq/#informazioni-partita/formazioni As you can see, in the middle there’s a code (6PN3pAhq) that corresponds to a particular match; every match has a different one. I scraped all the results (144 matches at the moment) and
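The excerpt is cut off, but the URL pattern it describes can be sketched as follows. This is only an illustration: the template is taken from the example link above, while the helper name build_lineup_urls and the second match ID are hypothetical placeholders, and no request is actually sent.

```python
# URL template taken from the example link in the question; only the match code varies.
BASE_URL = "https://www.flashscore.it/partita/{match_id}/#informazioni-partita/formazioni"


def build_lineup_urls(match_ids):
    """Return one lineup URL per match ID (hypothetical helper, not from the question)."""
    return [BASE_URL.format(match_id=match_id) for match_id in match_ids]


# "6PN3pAhq" comes from the question; "AbCd1234" is a made-up placeholder.
match_ids = ["6PN3pAhq", "AbCd1234"]
urls = build_lineup_urls(match_ids)

for url in urls:
    # In a real scraper, this is where each page would be loaded and parsed.
    print(url)
```

Keeping the IDs in a list and formatting them into one template means the loop body stays the same however many matches are scraped.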

Text is not printed when using selenium

This is the code I have written so far: This doesn’t print out the price, please help. This is what the output terminal looks like. I want to get this price: Answer: The value of the price is blank. You should replace the trailing span[1] with span[2] in your XPath. Here is the code – Output –
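The code from the question and answer is not preserved in this excerpt. As a rough illustration of why the positional index in an XPath matters, here is a minimal sketch on hypothetical markup. The div/span structure, the class name, and the price value are all assumptions rather than the page from the question, and it uses the standard library's ElementTree instead of Selenium so that it runs without a browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the first <span> is empty, the second one holds the price.
SAMPLE = """
<div class="product">
  <div class="price">
    <span></span>
    <span>19.99</span>
  </div>
</div>
""".strip()


def span_text(root, index):
    """Return the text of the index-th <span> (1-based, as in XPath) under the price div.

    Returns None if there is no such element, and "" if the element has no text.
    """
    node = root.find(f".//div[@class='price']/span[{index}]")
    if node is None:
        return None
    return (node.text or "").strip()


root = ET.fromstring(SAMPLE)
first = span_text(root, 1)   # the empty span: blank text
second = span_text(root, 2)  # the span that actually contains the price

print(repr(first))   # ''
print(repr(second))  # '19.99'
```

With markup shaped like this, selecting span[1] succeeds but yields blank text, which matches the symptom described in the answer; span[2] returns the value.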

Why is Scrapy not following all rules / running all callbacks?

I have two spiders inheriting from a parent spider class as follows: The parse_tournament_page callback for the Rule in the first spider works fine. However, the second spider only runs the parse_tournament callback from the first Rule, despite the fact that the second Rule is the same as in the first spider and is operating on the same page. I’m clearly missing
