I’m trying to extract a table from a webpage and have tried a number of alternatives, but the table always comes back empty.
The two attempts I thought most promising are attached below; any way of extracting the data from the page would be helpful. I have also included a screenshot of the table I want to extract.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
 
browser = webdriver.Chrome()
browser.set_window_size(1120, 550)
# Create a URL object
url = 'https://www.flightradar24.com/data/aircraft/ja11jc'
browser.get(url)
element = WebDriverWait(browser, 3).until(
   EC.presence_of_element_located((By.ID, "tbl-datatable"))
)
data = element.get_attribute('tbl-datatable')
print(data)
browser.quit()
or alternatively,
# Import libraries
import requests
from bs4 import BeautifulSoup
import pandas as pd
 
# Create a URL object
url = 'https://www.flightradar24.com/data/aircraft/ja11jc'
# Create object page
page = requests.get(url)
 
# parser-lxml = Change html to Python friendly format
# Obtain page's information
soup = BeautifulSoup(page.text, 'lxml')
soup
 
# Obtain information from tag <table>
table1 = soup.find("table", id='tbl-datatable')
table1
 
# Obtain every title of columns with tag <th>
headers = []
for i in table1.find_all('th'):
    title = i.text
    headers.append(title)

# Create a dataframe
mydata = pd.DataFrame(columns=headers)

# Create a for loop to fill mydata
for j in table1.find_all('tr')[1:]:
    row_data = j.find_all('td')
    row = [i.text for i in row_data]
    length = len(mydata)
    mydata.loc[length] = row
Answer
Best practice, and the first thing to try when scraping table data, is pandas.read_html(); it works in most cases, needs adjustments in some, and fails only in specific ones.
The issue here is that requests needs a user-agent header to avoid the 403 response, so we have to help pandas with that:
pd.read_html(
    requests.get('http://www.flightradar24.com/data/aircraft/ja11jc',
                 headers={'User-Agent': 'some user agent string'}).text
)[0]
Now the table can be scraped, but it has to be transformed a bit, because that is what the browser would do while rendering: .dropna(axis=1) drops columns with NaN values, and [:-1] slices off the last row, which contains non-relevant information:
pd.read_html(
    requests.get('http://www.flightradar24.com/data/aircraft/ja11jc',
                 headers={'User-Agent': 'some user agent string'}).text
)[0].dropna(axis=1)[:-1]
You could also use Selenium, give it a time.sleep(3) while the browser renders the table into its final form, and process driver.page_source, but in my opinion that is a bit too much in this case.
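If you do want to go that route, a minimal sketch could look like the following. The parsing step is split into its own function so it can be tested without a browser; the sleep duration is an assumption, and the cleanup mirrors the requests-based example above.

```python
# Sketch of the Selenium alternative: render the page, then hand the
# resulting HTML to pandas. URL and sleep duration are assumptions.
import time
from io import StringIO

import pandas as pd


def parse_flight_table(html: str) -> pd.DataFrame:
    # read_html returns a list of DataFrames; the flight table is the first.
    df = pd.read_html(StringIO(html))[0]
    # Drop columns with NaN values and slice off the trailing non-data row,
    # as in the requests-based example.
    return df.dropna(axis=1)[:-1]


if __name__ == "__main__":
    from selenium import webdriver

    browser = webdriver.Chrome()
    try:
        browser.get("https://www.flightradar24.com/data/aircraft/ja11jc")
        time.sleep(3)  # crude wait for the table to finish rendering
        print(parse_flight_table(browser.page_source))
    finally:
        browser.quit()
```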
Example
import pandas as pd
import requests
df = pd.read_html(
        requests.get('http://www.flightradar24.com/data/aircraft/ja11jc', 
        headers={'User-Agent': 'some user agent string'}).text
     )[0].dropna(axis=1)[:-1]
df.columns = ['DATE','FROM', 'TO', 'FLIGHT', 'FLIGHT TIME', 'STD', 'ATD', 'STA','STATUS']
df
Output
|  | DATE | FROM | TO | FLIGHT | FLIGHT TIME | STD | ATD | STA | STATUS |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 10 Dec 2022 | Tokunoshima (TKN) | Kagoshima (KOJ) | JL3798 | — | 10:00 | — | 11:10 | Scheduled | 
| 1 | 10 Dec 2022 | Amami (ASJ) | Tokunoshima (TKN) | JL3843 | — | 08:55 | — | 09:30 | Scheduled | 
| … | … | … | … | … | … | … | … | … | … | 
| 58 | 03 Dec 2022 | Amami (ASJ) | Kagoshima (KOJ) | JL3724 | 0:56 | 01:45 | 02:02 | 02:50 | Landed 02:58 | 
| 59 | 03 Dec 2022 | Kagoshima (KOJ) | Amami (ASJ) | JL3725 | 1:06 | 00:00 | 00:09 | 01:15 | Landed 01:14 | 
