I am trying to scrape data from the HTML tables on this page and export it to a CSV.
The only success I've had is with extracting the headers. I thought the problem might be that the page is not fully loading before the data is scraped, hence my use of the 'requests_html' library, but the issue persists.
Here's the code I am using:
import requests
from bs4 import BeautifulSoup
import csv
from requests_html import HTMLSession

url = 'https://www.fidelitypensionmanagers.com/Home/PriceHistory'

s = HTMLSession()
r = s.get(url)
r.html.render(sleep=2)

soup = BeautifulSoup(r.content.decode('utf8'), "lxml")
table = soup.find("table", {"id": "fund-I"})  # to select the right table

# find all rows
rows = table.findAll('tr')

# strip the header from rows
headers = rows[0]
header_text = []

# add the table header text to array
for th in headers.findAll('th'):
    header_text.append(th.text)

# init row text array
row_text_array = []

# loop through rows and add row text to array
for row in rows[1:]:
    row_text = []
    # loop through the elements
    for row_element in row.findAll(['th', 'td']):
        # append the array with the elements inner text
        row_text.append(row_element.text.replace('\n', '').strip())
    # append the text array to the row text array
    row_text_array.append(row_text)

with open("out.csv", "w") as f:
    wr = csv.writer(f)
    wr.writerow(header_text)
    for row_text_single in row_text_array:
        wr.writerow(row_text_single)
Any help would be appreciated, thanks.
Answer
Your assumption is correct: the contents are only loaded when the corresponding accordion sections are opened, so you should start there and retrieve the data via these separate requests.
Simply change the value of the id parameter in https://www.fidelitypensionmanagers.com/Services/GetPriceHistory?id=1&format=json. You can get the ids from the onclick attributes:
<div class="accordion" id="accordionOne">
    <div class="card" onclick="getUnitPrices(12)"></div>
    <div class="card" onclick="getUnitPrices(1)"></div>
    <div class="card" onclick="getUnitPrices(13)"></div>
    <div class="card" onclick="getUnitPrices(2)"></div>
    <div class="card" onclick="getUnitPrices(9)"></div>
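If you want to collect every fund automatically, you can scrape those ids first and then call the JSON endpoint for each one. Here is a minimal sketch, assuming the accordion markup is present in the static HTML as shown above; the helper names fetch_fund_ids and fetch_price_history are mine, not part of the site:

import re
import requests
from bs4 import BeautifulSoup

BASE = 'https://www.fidelitypensionmanagers.com'

def fetch_fund_ids():
    # parse the overview page and pull every id passed to getUnitPrices(...)
    html = requests.get(f'{BASE}/Home/PriceHistory').text
    soup = BeautifulSoup(html, 'lxml')
    ids = []
    for card in soup.select('#accordionOne .card[onclick]'):
        match = re.search(r'getUnitPrices\((\d+)\)', card['onclick'])
        if match:
            ids.append(int(match.group(1)))
    return ids

def fetch_price_history(fund_id):
    # each fund's price history is served as JSON by this endpoint
    url = f'{BASE}/Services/GetPriceHistory'
    return requests.get(url, params={'id': fund_id, 'format': 'json'}).json()

if __name__ == '__main__':
    for fund_id in fetch_fund_ids():
        rows = fetch_price_history(fund_id)
        print(fund_id, len(rows), 'rows')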
Example
The simplest approach, in my opinion, would be to use pandas:
import pandas as pd

pd.read_json('https://www.fidelitypensionmanagers.com/Services/GetPriceHistory?id=1&format=json').to_csv('myfile.csv', index=False)
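If you want a separate CSV per fund, the same call works in a loop over the ids taken from the onclick handlers above (the filename pattern is just an example):

import pandas as pd

# ids taken from the onclick handlers shown above
for fund_id in (12, 1, 13, 2, 9):
    url = f'https://www.fidelitypensionmanagers.com/Services/GetPriceHistory?id={fund_id}&format=json'
    pd.read_json(url).to_csv(f'fund_{fund_id}.csv', index=False)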
Output
SCHEME_ID,VALUATION_DATE,BID_PRICE_BF,OFFER_PRICE_BF,TOTAL_UNITS,ASSET_VALUE,STOCKS_VALUE,UNQUOTED_VALUE,UNQUOTED_APPR,UNINVESTED_CASH,UNDISTR_INCOME,MM_VALUE,STAMP_DUTIES,NSE_CSCS,BROKERAGE_FEES,TOTAL_BID_VALUE,TOTAL_OFFER_VALUE,BID_PRICE,OFFER_PRICE,APPR_TO_DATE,ID,PRICE_MOVEMENT,PRICE_CHANGE
1,/Date(1665644400000)/,0,0,22197476839.3793,93586560820.23,6562586060.8,0,0,395651878.65,0,9143340045.7,0,0,0,93586782102.51,93586782102.51,4.2161,4.2161,105797375.82,77484,,
1,/Date(1665558000000)/,0,0,22178176379.013,93502175535.94,6549468002.14,0,0,274281137.71,0,9179939634.67,0,0,0,93503191613.92,93503191613.92,4.216,4.216,92679317.16,77463,,
1,/Date(1665471600000)/,0,0,22173955797.4287,93469684659.81,6557409158.97,0,0,251510018.56,0,9177109809.92,0,0,0,93469875872.9,93469875872.9,4.2153,4.2153,100620473.99,77453,,
1,/Date(1665385200000)/,0,0,22164552155.3263,93418356788.97,6565624234.22,0,0,168155797.81,0,9174279985.2,0,0,0,93419154424.27,93419154424.27,4.2148,4.2148,108835549.24,77413,,
1,/Date(1665298800000)/,0,0,22164552155.3263,93396581110.67,6565624234.22,0,0,168155797.81,0,9171485395.41,0,0,0,93396989872.11,93396989872.11,4.2138,4.2138,108835549.24,77393,,
1,/Date(1665212400000)/,0,0,22164552155.3263,93374808849.27,6565624234.22,0,0,168155797.81,0,9168690805.6,0,0,0,93374825319.96,93374825319.96,4.2128,4.2128,108835549.24,77373,,
1,/Date(1665126000000)/,0,0,22164552155.3263,93353040003.64,6565624234.22,0,0,168155797.81,0,9165896215.82,0,0,0,93352660767.8,93352660767.8,4.2118,4.2118,108835549.24,77353,,
...
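As you can see, the VALUATION_DATE column is serialized in ASP.NET's /Date(milliseconds)/ form. If you want readable dates in the CSV, here is a small sketch that strips the wrapper and converts the epoch milliseconds, assuming the column stays a plain string after read_json as in the output above:

import pandas as pd

url = 'https://www.fidelitypensionmanagers.com/Services/GetPriceHistory?id=1&format=json'
df = pd.read_json(url)

# VALUATION_DATE arrives as '/Date(1665644400000)/' (epoch milliseconds);
# pull out the digits and convert them before writing the CSV
ms = df['VALUATION_DATE'].str.extract(r'(\d+)')[0].astype('int64')
df['VALUATION_DATE'] = pd.to_datetime(ms, unit='ms')

df.to_csv('myfile.csv', index=False)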