Scraping a table from a website returns an empty result

Question

I am trying to scrape the main table with tag : from the following website using the 'BeautifulSoup' library, but the code returns an empty list [], even though printing soup returns the HTML string and the request status is 200. I found out that when I use the browser's 'inspect element' tool I can see the table tag but …

Accepted Answer

The table is in the source HTML, but it's kind of hidden and then rendered by JavaScript. It's in one of the <script> tags. You can locate it with bs4 and then parse it with a regex. Finally, the table data can be parsed with json.loads, loaded into a pandas DataFrame, and written to a .csv file. Since I don't know any Persian, you'd have to see if it's of any use. Just by looking at some values, I think it is.

Oh, and this can be done without selenium.

Here's how:

```python
import json
import re

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://www.codal.ir/Reports/Decision.aspx?LetterSerial=T1hETjlDjOQQQaQQQfaL0Mb7uucg%3D%3D&rt=0&let=6&ct=0&ft=-1&sheetId=0"

# Collect every inline JavaScript <script> tag from the page.
# Note: verify=False skips TLS certificate verification.
scripts = BeautifulSoup(
    requests.get(url, verify=False).content,
    "lxml",
).find_all("script", {"type": "text/javascript"})

# The table is assigned to `var datasource = {...}` in the fifth
# script tag from the end; capture the object and parse it as JSON.
table_data = json.loads(
    re.search(r"var datasource = ({.*})", scripts[-5].string).group(1),
)

# Write the cells of the first table on the first sheet to a CSV file.
pd.DataFrame(
    table_data["sheets"][0]["tables"][0]["cells"],
).to_csv("huge_table.csv", index=False)
```

This outputs a huge file, huge_table.csv.
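The code above picks the script by position (scripts[-5]) and uses a greedy regex, so it may break if the page adds or removes a script tag. As a rough sketch of a less position-dependent variant, the snippet below scans every script for the `var datasource =` assignment and lets `json.JSONDecoder.raw_decode` read just the one object that follows it. It uses only the standard library (`html.parser` instead of bs4) and runs on a small made-up page, so `SAMPLE_HTML`, `ScriptCollector`, and `extract_datasource` are hypothetical names, not part of the original answer. It also assumes, as the answer does, that the assigned value is valid JSON.

```python
import json
import re
from html.parser import HTMLParser

# Hypothetical stand-in for the downloaded page; the real page is much larger.
SAMPLE_HTML = """
<html><body>
<script type="text/javascript">var unrelated = 1;</script>
<script type="text/javascript">
var datasource = {"sheets": [{"tables": [{"cells": [
    {"address": "A1", "value": "x"},
    {"address": "B1", "value": "y"}
]}]}]};
var other = {"a": 1};
</script>
</body></html>
"""


class ScriptCollector(HTMLParser):
    """Collects the text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script text may arrive in several chunks, so append to the last entry.
        if self.in_script:
            self.scripts[-1] += data


def extract_datasource(html, var_name="datasource"):
    """Return the object assigned to `var <var_name> = {...}`, or None."""
    collector = ScriptCollector()
    collector.feed(html)
    # The lookahead leaves the match ending exactly at the opening brace.
    pattern = re.compile(r"var\s+" + re.escape(var_name) + r"\s*=\s*(?=\{)")
    decoder = json.JSONDecoder()
    for script in collector.scripts:
        match = pattern.search(script)
        if match:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever follows it in the script.
            obj, _end = decoder.raw_decode(script, match.end())
            return obj
    return None


data = extract_datasource(SAMPLE_HTML)
cells = data["sheets"][0]["tables"][0]["cells"]
print(len(cells), cells[0]["value"])
```

With the real page, you would pass the downloaded HTML text to `extract_datasource` and hand the resulting cells list to `pd.DataFrame` as before.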

Answer