
Scraping a table from a website returns an empty result

I am trying to scrape the main table, which has this tag:

JavaScript

from the following website using the BeautifulSoup library, but the code returns an empty list ([]), even though printing soup shows the HTML string and the request status is 200. When I use the browser's "Inspect Element" tool I can see the table tag, but in "View Page Source" the table, which sits inside the app-root tag, is not shown: all you see is an empty <app-root></app-root>. There is also no JSON file among the page's resources to extract the data from. How can I scrape the table data?

JavaScript
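The symptom described above can be reproduced offline with a minimal sketch. The STATIC_HTML string is a hypothetical stand-in for what the server returns, not the real page: an empty app-root placeholder that the browser only fills in after running JavaScript.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the response body: the server sends an empty
# <app-root>, and the browser populates it later by executing JavaScript.
STATIC_HTML = (
    "<html><body>"
    "<app-root></app-root>"
    "<script src='main.js'></script>"
    "</body></html>"
)

soup = BeautifulSoup(STATIC_HTML, "html.parser")

# BeautifulSoup only sees the static markup, so no <table> is found.
tables = soup.find_all("table")
app_root = soup.find("app-root")

print(tables)               # []
print(app_root.get_text())  # empty string: the placeholder has no content
```

This is why a 200 status and a non-empty soup can still yield no table: BeautifulSoup parses the markup it is given and does not execute scripts.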


Answer

The table is in the source HTML, but it is hidden inside one of the <script> tags and then rendered by JavaScript. You can locate that tag with bs4 and extract the data with a regex. The table data can then be parsed with json.loads, loaded into a pandas DataFrame, and written to a .csv file. Since I don't know any Persian, you'd have to check whether the result is of any use.

Just by looking at some values, I think it is.

Oh, and this can be done without selenium.

Here’s how:

JavaScript

This outputs a huge file that looks like this:

[Screenshot of the output file]
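As a minimal sketch of the steps described in this answer (find the right <script> tag with bs4, pull the JSON out with a regex, parse it with json.loads, and hand it to pandas), the following runs against a made-up page. The SAMPLE_HTML string, the window.tableData variable name, the field names, and the extract_rows helper are all assumptions for illustration; the real page will need its own pattern.

```python
import json
import re

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical page source. The real site's script contents, variable name,
# and field names are not known here, so this is only a stand-in.
SAMPLE_HTML = """
<html><body>
<app-root></app-root>
<script>var unrelated = 1;</script>
<script>
  window.tableData = [{"name": "alpha", "value": 1}, {"name": "beta", "value": 2}];
</script>
</body></html>
"""


def extract_rows(html: str) -> list:
    """Return the JSON array embedded in a <script> tag, or [] if not found."""
    soup = BeautifulSoup(html, "html.parser")
    # Non-greedy match up to the first "];" after the assignment. This is
    # enough for the sample; data containing "];" would need a stricter parser.
    pattern = re.compile(r"window\.tableData\s*=\s*(\[.*?\]);", re.DOTALL)
    for script in soup.find_all("script"):
        match = pattern.search(script.string or "")
        if match:
            return json.loads(match.group(1))
    return []


rows = extract_rows(SAMPLE_HTML)
df = pd.DataFrame(rows)

# Render as CSV text; pass a filename to to_csv to write a file instead.
csv_text = df.to_csv(index=False)
print(csv_text)
```

To try this on the live page, one would replace SAMPLE_HTML with the text of the response from requests, adjust the regex to whatever the script actually assigns, and call df.to_csv("table.csv", index=False) to write the file.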

User contributions licensed under: CC BY-SA