How to parse an HTML table that is built from div elements rather than a table element in Python

I am trying to parse the table from this website. I started with just the Username column and, with help I got on Stack Overflow, was able to get the contents of that column with the following code:

JavaScript

which gives me

JavaScript

My ultimate goal is to populate the entire table with the columns [Rank, Grade, Username, Uploads, Followers, Following, Likes].

I have read a few articles on parsing HTML tables in Python with BeautifulSoup and pandas, but those approaches didn't work because the data is not marked up as a table in the page source. What are some alternatives for getting this into a table in Python?

Answer

You can use this code to load the HTML from a file into a BeautifulSoup object and then parse the table into a pandas DataFrame:

JavaScript

Prints:

JavaScript

It also saves the result to data.csv.
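As a rough illustration of the approach described above, here is a minimal sketch. The page's real markup is not shown, so the `div.row` and `div.cell` selectors, the `parse_div_table` helper, and the sample HTML with its values are all hypothetical; inspect the actual page source and adjust the selectors to match.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Column names taken from the question.
COLUMNS = ["Rank", "Grade", "Username", "Uploads", "Followers", "Following", "Likes"]

# Made-up markup for illustration only: one <div class="row"> per table row,
# one <div class="cell"> per value. The real page will use different classes.
SAMPLE_HTML = """
<div class="table">
  <div class="row">
    <div class="cell">1</div><div class="cell">A+</div><div class="cell">alice</div>
    <div class="cell">120</div><div class="cell">3400</div><div class="cell">56</div>
    <div class="cell">7800</div>
  </div>
  <div class="row">
    <div class="cell">2</div><div class="cell">A</div><div class="cell">bob</div>
    <div class="cell">95</div><div class="cell">2100</div><div class="cell">40</div>
    <div class="cell">5300</div>
  </div>
</div>
"""


def parse_div_table(html, row_selector="div.row", cell_selector="div.cell"):
    """Collect the text of each cell, row by row, into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for row in soup.select(row_selector):
        cells = [cell.get_text(strip=True) for cell in row.select(cell_selector)]
        # Skip header or malformed rows that do not have one value per column.
        if len(cells) == len(COLUMNS):
            rows.append(cells)
    return pd.DataFrame(rows, columns=COLUMNS)


if __name__ == "__main__":
    df = parse_div_table(SAMPLE_HTML)
    print(df)
    df.to_csv("data.csv", index=False)
```

All values come back as strings; convert the numeric columns afterwards (for example with `pd.to_numeric`) if you need to sort or aggregate them.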


EDIT: To get the username from the URL:

JavaScript
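
One possible reading of "URL username" is the username as it appears in each profile link. Under that assumption, a minimal sketch takes the last path segment of the link's `href`. The `username_from_url` helper and the example links are hypothetical; in practice the hrefs would come from the anchor tags in the soup (for example `a["href"]`).

```python
from urllib.parse import urlsplit


def username_from_url(url):
    """Return the last non-empty path segment of a profile URL, or None."""
    path = urlsplit(url).path
    segments = [segment for segment in path.split("/") if segment]
    return segments[-1] if segments else None


# Made-up links for illustration; real ones would be read from the page.
hrefs = ["https://example.com/user/alice", "/user/bob/", "https://example.com/"]
usernames = [username_from_url(href) for href in hrefs]
```

This works for both absolute and relative links, and returns `None` when the URL has no path segment to use.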
User contributions licensed under: CC BY-SA