
Using a for loop with BeautifulSoup and if statements to populate a DataFrame

Goal: I want to use BeautifulSoup (bs4) to scrape only the data I need from an HTML file and import it into Excel. The HTML file is heavily formatted, so unfortunately I haven’t been able to adapt the more common solutions to my needs.

What I have tried: I have been able to parse the HTML file down to only the tables I need, and I can detect every cell of data and print it. For example, if there are 18 columns and 3 rows of data, the code prints 54 values, one per table cell, going from row 1 col 1 to row 3 col 18.

My code is as follows:


Example of data output currently achieved

row 1 column 1 (first string in list)
row 1 column 2
row 1 column 3 …
row 3 column 17
row 3 column 18 (last string in list)

The current code creates a single list holding the output shown above, but I can’t figure out how to convert that list into a pandas DataFrame with each item tied to the appropriate row and column. Could anyone suggest how to do this, or how to otherwise rework my code to load this data into a DataFrame?


Answer

It’s all mixed up: your function iserror in fact checks that there is no error (and I don’t think it works at all). What you call tables are rows, and you don’t need to enumerate.
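If the flat list is all you have, it can still be cut into rows before it goes into pandas. This is only a minimal sketch: it assumes every row has exactly 18 cells, and the `flat` list is placeholder data standing in for the scraped strings.

```python
import pandas as pd

N_COLS = 18  # assumption: every row has exactly this many cells

# Placeholder data standing in for the scraped strings (3 rows x 18 columns)
flat = [f"row {r} column {c}" for r in range(1, 4) for c in range(1, N_COLS + 1)]

# Slice the flat list into consecutive chunks of N_COLS items, one chunk per row
rows = [flat[i:i + N_COLS] for i in range(0, len(flat), N_COLS)]

# A list of lists maps directly onto DataFrame rows
df = pd.DataFrame(rows)
print(df.shape)
```

This breaks if any row has a merged or missing cell, which is why collecting the cells row by row during parsing tends to be safer.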

As you haven’t provided the data, I ran only rough tests, but it’s a bit cleaner.
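A rough sketch of the row-by-row idea, not the code originally posted with this answer: loop over the `tr` elements, collect the text of each `td`, and keep only rows that actually contain data cells. The HTML snippet and the column names `name` and `qty` are made up for illustration, since the real file wasn't shared.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real, heavily formatted HTML file
html = """
<table>
  <tr><th>name</th><th>qty</th></tr>
  <tr><td>apple</td><td>3</td></tr>
  <tr><td>pear</td><td>5</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

records = []
for tr in soup.find_all("tr"):  # each <tr> is one row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:  # skip header rows and rows without data cells
        records.append(cells)

# Column names are assumptions; use the real headers from your file
df = pd.DataFrame(records, columns=["name", "qty"])
print(df)
```

From there, `df.to_excel("output.xlsx", index=False)` would write the result to Excel, provided an Excel writer engine such as openpyxl is installed.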

User contributions licensed under: CC BY-SA