Modify HTML with BeautifulSoup using data from Pandas table

My understanding is that BeautifulSoup is geared more toward extracting data than modifying it, though it can do both. I have a skeleton HTML tree called ‘tree’ and want to insert data from a database query into it. The amount of data inserted is variable. I’m aware of the method BeautifulSoup.new_tag() but am not sure how to use it with multiple data points.

tree

JavaScript

Modify to:

JavaScript

The rows are added according to the table df:

JavaScript

As the HTML shows, there are 2 rows to add to the tag. In this case, let’s say I only want to add group A (though generally I’d want to add all groups). Using Pandas, I can use groupby to create a new table called ‘grouped’ with Group as the index.

grouped

JavaScript

So my pseudocode would be something like this. Let soup =

JavaScript

I understand the above logic, except the internals of how to add new tags with BeautifulSoup.

Answer

You can use DataFrame.groupby and GroupBy.apply to group by the Group column and then create the td tags by looping over the rows of each group.

Say you have the dataframe:

JavaScript

On applying the groupby function we get

JavaScript

Output

[Image: output of the groupby step]
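
As a minimal sketch of the grouping step: only the Group column is named in the question, so the Item and Value columns, the sample rows, and the helper rows_to_html below are assumptions for illustration. Cell text is passed through html.escape so that characters such as < in the data do not break the markup.

```python
import pandas as pd
from html import escape

# Hypothetical data: only "Group" comes from the question;
# "Item" and "Value" are assumed column names.
df = pd.DataFrame(
    {
        "Group": ["A", "A", "B"],
        "Item": ["apple", "avocado", "banana"],
        "Value": [1, 2, 3],
    }
)


def rows_to_html(group: pd.DataFrame) -> str:
    """Build one <tr> per row of the group, with one <td> per column."""
    rows = []
    for item, value in zip(group["Item"], group["Value"]):
        rows.append(
            f"<tr><td>{escape(str(item))}</td><td>{escape(str(value))}</td></tr>"
        )
    return "".join(rows)


# Select the data columns first so the helper never sees the grouping column.
# The result is a Series of HTML strings indexed by Group.
grouped = df.groupby("Group")[["Item", "Value"]].apply(rows_to_html)
```

Here grouped["A"] holds the two rows for group A as one string, and grouped["B"] holds the single row for group B.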

Now, say you want to concatenate the results for all groups. All you need to do is:

JavaScript

which gives us the expected output:

JavaScript
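
One way to do the concatenation, sketched with assumed data: the Series below stands in for the per-group result, and its HTML strings are hypothetical. Joining the values gives the rows for every group, while a lookup by index label gives only group A, as asked in the question.

```python
import pandas as pd

# Hypothetical per-group HTML strings, indexed by group label.
grouped = pd.Series(
    {
        "A": "<tr><td>apple</td></tr><tr><td>avocado</td></tr>",
        "B": "<tr><td>banana</td></tr>",
    }
)

# All groups: iterate over the Series values and join them in index order.
all_rows = "".join(grouped)

# A single group: look it up by its index label.
group_a_rows = grouped.loc["A"]
```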

Now that you have the string, you can insert it into your HTML using a Python formatted string or concatenation.

JavaScript
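
If you would rather stay inside BeautifulSoup, here is a rough sketch of two routes. The skeleton markup, the id "data", and the sample row are all assumptions, since the real ‘tree’ is not shown. The first route parses the prepared string and appends its rows to the target tag; the second builds each tag with new_tag(), which is what the question asked about.

```python
from bs4 import BeautifulSoup

# Hypothetical skeleton; the actual `tree` markup from the question is not shown.
skeleton = "<html><body><table id='data'></table></body></html>"
# Hypothetical row string, as produced by a grouping step.
rows_html = "<tr><td>apple</td><td>1</td></tr>"

# Route 1: parse the prepared string and move its rows into the target tag.
soup = BeautifulSoup(skeleton, "html.parser")
table = soup.find("table", id="data")
fragment = BeautifulSoup(rows_html, "html.parser")
for tr in fragment.find_all("tr"):
    table.append(tr)  # append() detaches the row from `fragment` first

# Route 2: build every tag explicitly with new_tag().
soup2 = BeautifulSoup(skeleton, "html.parser")
table2 = soup2.find("table", id="data")
for item, value in [("apple", 1)]:  # hypothetical row data
    tr = soup2.new_tag("tr")
    for cell in (item, value):
        td = soup2.new_tag("td")
        td.string = str(cell)  # setting .string escapes special characters on output
        tr.append(td)
    table2.append(tr)
```

Both routes leave the table with the same markup. The new_tag() route is longer, but it never handles raw HTML strings, so there is nothing to escape by hand.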
User contributions licensed under: CC BY-SA