
BeautifulSoup: extracting a list of td elements from a table

I’m stuck on a BeautifulSoup problem that looks simple but that I can’t seem to solve. I want to loop over the following table and extract each td into a list:

JavaScript

What I need is a dictionary built from some of the elements of each tr, so that I can create a dataframe later. I would like to end up with a list like this:

  • Team: Barcelona
  • Name: Player 1
  • Number: 16
  • Minute: 88
  • Team: Real Madrid
  • Name: Player 2
  • Number: 8
  • Minute: 12

As you can see, there are some tds that I don’t need, and I’d like to skip them in my final df.
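The goal described above (one dictionary per tr, with unwanted tds skipped) can be sketched roughly as follows. The original table markup is not shown here, so the column order, the extra "ignored" cell, and the `wanted` mapping are all assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real table is not shown, so the column order
# and the extra "ignored" cell are assumptions.
html = """
<table>
  <tr><td>Barcelona</td><td>Player 1</td><td>16</td><td>ignored</td><td>88</td></tr>
  <tr><td>Real Madrid</td><td>Player 2</td><td>8</td><td>ignored</td><td>12</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Map each wanted field to its (assumed) cell position; index 3 is skipped.
wanted = {"Team": 0, "Name": 1, "Number": 2, "Minute": 4}

records = []
for row in soup.find_all("tr"):
    cells = row.find_all("td")  # only the cells of this row
    if len(cells) <= max(wanted.values()):
        continue  # header row or a row with too few cells
    records.append({key: cells[i].get_text(strip=True) for key, i in wanted.items()})

print(records)
```

A list of dictionaries like `records` can be passed directly to `pandas.DataFrame(records)` to build the dataframe.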

I’ve tried this code (a simplified example), but it doesn’t work because every row ends up with the name of the first team:

JavaScript

This is the output I get (I would also like to strip the HTML tags, but if I try .text or .get_text() I get the error ‘NoneType’ object has no attribute ‘text’):

JavaScript
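The posted code is not shown here, so the following is only a guess at the cause. Both symptoms are typical of searching the whole document (`soup.find(...)`) inside the loop instead of the current row (`row.find(...)`), and of calling `.text` on a `find()` result that is `None` because the row has no matching cell. A minimal sketch, assuming hypothetical class names `team` and `name` and a hypothetical helper `cell_text`:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the class names and the missing "name" cell in the
# second row are assumptions chosen to reproduce both symptoms.
html = """
<table>
  <tr><td class="team">Barcelona</td><td class="name">Player 1</td></tr>
  <tr><td class="team">Real Madrid</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Searching from `soup` always returns the first match in the document,
# no matter which row the loop is currently on.
first_team = soup.find("td", class_="team").get_text(strip=True)


def cell_text(row, css_class):
    """Return the text of the first matching td in this row, or None."""
    cell = row.find("td", class_=css_class)  # search within the row only
    # find() returns None when nothing matches, so guard before get_text().
    return cell.get_text(strip=True) if cell is not None else None


rows = []
for row in soup.find_all("tr"):
    rows.append({"Team": cell_text(row, "team"), "Name": cell_text(row, "name")})

print(first_team)
print(rows)
```

Scoping the search to `row` gives each record its own team, and the `None` check avoids the `'NoneType' object has no attribute 'text'` error for rows that lack a cell.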

I sense that I’m very close to the solution but I am stuck and I can’t move forward. Thanks in advance for your help!


Answer

If you feel like learning something new, you don’t even need bs4 (well, sort of). All you need is pandas (you get a dataframe out of the box) to get this:

JavaScript

With this:

JavaScript

The code also dumps your table to a .csv file.
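The answer's own code is not preserved here. As a rough sketch of the pandas approach it describes, `pandas.read_html` parses every table in an HTML document into a list of dataframes. It relies on a parser being installed (lxml, or bs4 together with html5lib), which is likely what "well, sort of" refers to. The markup, the column names, and the output filename `players.csv` below are assumptions.

```python
from io import StringIO

import pandas as pd

# Hypothetical markup: the real table is not shown, so the headers and the
# unwanted "Extra" column are assumptions.
html = """
<table>
  <tr><th>Team</th><th>Name</th><th>Number</th><th>Extra</th><th>Minute</th></tr>
  <tr><td>Barcelona</td><td>Player 1</td><td>16</td><td>x</td><td>88</td></tr>
  <tr><td>Real Madrid</td><td>Player 2</td><td>8</td><td>x</td><td>12</td></tr>
</table>
"""

# read_html returns one DataFrame per table found; wrapping the string in
# StringIO avoids the deprecation warning for literal HTML in newer pandas.
tables = pd.read_html(StringIO(html))

# Keep only the wanted columns, dropping the one that should be skipped.
df = tables[0][["Team", "Name", "Number", "Minute"]]

# Write the result to a CSV file (hypothetical filename).
df.to_csv("players.csv", index=False)

print(df)
```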

User contributions licensed under: CC BY-SA