This is my first time web scraping with Beautiful Soup, and I wanted to do a small project with hockey since I am a huge fan of the sport. I am a little stuck and am wondering how to retrieve the header names of the stats for each player.
Here is my current code:
    from bs4 import BeautifulSoup
    import requests
    import re
    import pandas as pd

    url = "http://www.espn.com/nhl/statistics/player/_/stat/points/year/2020/seasontype/2"

    page = requests.get(url)

    soup = BeautifulSoup(page.text, 'html.parser')

    allStats = []
    players = soup.find_all('tr', attrs={'class':re.compile('row player')})
    for player in players:
        stats = [stat.get_text() for stat in player.find_all('td')]
        allStats += stats
    body = soup.find_all('div', {"class":"wrapper"})

    print(allStats)

    allColumns = []
    headers = soup.find_all('tr', attrs = {'class': 'colhead'})
    for col in headers:
        columns = [col.get_text() for col in headers.find_all('td')]
        allColumns += columns

    print(allColumns)
I am currently getting an error that says "ResultSet object has no attribute '%s'" for the line:
    headers = soup.find_all('tr', attrs = {'class': 'colhead'})
Eventually, I want to get a list of all the stat names being tracked and use it as the columns of a pandas DataFrame that lists each player and their corresponding stats.
What’s the best way to achieve this?
Thanks for your help!
Answer
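There's a typo in your headers loop, and that's why you're getting the error: inside the list comprehension you call find_all on headers (the whole ResultSet) instead of on the current row. The full message is: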
    AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
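To see where the message comes from, here is a minimal sketch on a small inline HTML string. The markup is made up for illustration and is not copied from the ESPN page. find_all returns a ResultSet, which behaves like a list of Tag objects, so find_all has to be called on each element rather than on the ResultSet itself.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real page is much larger.
html = """
<table>
  <tr class="colhead"><td>RK</td><td>PLAYER</td><td>TEAM</td></tr>
  <tr class="colhead"><td>RK</td><td>PLAYER</td><td>TEAM</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

headers = soup.find_all("tr", attrs={"class": "colhead"})

# The ResultSet is a list subclass and has no find_all of its own,
# so this call raises AttributeError.
try:
    headers.find_all("td")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Each element of the ResultSet is a Tag, and a Tag does support find_all.
first_row_labels = [td.get_text() for td in headers[0].find_all("td")]
print(first_row_labels)
```

Looping over headers and calling find_all on the loop variable follows the same idea for every header row.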
I suppose the expected result is as follows.
    allColumns = []
    headers = soup.find_all('tr', attrs = {'class': 'colhead'})
    for header in headers:
        columns = [head.get_text() for head in header.find_all('td')]
        allColumns += columns
    >>>
    ['', 'PP', 'SH', 'RK', 'PLAYER', 'TEAM', 'GP', 'G', 'A', 'PTS', '+/-', 'PIM', 'PTS/G', 'SOG', 'PCT', 'GWG', 'G', 'A', 'G', 'A', '', 'PP', 'SH', 'RK', 'PLAYER', 'TEAM', 'GP', 'G', 'A', 'PTS', '+/-', 'PIM', 'PTS/G', 'SOG', 'PCT', 'GWG', 'G', 'A', 'G', 'A', '', 'PP', 'SH', 'RK', 'PLAYER', 'TEAM', 'GP', 'G', 'A', 'PTS', '+/-', 'PIM', 'PTS/G', 'SOG', 'PCT', 'GWG', 'G', 'A', 'G', 'A', '', 'PP', 'SH', 'RK', 'PLAYER', 'TEAM', 'GP', 'G', 'A', 'PTS', '+/-', 'PIM', 'PTS/G', 'SOG', 'PCT', 'GWG', 'G', 'A', 'G', 'A']
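Since the end goal is a DataFrame, one possible next step is sketched below. The header list repeats the same 20 labels, so the sketch keeps a single copy, makes the repeated G and A labels distinct, and splits a flat list of cell values into one row per player. It makes two assumptions: that each group-label row starts with an empty string, as in the output above, and that every player row has as many cells as there are labels from 'RK' onward (17 here). The row values are made-up placeholders, not real statistics, and the helper names one_header_row, make_unique, and chunk are hypothetical.

```python
# One repetition of the labels printed above; the output contains it four times.
block = ['', 'PP', 'SH', 'RK', 'PLAYER', 'TEAM', 'GP', 'G', 'A', 'PTS', '+/-',
         'PIM', 'PTS/G', 'SOG', 'PCT', 'GWG', 'G', 'A', 'G', 'A']
all_columns = block * 4


def one_header_row(labels, first_label="RK"):
    """Return a single copy of the column labels, starting at first_label."""
    start = labels.index(first_label)
    # Assumption: the next group-label row begins with an empty string.
    end = labels.index('', start) if '' in labels[start:] else len(labels)
    return labels[start:end]


def make_unique(labels):
    """Suffix repeated labels (G, G_2, G_3) so every column name is distinct."""
    seen = {}
    unique = []
    for label in labels:
        seen[label] = seen.get(label, 0) + 1
        unique.append(label if seen[label] == 1 else f"{label}_{seen[label]}")
    return unique


def chunk(flat, width):
    """Split a flat list of cell values into rows of `width` cells."""
    return [flat[i:i + width] for i in range(0, len(flat), width)]


columns = make_unique(one_header_row(all_columns))

# Placeholder values only, standing in for the flat allStats list.
all_stats = [f"r{r}c{c}" for r in range(2) for c in range(len(columns))]
rows = chunk(all_stats, len(columns))

print(columns)
print(len(rows), len(rows[0]))
```

With pandas installed, pd.DataFrame(rows, columns=columns) would then give one row per player. Whether the real player rows actually line up with these 17 labels should be checked against the page before relying on it.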