Pandas’ read_html not reading html tables

I am trying to see if I can use, and only use, Pandas’ read_html function to scrape HTML tables from the following website: https://www.baseball-reference.com/teams/ATL/2021.shtml

I can fulfil my needs using Selenium or BeautifulSoup, but I want to see if I can scrape this site’s tables with pd.read_html alone.

Currently, pd.read_html returns the first two tables, but is not able to access tables past the second table.

Here is an example of a table ‘id’ that I am trying to access: ‘the40man’

And my code, which returns ‘ValueError: No tables found’:

pd.read_html("https://www.baseball-reference.com/teams/ATL/2021.shtml", attrs = {'id': 'the40man'})

The following code returns the first two tables, {‘id’: [‘team_batting’, ‘team_pitching’]}, but nothing more:

pd.read_html("https://www.baseball-reference.com/teams/ATL/2021.shtml")

I am asking this question out of curiosity in case I’m missing something on my end. If not, this issue is likely due to pd.read_html’s limitations.

Thank you in advance for any input/pd.read_html tips!

Answer

The Sports Reference sites (baseball-reference.com and its siblings) keep some of those tables inside HTML comments, so pd.read_html does not see them as tables. To pull those tables out, you first need to extract the comments. Then you can iterate through them to get the table you want:

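from io import StringIO

import requests
from bs4 import BeautifulSoup, Comment
import pandas as pd

url = 'https://www.baseball-reference.com/teams/ATL/2021.shtml'
result = requests.get(url).text
data = BeautifulSoup(result, 'html.parser')

# Collect every HTML comment node on the page.
comments = data.find_all(string=lambda text: isinstance(text, Comment))

tables = []
for each in comments:
    # Only try to parse comments that mention a table at all.
    if 'table' in str(each):
        try:
            # StringIO avoids the literal-string deprecation in newer pandas.
            tables.append(pd.read_html(StringIO(str(each)), attrs={'id': 'the40man'})[0])
            break
        except Exception:
            # This comment holds no table with that id; move on to the next one.
            continue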

Output:

print(tables[0])
    Rk  Uni               Name Unnamed: 3  ...      Ht   Wt           DoB  1stYr
0    1   30        Kyle Wright      us US  ...   6' 4"  215   Oct 2, 1995   2015
1    2    0      William Woods      us US  ...   6' 3"  190  Dec 29, 1998   2018
2    3   51         Will Smith      us US  ...   6' 5"  255  Jul 10, 1989   2008
3    4   68       Tyler Matzek      us US  ...   6' 3"  230  Oct 19, 1990   2010
4    5   64    Tucker Davidson      us US  ...   6' 2"  215  Mar 25, 1996   2016
5    6   62    Touki Toussaint      us US  ...   6' 3"  215  Jun 20, 1996   2014
6    7   65    Spencer Strider      us US  ...   6' 0"  195  Oct 28, 1998   2018
7    8   15       Sean Newcomb      us US  ...   6' 5"  255  Jun 12, 1993   2012
8    9   40        Mike Soroka      ca CA  ...   6' 5"  225   Aug 4, 1997   2015
9   10   54          Max Fried      us US  ...   6' 4"  190  Jan 18, 1994   2012
10  11   77       Luke Jackson      us US  ...   6' 2"  210  Aug 24, 1991   2011
11  12   33        A.J. Minter      us US  ...   6' 0"  215   Sep 2, 1993   2013
12  13    0        Kirby Yates      us US  ...  5' 10"  205  Mar 25, 1987   2009
13  14    0        Jay Jackson      us US  ...   6' 1"  195  Oct 27, 1987   2008
14  15   71         Jacob Webb      us US  ...   6' 2"  210  Aug 15, 1993   2014
15  16   19       Huascar Ynoa      do DO  ...   6' 2"  220  May 28, 1998   2015
16  17   36       Ian Anderson      us US  ...   6' 3"  170   May 2, 1998   2016
17  18    0      Freddy Tarnok      us US  ...   6' 3"  185  Nov 24, 1998   2017
18  19   74          Dylan Lee      us US  ...   6' 3"  214   Aug 1, 1994   2015
19  20    0        Alan Rangel      mx MX  ...   6' 2"  170  Aug 21, 1997   2015
20  21    0      Brooks Wilson      us US  ...   6' 2"  205  Mar 15, 1996   2015
21  22   50     Charlie Morton      us US  ...   6' 5"  215  Nov 12, 1983   2002
22  23   14        Adam Duvall      us US  ...   6' 1"  215   Sep 4, 1988   2010
23  24   24  William Contreras      ve VE  ...   6' 0"  180  Dec 24, 1997   2015
24  25   27       Austin Riley      us US  ...   6' 3"  240   Apr 2, 1997   2015
25  26   16    Travis d'Arnaud      us US  ...   6' 2"  210  Feb 10, 1989   2007
26  27    0   Travis Demeritte      us US  ...   6' 0"  180  Sep 30, 1994   2013
27  28    0     Chadwick Tromp      aw AW  ...   5' 8"  221  Mar 21, 1995   2013
28  29   25     Cristian Pache      do DO  ...   6' 2"  215  Nov 19, 1998   2016
29  30   13   Ronald Acuna Jr.      ve VE  ...   6' 0"  205  Dec 18, 1997   2015
30  31    1       Ozzie Albies      cw CW  ...   5' 8"  165   Jan 7, 1997   2014
31  32    9      Orlando Arcia      ve VE  ...   6' 0"  187   Aug 4, 1994   2011
32  33    7     Dansby Swanson      us US  ...   6' 1"  190  Feb 11, 1994   2013
33  34    0        Drew Waters      us US  ...   6' 2"  185  Dec 30, 1998   2017
34  35   20      Marcell Ozuna      do DO  ...   6' 1"  225  Nov 12, 1990   2008
35  36    0         Manny Pina      ve VE  ...   6' 0"  222   Jun 5, 1987   2005
36  37   38  Guillermo Heredia      cu CU  ...  5' 10"  195  Jan 31, 1991   2009
37  38   66        Kyle Muller      us US  ...   6' 7"  250   Oct 7, 1997   2016
38  Rk  Uni               Name        NaN  ...      Ht   Wt           DoB  1stYr

[39 rows x 14 columns]
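
Since the question asks for something as close to pandas-only as possible, here is a minimal sketch of an alternative: strip the comment delimiters from the page source so the hidden markup becomes ordinary markup, then hand the result to pd.read_html. It also drops rows that just repeat the header, like row 38 in the output above. The function name read_commented_table and the sample_html string are hypothetical, made up for illustration so the example runs offline; on the real site you would pass requests.get(url).text instead, and I have not verified this against the live page.

```python
from io import StringIO

import pandas as pd


def read_commented_table(html: str, table_id: str) -> pd.DataFrame:
    """Return the table with the given id, even if it sits inside an HTML comment."""
    # Removing the comment delimiters turns the commented-out markup into
    # ordinary markup that pd.read_html can parse.
    uncommented = html.replace("<!--", "").replace("-->", "")
    table = pd.read_html(StringIO(uncommented), attrs={"id": table_id})[0]
    # Drop rows that merely repeat the header (first cell equals the first column name).
    first_col = table.columns[0]
    is_header_repeat = table[first_col].astype(str) == str(first_col)
    return table[~is_header_repeat].reset_index(drop=True)


# Hypothetical sample page: one visible table and one hidden inside a comment.
sample_html = """
<html><body>
<table id="team_batting">
<thead><tr><th>Name</th><th>HR</th></tr></thead>
<tbody><tr><td>Player A</td><td>10</td></tr></tbody>
</table>
<!--
<table id="the40man">
<thead><tr><th>Rk</th><th>Name</th></tr></thead>
<tbody>
<tr><td>1</td><td>Player B</td></tr>
<tr><td>2</td><td>Player C</td></tr>
<tr><td>Rk</td><td>Name</td></tr>
</tbody>
</table>
-->
</body></html>
"""

roster = read_commented_table(sample_html, "the40man")
print(roster)
```

Blindly removing every comment delimiter is a blunt tool: it also exposes any other commented-out markup on the page, so the BeautifulSoup approach above is the more careful option if that matters for your page.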
User contributions licensed under: CC BY-SA