I would like to read a data frame from each URL and return all of them appended into one single data frame. When I print inside the function, I get the result I want. The problem is that when I try to assign the result to a variable, it only contains the final data frame. Running this function prints my desired result:
import pandas as pd

urllist = ['https://basketball.realgm.com/nba/boxscore/2022-04-09/Indiana-at-Philadelphia/388705',
           'https://basketball.realgm.com/nba/boxscore/2022-04-09/New-Orleans-at-Memphis/388704',
           'https://basketball.realgm.com/nba/boxscore/2022-04-09/Golden-State-at-San-Antonio/388706',
           'https://basketball.realgm.com/nba/boxscore/2022-04-09/Sacramento-at-LA-Clippers/388703']

def Boxscore(URL):
    for x in URL:
        box_list = pd.read_html(x)
        box1 = box_list[3]
        box2 = box_list[4]
        fullbox = pd.concat([box1, box2])
        print(fullbox)

Boxscore(urllist)
But when I return the result and assign it to a variable, the variable holds only the final data frame instead of all of them together.
import pandas as pd

urllist = ['https://basketball.realgm.com/nba/boxscore/2022-04-09/Indiana-at-Philadelphia/388705',
           'https://basketball.realgm.com/nba/boxscore/2022-04-09/New-Orleans-at-Memphis/388704',
           'https://basketball.realgm.com/nba/boxscore/2022-04-09/Golden-State-at-San-Antonio/388706',
           'https://basketball.realgm.com/nba/boxscore/2022-04-09/Sacramento-at-LA-Clippers/388703']

def Boxscore(URL):
    for x in URL:
        box_list = pd.read_html(x)
        box1 = box_list[3]
        box2 = box_list[4]
        fullbox = pd.concat([box1, box2])
    return fullbox

fullboxscore = Boxscore(urllist)
print(fullboxscore)
How can I append each data frame into one and assign the combined data frame to a variable? Please help, thanks!
Answer
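In your version, fullbox is overwritten on every pass through the loop, so only the last frame is left when the function returns. Try creating an empty list, appending each frame to it inside the loop, and then calling pd.concat on the list once after the loop: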
import pandas as pd

def Boxscore(URL: list) -> pd.DataFrame:
    dfs = []  # empty list
    for x in URL:
        box_list = pd.read_html(x)
        box1 = box_list[3]
        box2 = box_list[4]
        fullbox = pd.concat([box1, box2])
        dfs.append(fullbox)  # append frame to list
    return pd.concat(dfs).reset_index(drop=True)  # concat frames and return

# call your function and assign it to a variable
fullboxscore = Boxscore(urllist)
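If you also want to tell the games apart after combining them, one option is to tag each frame with a label before appending it. Below is a minimal offline sketch of the same collect-then-concat pattern. The small frames, the `Game` column, and the `combine` helper are all made up for illustration and stand in for whatever `pd.read_html` returns for your pages.

```python
import pandas as pd

# Hypothetical stand-ins for the per-game frames scraped from each URL.
games = {
    "Indiana-at-Philadelphia": pd.DataFrame({"Player": ["A", "B"], "PTS": [10, 20]}),
    "New-Orleans-at-Memphis": pd.DataFrame({"Player": ["C", "D"], "PTS": [5, 15]}),
}


def combine(frames_by_game: dict) -> pd.DataFrame:
    """Collect one labelled frame per game, then concatenate once."""
    dfs = []
    for game, frame in frames_by_game.items():
        # assign() returns a copy with the extra column; the input is untouched
        dfs.append(frame.assign(Game=game))
    # ignore_index=True renumbers the rows 0..n-1 in the combined frame
    return pd.concat(dfs, ignore_index=True)


combined = combine(games)
print(combined)
```

Here `ignore_index=True` does the same job as calling `.reset_index(drop=True)` after the concat, so use whichever reads better to you.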