I am trying to combine dataframes with 2 columns into a single dataframe. The initial dataframes are generated through a for loop and stored in a list. I am having trouble getting the data from the list of dataframes into a single dataframe. Right now when I run my code, it treats each full dataframe as a row.
def linear_reg_function(category):
    df = pd.read_csv(file)
    df = df[df['category_column'] == category]
    df1 = df[['category_column', 'value_column']]
    df_export.append(df1)

df_export = []
for category in category_list:
    linear_reg_function(category)
When I run this block of code, I get a list of dataframes that each have 2 columns. When I try to convert df_export to a dataframe, it ends up with 12 rows (the number of categories in category_list). I tried:
df_export = pd.DataFrame()
but the result was:
_
I would like to have a single dataframe with 2 columns, [Category, Value] that includes the values of all 12 categories generated in the for loop.
Answer
You can use pd.concat to merge a list of DataFrames into a single big DataFrame.
import glob
import pandas as pd

appended_data = []
for infile in glob.glob("*.xlsx"):
    data = pd.read_excel(infile)
    # store DataFrame in list
    appended_data.append(data)
# see pd.concat documentation for more info
appended_data = pd.concat(appended_data)
# write DataFrame to an excel sheet
appended_data.to_excel('appended.xlsx')
You can adapt this to fit your own needs.
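Applied to the setup in the question, it might look something like the sketch below. The sample data, the helper name select_category, and the three-item category_list are made up for illustration (the question reads from a CSV and has 12 categories); ignore_index=True is optional and just renumbers the rows from 0.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV in the question.
df = pd.DataFrame({
    "category_column": ["a", "b", "a", "c", "b"],
    "value_column": [1, 2, 3, 4, 5],
})
category_list = ["a", "b", "c"]  # assumed; the question has 12 categories


def select_category(df, category):
    """Return the two columns of interest for a single category."""
    subset = df[df["category_column"] == category]
    return subset[["category_column", "value_column"]]


# Collect one DataFrame per category, then concatenate once at the end.
frames = [select_category(df, category) for category in category_list]
combined = pd.concat(frames, ignore_index=True)

# Rename to the column names asked for in the question.
combined.columns = ["Category", "Value"]
print(combined)
```

Returning each DataFrame from the function and concatenating once after the loop avoids relying on a global list, and it stacks the rows of every DataFrame instead of treating each DataFrame as one row.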