I have a list of transactions from the last 7 years in one big excel file. I m trying to create an excel workbook for each year that includes each months as worksheet.
Im using a column called ‘date’ that has each transactions recorded as MM/DD/YYY. I split that column to single out my years and months but Im stuck on how I can use them to get multiple workbooks (YYYYmoney.xlsx) that contain worksheets for each month.
here’s what I was able to get to but i got stuck when it came to nesting my for loop. Can anyone help?
import pandas as pd
#location of the file you want to work on
file1 = '.money.xlsx'
#make it a dataframe
df1 = pd.read_excel(file1)
#create 3 columns from splitting the date column
df1[["month", "day", "year"]] = df1["Date"].str.split("/", expand = True)
#list each year and each month to make sure you did it right
each_year = df1['year'].unique()
print(each_year)
each_month = df1['month'].unique()
print(each_month)
#create separate workbook for each year with sheet for each month
for value in each_year:
df00 = df1[df1['year'] == value]
output_file_name = str(value) + 'money' + '.xlsx'
df00.to_excel(output_file_name, index=False)
for monthly in each_month:
writer = pd.ExcelWriter(output_file_name, engine='xlsxwriter')
df00[monthly].to_excel(writer, sheet_name=monthly, index=False)
writer.save()
print('DataFrame written to Excel File successfully.')
Advertisement
Answer
I know this is a bit late, but perhaps better late than never…
I’m not sure what issue you ran into b/c it doesn’t really say, but I suspect your issue was b/c you created a new writer for each sheet instead of each workbook. You also tried to write all months for all years and didn’t create a new DF for each each year.
Without testing, I can’t say this is 100% working code, but I’d rearrange what you have to something like below. This should get you close.
for value in each_year:
dfyear = df1[df1['year'] == value]
output_file_name = str(value)+'money.xlsx'
writer = pd.ExcelWriter(output_file_name, engine='xlsxwriter')
each_month = dfyear['month'].unique()
for month in each_month:
dfyear[month].to_excel(writer, sheet_name=str(month), index=False)
writer.save()
print('DataFrame written to Excel File successfully.')