Skip to content
Advertisement

split a workbook into different workbooks with worksheets using python pandas

I have a list of transactions from the last 7 years in one big excel file. I m trying to create an excel workbook for each year that includes each months as worksheet.

Im using a column called ‘date’ that has each transactions recorded as MM/DD/YYY. I split that column to single out my years and months but Im stuck on how I can use them to get multiple workbooks (YYYYmoney.xlsx) that contain worksheets for each month.

here’s what I was able to get to but i got stuck when it came to nesting my for loop. Can anyone help?

import pandas as pd


#location of the file you want to work on
 file1 = '.money.xlsx'


#make it a dataframe
df1 = pd.read_excel(file1)

#create 3 columns from splitting the date column
df1[["month", "day", "year"]] = df1["Date"].str.split("/", expand = True)

#list each year and each month to make sure you did it right

each_year = df1['year'].unique()
print(each_year)

each_month = df1['month'].unique()
print(each_month)

#create separate workbook for each year with sheet for each month

for value in each_year:
    df00 = df1[df1['year'] == value]
    output_file_name = str(value) + 'money' + '.xlsx'
    df00.to_excel(output_file_name, index=False)
    for monthly in each_month:    
        writer = pd.ExcelWriter(output_file_name, engine='xlsxwriter')
        df00[monthly].to_excel(writer, sheet_name=monthly, index=False)
        writer.save()
  
print('DataFrame written to Excel File successfully.')

Advertisement

Answer

I know this is a bit late, but perhaps better late than never…

I’m not sure what issue you ran into b/c it doesn’t really say, but I suspect your issue was b/c you created a new writer for each sheet instead of each workbook. You also tried to write all months for all years and didn’t create a new DF for each each year.

Without testing, I can’t say this is 100% working code, but I’d rearrange what you have to something like below. This should get you close.

for value in each_year:
    dfyear = df1[df1['year'] == value]
    output_file_name = str(value)+'money.xlsx'
    writer = pd.ExcelWriter(output_file_name, engine='xlsxwriter')
    each_month = dfyear['month'].unique()
    for month in each_month:    
        dfyear[month].to_excel(writer, sheet_name=str(month), index=False)
    writer.save()

print('DataFrame written to Excel File successfully.')

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement