I want to use Excel files to store data processed with Python. My problem is that I can't add sheets to an existing Excel file. Here is sample code that reproduces the issue:

    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x1 = np.random.randn(100, 2)
    df1 = pd.DataFrame(x1)

    x2 = np.random.randn(100, 2)
    df2 = pd.DataFrame(x2)

    writer = pd.ExcelWriter(path, engine='xlsxwriter')
    df1.to_excel(writer, sheet_name='x1')
    df2.to_excel(writer, sheet_name='x2')
    writer.close()  # close() also saves the file, so a separate save() call is not needed
This code saves two DataFrames to two sheets, named “x1” and “x2” respectively. If I create two new DataFrames and try to use the same code to add two new sheets, ‘x3’ and ‘x4’, the original data is lost.
    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x3 = np.random.randn(100, 2)
    df3 = pd.DataFrame(x3)

    x4 = np.random.randn(100, 2)
    df4 = pd.DataFrame(x4)

    # A new writer on the same path creates a fresh file, so 'x1' and 'x2' are overwritten
    writer = pd.ExcelWriter(path, engine='xlsxwriter')
    df3.to_excel(writer, sheet_name='x3')
    df4.to_excel(writer, sheet_name='x4')
    writer.close()

I want an Excel file with four sheets: 'x1', 'x2', 'x3', 'x4'. I know that 'xlsxwriter' is not the only engine; there is also 'openpyxl'. I have also seen that other people have already written about this issue, but I still can't understand how to do it.

Here is some code taken from this link:

    import pandas
    from openpyxl import load_workbook

    book = load_workbook('Masterfile.xlsx')
    writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

    data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

    writer.save()
They say that it works, but it is hard to figure out how. I don’t understand what “ws.title”, “ws”, and “dict” are in this context.
What is the best way to save "x1" and "x2", close the file, then open it again and add "x3" and "x4"?
Answer
Thank you. I believe that a complete example could be useful for anyone else who has the same issue:

    import pandas as pd
    import numpy as np

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    x1 = np.random.randn(100, 2)
    df1 = pd.DataFrame(x1)

    x2 = np.random.randn(100, 2)
    df2 = pd.DataFrame(x2)

    # Create the workbook with the first two sheets
    writer = pd.ExcelWriter(path, engine='xlsxwriter')
    df1.to_excel(writer, sheet_name='x1')
    df2.to_excel(writer, sheet_name='x2')
    writer.close()
Here I generate an Excel file. From my understanding, it does not really matter whether it is generated via the "xlsxwriter" or the "openpyxl" engine.
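To check that claim, here is a minimal sketch that writes the same two sheets with whichever of the two engines is installed and reads the sheet names back. The temporary directory and the helper name `write_two_sheets` are made up for illustration, and reading the files back assumes openpyxl is available.

```python
import importlib.util
import os
import tempfile

import numpy as np
import pandas as pd


def write_two_sheets(path, engine):
    """Hypothetical helper: write sheets 'x1' and 'x2' with the given engine."""
    with pd.ExcelWriter(path, engine=engine) as writer:
        pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x1")
        pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x2")


# Only try the engines that are actually installed
engines = [e for e in ("xlsxwriter", "openpyxl") if importlib.util.find_spec(e)]

tmp_dir = tempfile.mkdtemp()  # stand-in location, not the real data folder
results = {}
for engine in engines:
    path = os.path.join(tmp_dir, f"PhD_data_{engine}.xlsx")
    write_two_sheets(path, engine)
    # Read the sheet names back to see what each engine produced
    with pd.ExcelFile(path) as xls:
        results[engine] = list(xls.sheet_names)

print(results)
```

If both engines are installed, both entries should list the same two sheets, which matches the idea that the engine used to create the file does not matter for what follows.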
When I want to write without losing the original data, then:

    import pandas as pd
    import numpy as np
    from openpyxl import load_workbook

    path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

    # Load the existing workbook and hand it to the writer so its sheets are kept.
    # Note: newer pandas releases (2.0+) no longer allow assigning writer.book.
    book = load_workbook(path)
    writer = pd.ExcelWriter(path, engine='openpyxl')
    writer.book = book

    x3 = np.random.randn(100, 2)
    df3 = pd.DataFrame(x3)

    x4 = np.random.randn(100, 2)
    df4 = pd.DataFrame(x4)

    df3.to_excel(writer, sheet_name='x3')
    df4.to_excel(writer, sheet_name='x4')
    writer.close()

This code does the job!
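On recent pandas versions the same result can probably be reached without touching `writer.book`, by opening the writer in append mode. This is a minimal sketch, assuming pandas with the openpyxl engine installed (append mode is not supported by xlsxwriter). The temporary path is only a stand-in for the real file.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in path so the sketch does not touch real data
path = os.path.join(tempfile.mkdtemp(), "PhD_data.xlsx")

# Step 1: create the workbook with sheets 'x1' and 'x2'
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x1")
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x2")

# Step 2: reopen the same file in append mode and add 'x3' and 'x4'
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x3")
    pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name="x4")

# Check which sheets the file now contains
with pd.ExcelFile(path) as xls:
    sheet_names = list(xls.sheet_names)

print(sheet_names)
```

The `with` blocks close and save the file automatically. As for the older snippet from the question, `dict((ws.title, ws) for ws in book.worksheets)` builds a dictionary that maps each sheet name (`ws.title`) to its worksheet object (`ws`), which is how the writer was told which sheets already existed.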