I have a folder named with a certain acronym, and inside this folder you can find a certain number of Excel files. The folder’s name indicates the name of the apartment (for ex. UDC06_45) and, inside this folder, all of the Excel files’ name are composed by: the name of the apartment, followed by the name of the appliance that is located in that apartment (for ex. UDC06_45_Oven).
These Excel files are very simple DataFrames, they contain energy consumption measurements: one column named “timestamps” and one column named “Energy” (all of these measurements have a 15 min frequency). All of the Excel files inside the folder are made with the same identical structure.
My Python code takes as input only one of these Excel files at a time and makes few operations on them (resampling, time interpolation, etc.) starting with the command “pd.read_excel()”, and creates an output Excel file with “df.to_excel()” after giving it a name.
What I want to do is to apply my code automatically to all of the files in that folder. The code should take as input only the name of the folder (“UDC06_45”) and create as many output files as needed. So if the folder contains only two appliances:
- “UDC06_45_Oven”
- “UDC06_45_Fridge”
the code will elaborate them both, one after the other, and I should obtain two dinstinct Excel files as output. Their name is just composed by the input file’s name followed by “_output”:
- “UDC06_45_Oven_output”
- “UDC06_45_Fridge_output”.
In general, this must be done for every Excel file contained in that folder. If the folder contains 5 appliances, meaning 5 input Excel files, I should obtain 5 output Excel files… and so on.
How can I do it?
Advertisement
Answer
In the following code only assing your path, in my case I have used a test folder path path=r'D:test
‘ this code will create a new folder automatically in the same path.
import pandas as pd import os from glob import glob path=r'D:test' # add whatever your path is in place of 'D:test' input_folder='UDC06_45' # name of input folder output_folder=input_folder+'_out' new_path=path+'/'+output_folder if not os.path.exists(new_path): os.makedirs(new_path) files=glob(path+'/'+input_folder+'/'+'*.xlsx') for file in files: name=file.split(path+'/'+input_folder+'\')[-1].rsplit('.')[0] df=pd.read_excel(file) #do all your operations here df.to_excel(new_path+'/'+name+'_output.xlsx')