I have a sample dataframe1
City Employee Manager email date Florida city XYZ ABC abc@example.com 2021-03-01 Huston XYZ ABC abc@example.com 2021-03-02 Miami city MNO ABC abc@example.com 2021-03-01 Washington DC KLM XYZ xyz@example.com 2021-03-01 Pittsburg EFG XYZ xyz@example.com 2021-03-01 California HIJ XYZ xyz@example.com 2021-03-01
How do I iterate through the manager column in such a way that rows/data of one manager should be saved in excel and the rows/data of another manager should be saved in another excel?
desired result
first excel data (filename: ABC – XYZ, MNO)
city | employee | manager | email | date ------------------------------------------------------------------------------- Florida city | XYZ | ABC | abc@example.com | 2021-03-01 Huston | XYZ | ABC | abc@example.com | 2021-03-02 Miami city | MNO | ABC | abc@example.com | 2021-03-01
second excel data (filename: XYZ – KLM, EFG, HIJ)
city | employee | manager | email | date --------------------------------------------------------------------------- Washington DC | KLM | XYZ | xyz@example.com | 2021-03-01 Pittsburg | EFG | XYZ | xyz@example.com | 2021-03-01 California | HIJ | XYZ | xyz@example.com | 2021-03-01
Advertisement
Answer
Actually this can be pretty straightforward with groupby
:
#Import import pandas as pd # Set your path path = r'G:xxxxxx' # Groupby 'manager' column and export different csv's with names in manager for x,y in df.groupby('Manager'): y.to_excel(f'{path}{x}.xlsx', index = False)
This will save in your path all the different xlsx
files named with the unique values in your manager column, structured like:
# File 1 ABC City Employee Manager email date 0 Florida city XYZ ABC abc@example.com 2021-03-01 1 Huston XYZ ABC abc@example.com 2021-03-02 2 Miami city MNO ABC abc@example.com 2021-03-01 # File 2 XYZ City Employee Manager email date 3 Washington DC KLM XYZ xyz@example.com 2021-03-01 4 Pittsburg EFG XYZ xyz@example.com 2021-03-01 5 California HIJ XYZ xyz@example.com 2021-03-01