I have a sample dataframe1
City Employee Manager email date Florida city XYZ ABC abc@example.com 2021-03-01 Huston XYZ ABC abc@example.com 2021-03-02 Miami city MNO ABC abc@example.com 2021-03-01 Washington DC KLM XYZ xyz@example.com 2021-03-01 Pittsburg EFG XYZ xyz@example.com 2021-03-01 California HIJ XYZ xyz@example.com 2021-03-01
How do I iterate through the manager column in such a way that rows/data of one manager should be saved in excel and the rows/data of another manager should be saved in another excel?
desired result
first excel data (filename: ABC – XYZ, MNO)
city | employee | manager | email | date ------------------------------------------------------------------------------- Florida city | XYZ | ABC | abc@example.com | 2021-03-01 Huston | XYZ | ABC | abc@example.com | 2021-03-02 Miami city | MNO | ABC | abc@example.com | 2021-03-01
second excel data (filename: XYZ – KLM, EFG, HIJ)
city | employee | manager | email | date --------------------------------------------------------------------------- Washington DC | KLM | XYZ | xyz@example.com | 2021-03-01 Pittsburg | EFG | XYZ | xyz@example.com | 2021-03-01 California | HIJ | XYZ | xyz@example.com | 2021-03-01
Advertisement
Answer
Actually this can be pretty straightforward with groupby:
#Import
import pandas as pd
# Set your path
path = r'G:xxxxxx'
# Groupby 'manager' column and export different csv's with names in manager
for x,y in df.groupby('Manager'):
y.to_excel(f'{path}{x}.xlsx', index = False)
This will save in your path all the different xlsx files named with the unique values in your manager column, structured like:
# File 1
ABC
City Employee Manager email date
0 Florida city XYZ ABC abc@example.com 2021-03-01
1 Huston XYZ ABC abc@example.com 2021-03-02
2 Miami city MNO ABC abc@example.com 2021-03-01
# File 2
XYZ
City Employee Manager email date
3 Washington DC KLM XYZ xyz@example.com 2021-03-01
4 Pittsburg EFG XYZ xyz@example.com 2021-03-01
5 California HIJ XYZ xyz@example.com 2021-03-01