I want to create a Dataframe with the columns as the Days of the week, and each person’s name and corresponding start/end times. So far I can get the data from the dictionary to the Dataframe, but I am struggling to get the index correct. I managed to get a bit of help from this question Python – how to load nested dictionary into Pandas dataframe?, but I’m not sure how to rearange the indexes to achieve what I need.
I know I need to .drop
the Times
column altogether, but after that I am coming up short of ideas. I’m thinking somehow set row[0]
as the index, but I’m unsure if that would be the best way to do this.
Here’s what I’ve got working so far:
data:
employees = {"Mon":{ "Name":{"0":"John","1":"Nick","2":"Ali","3":"Joseph"}, "Start":{"0":"9a","1":"9a","2":"3p","3":"3p"}, "End":{"0":"5p","1":"5p","2":"11p","3":"11p"}, }, "Tues":{ "Name":{"0":"John","1":"Nick","2":"Ali","3":"Joseph"}, "Start":{"0":"9a","1":"9a","2":"3p","3":"3p"}, "End":{"0":"5p","1":"5p","2":"11p","3":"11p"}, } }
code to create the dataframe:
dict_df = pd.DataFrame.from_dict({(i,j): employees[i][j] for i in employees.keys() for j in employees[i].keys()}, orient='index').reset_index().rename( {'level_0': 'Day', 'level_1': 'Name'}, axis=1)
current output:
Day Name 0 1 2 3 0 Mon Name John Nick Ali Joseph 1 Mon Start 9a 9a 3p 3p 2 Mon End 5p 5p 11p 11p 3 Tues Name John Nick Ali Joseph 4 Tues Start 9a 9a 3p 3p 5 Tues End 5p 5p 11p 11p
required output:
Days Times John Nick Ali Joseph 0 Mon Start 9a 9a 3p 3p 1 Mon End 5p 5p 11p 11p 2 Tues Start 9a 9a 3p 3p 3 Tues End 5p 5p 11p 11p
Advertisement
Answer
There isn’t a column equivalent to set_index
, so you can create a DataFrame from each sub-dictionary and then Transpose and concat
them together in the end. You can then assign the key as the day value.
In the end if you want a MultiIndex just .set_index(['Days', 'Times'])
import pandas as pd df = pd.concat([(pd.DataFrame(d).set_index('Name').T .rename_axis(index='Times', columns=None) .reset_index() .assign(Days=day)) for day,d in employees.items()], ignore_index=True)
print(df) Times John Nick Ali Joseph Days 0 Start 9a 9a 3p 3p Mon 1 End 5p 5p 11p 11p Mon 2 Start 9a 9a 3p 3p Tues 3 End 5p 5p 11p 11p Tues