This is a time table, columns=hour, rows=weekday, data=subject [weekday x hour]
1 2 3 4 5 6 7 Name Monday Project Project Project Data Science Embedded Systems Data Mining Industrial Psychology Tuesday Project Project Project Project Data Science Industrial Psychology Embedded Systems Wednesday Data Science Project Project Project Project Project Project Thursday Data Mining Industrial Psychology Embedded Systems Data Mining Project Project Project Friday Industrial Psychology Embedded Systems Data Science Data Mining Project Project Project
How do you generate a pandas.Dataframe
where, rows=weekday, columns=subject, data = subject frequency in the corresponding weekday?
Required table: [weekday x subject]
Data Mining, Data Science, Embedded Systems, Industrial Psychology, Project Name Monday 1 1 1 1 3 Tuesday ... Wednesday Thursday Friday
self.file = 'timetable.csv' self.sdf = pd.read_csv(self.file, header=0, index_col="Name") print(self.sdf.to_string()) self.subject_frequency = self.sdf.apply(pd.value_counts) print(self.subject_frequency.to_string()) self.subject_frequency["sum"] = self.subject_frequency.sum(axis=1)
Advertisement
Answer
Use melt
to flatten your dataframe then pivot_table
to reshape your dataframe:
out = ( df.melt(var_name='Freq', value_name='Data', ignore_index=False).assign(variable=1) .pivot_table('Freq', 'Name', 'Data', fill_value=0, aggfunc='count') .loc[df.index] # sort by original index: Monday > Thuesday > ... )
Output:
>>> out Data Data Mining Data Science Embedded Systems Industrial Psychology Project Name Monday 1 1 1 1 3 Tuesday 0 1 1 1 4 Wednesday 0 1 0 0 6 Thursday 2 0 1 1 3 Friday 1 1 1 1 3