Pandas: Is better aggregation possible?

Question

I have the sample dataframe above. I want to calculate the percentage of True values for each date. I am able to do it as below, but I feel it can be done with groupby + agg. Is it possible?

Accepted Answer

You can do groupby like this:

    df['T/F'].eq('T').groupby([df['Date']]).mean()

Output:

    Date
    01-01-2019    1.0
    02-01-2019    1.0
    03-01-2019    1.0
    04-01-2019    0.0
    05-01-2019    0.0
    Name: T/F, dtype: float64

You can get both percentages for T and F with crosstab:

    pd.crosstab(df.Date, df['T/F'], normalize='index')

Output:

    T/F           F    T
    Date
    01-01-2019  0.0  1.0
    02-01-2019  0.0  1.0
    03-01-2019  0.0  1.0
    04-01-2019  1.0  0.0
    05-01-2019  1.0  0.0

Note 1: Extra comment to your code: The counts per date can be obtained by:

    counts = pd.crosstab(df['Date'], df['T/F'])

Then the percentage of T can be:

    counts['per T'] = counts['T']/counts.sum(axis=1)

Note 2: Don't do groupby().agg({'col1': sum, 'col2': sum}) because:

- sum is Python native, and is slow
- agg is slow(er), and only useful when you want to perform different operations on different columns.

Do:

    groupby()[['col1','col2']].sum()

Note 3: All of the solutions above give the percentage on a 0-1 scale. If you want a 0-100 scale, multiply the result by 100.
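To try the approaches from this answer end to end, here is a minimal sketch. The question's original dataframe is not shown here, so the data below is made up for illustration. The `nums` frame and its `key`, `col1` and `col2` columns are likewise hypothetical stand-ins for the columns mentioned in Note 2.

```python
import pandas as pd

# Hypothetical sample data (assumption: one row per observation,
# with 'T'/'F' stored as strings in the 'T/F' column).
df = pd.DataFrame({
    "Date": ["01-01-2019", "01-01-2019", "02-01-2019", "02-01-2019", "03-01-2019"],
    "T/F": ["T", "F", "T", "T", "F"],
})

# Fraction of 'T' per date: the mean of a boolean Series is the share of True values.
pct_true = df["T/F"].eq("T").groupby(df["Date"]).mean()

# Row-normalised shares of both 'F' and 'T' per date.
shares = pd.crosstab(df["Date"], df["T/F"], normalize="index")

# Same result on a 0-100 scale.
pct_true_100 = pct_true * 100

# Note 2 pattern: select the columns, then call .sum() directly on the groupby.
nums = pd.DataFrame({"key": ["a", "a", "b"], "col1": [1, 2, 3], "col2": [10, 20, 30]})
totals = nums.groupby("key")[["col1", "col2"]].sum()

print(pct_true)
print(shares)
print(totals)
```

With this made-up data, one of the two rows for 01-01-2019 is 'T', so that date comes out as 0.5 rather than the 1.0 shown in the answer's output.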
