I have a pandas df of the following format:
MATERIAL  DATE        HIGH  LOW
AAA       2022-01-01  10    0
AAA       2022-01-02  0     0
AAA       2022-01-03  5     2
BBB       2022-01-01  0     0
BBB       2022-01-02  10    5
BBB       2022-01-03  8     4
I am looking to transform it so that I end up with the result below:
MATERIAL  HIGH_COUNT  LOW_COUNT
AAA       2           1
BBB       2           2
Essentially, for "HIGH_COUNT" and "LOW_COUNT" I want to count the number of rows in which that column was greater than 0, grouped by "MATERIAL".
I have tried df.groupby(['MATERIAL']).agg<xxx> but I am unsure which agg function to use here.
Edit:
I used df.groupby(['MATERIAL']).agg({'HIGH':'count', 'LOW':'count'}) but this counts even the rows that are 0.
Answer
You could create a boolean DataFrame and use groupby + sum:
out = df[['HIGH', 'LOW']].gt(0).groupby(df['MATERIAL']).sum().add_suffix('_COUNT').reset_index()
Output:
  MATERIAL  HIGH_COUNT  LOW_COUNT
0      AAA           2          1
1      BBB           2          2
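For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (parsing DATE as datetimes is an assumption, since the question does not state the dtype) and runs the approach above. It also shows a variant that stays with agg, using named aggregation and a small lambda; that variant is an alternative I am adding for comparison, not part of the original answer. The reason 'count' did not work is that it counts non-null values, so zeros are included.

```python
import pandas as pd

# Rebuild the sample data from the question.
# Assumption: DATE is parsed as datetime; the counts do not depend on it.
df = pd.DataFrame({
    "MATERIAL": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "DATE": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"] * 2),
    "HIGH": [10, 0, 5, 0, 10, 8],
    "LOW": [0, 0, 2, 0, 5, 4],
})

# Approach from the answer: turn the columns into booleans (value > 0),
# then sum the True values per MATERIAL.
out = (
    df[["HIGH", "LOW"]].gt(0)
    .groupby(df["MATERIAL"]).sum()
    .add_suffix("_COUNT")
    .reset_index()
)

# Alternative sketch: keep groupby + agg and count positives with a lambda.
alt = (
    df.groupby("MATERIAL")
    .agg(
        HIGH_COUNT=("HIGH", lambda s: (s > 0).sum()),
        LOW_COUNT=("LOW", lambda s: (s > 0).sum()),
    )
    .reset_index()
)

print(out)
print(alt)
```

Both produce the same counts on this data. The boolean-then-sum version is usually the better choice on large frames because it avoids calling a Python lambda once per group.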