I have a pandas DataFrame in the following format:
MATERIAL  DATE        HIGH  LOW
AAA       2022-01-01    10    0
AAA       2022-01-02     0    0
AAA       2022-01-03     5    2
BBB       2022-01-01     0    0
BBB       2022-01-02    10    5
BBB       2022-01-03     8    4
I am looking to transform it so that I end up with the result below:
MATERIAL  HIGH_COUNT  LOW_COUNT
AAA                2          1
BBB                2          2
Essentially, for "HIGH_COUNT" and "LOW_COUNT" I want to count the number of rows in which the corresponding column ("HIGH" or "LOW") is greater than 0, grouped by "MATERIAL".
I have tried to do df.groupby(['MATERIAL']).agg<xxx> but I am unsure of the agg function to use here.
Edit:
I used
df.groupby(['MATERIAL']).agg({'HIGH': 'count', 'LOW': 'count'})
but this also counts the rows where the value is 0.
Answer
You could create a boolean DataFrame and groupby + sum:
out = df[['HIGH', 'LOW']].gt(0).groupby(df['MATERIAL']).sum().add_suffix('_COUNT').reset_index()
Output:
  MATERIAL  HIGH_COUNT  LOW_COUNT
0      AAA           2          1
1      BBB           2          2
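For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and splits the one-liner into steps. The variable names `positive` and `alt` are mine, not from the original answer, and the `alt` variant (named aggregation with a lambda) is just one other way to get the same counts, not necessarily a better one.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "MATERIAL": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "DATE": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"] * 2),
    "HIGH": [10, 0, 5, 0, 10, 8],
    "LOW": [0, 0, 2, 0, 5, 4],
})

# Step 1: boolean frame, True wherever the value is greater than 0.
positive = df[["HIGH", "LOW"]].gt(0)

# Step 2: group the booleans by MATERIAL and sum them (True counts as 1),
# then rename the columns and turn MATERIAL back into a regular column.
out = (
    positive.groupby(df["MATERIAL"])
    .sum()
    .add_suffix("_COUNT")
    .reset_index()
)

# Alternative (my assumption of an equivalent approach): named aggregation,
# counting the positive values per group with a lambda.
alt = df.groupby("MATERIAL", as_index=False).agg(
    HIGH_COUNT=("HIGH", lambda s: int((s > 0).sum())),
    LOW_COUNT=("LOW", lambda s: int((s > 0).sum())),
)

print(out)
```

The `'count'` aggregation from the question counts every non-null value, which is why the zeros were included; summing a boolean mask counts only the rows that satisfy the condition.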