I have a pandas DataFrame in the following format:
MATERIAL  DATE        HIGH  LOW
AAA       2022-01-01    10    0
AAA       2022-01-02     0    0
AAA       2022-01-03     5    2
BBB       2022-01-01     0    0
BBB       2022-01-02    10    5
BBB       2022-01-03     8    4
I want to transform it so that I end up with the following result:
MATERIAL  HIGH_COUNT  LOW_COUNT
AAA                2          1
BBB                2          2
Essentially, for "HIGH_COUNT" and "LOW_COUNT" I want to count the number of rows in which the corresponding column ("HIGH" or "LOW") is greater than 0, grouped by "MATERIAL".
I have tried df.groupby(['MATERIAL']).agg<xxx>, but I am unsure which agg function to use here.
Edit:
I used
df.groupby(['MATERIAL']).agg({'HIGH': 'count', 'LOW': 'count'})
but this also counts the rows where the value is 0.
Answer
You could create a boolean DataFrame (True where the value is greater than 0) and then use groupby + sum:
out = df[['HIGH', 'LOW']].gt(0).groupby(df['MATERIAL']).sum().add_suffix('_COUNT').reset_index()
Output:
  MATERIAL  HIGH_COUNT  LOW_COUNT
0      AAA           2          1
1      BBB           2          2
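For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample data in the question, so its construction is an assumption about how the real data is stored (for example, DATE is parsed as datetimes here). The second variant, named alt, is not part of the answer above; it is one possible way to get the same counts through agg, which is what the question originally asked about.

```python
import pandas as pd

# Sample data rebuilt from the question (construction is an assumption).
df = pd.DataFrame({
    "MATERIAL": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "DATE": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"] * 2),
    "HIGH": [10, 0, 5, 0, 10, 8],
    "LOW": [0, 0, 2, 0, 5, 4],
})

# Approach from the answer: turn the values into booleans (value > 0),
# then sum the True values within each MATERIAL group.
out = (
    df[["HIGH", "LOW"]]
    .gt(0)
    .groupby(df["MATERIAL"])
    .sum()
    .add_suffix("_COUNT")
    .reset_index()
)

# Alternative sketch using agg: count the positive values per column and group.
alt = (
    df.groupby("MATERIAL")[["HIGH", "LOW"]]
    .agg(lambda s: (s > 0).sum())
    .add_suffix("_COUNT")
    .reset_index()
)

print(out)
print(alt)
```

Summing works here because True counts as 1 and False as 0. The boolean version stays on pandas' built-in aggregation path, while the lambda version calls a Python function for each group and column, so it will probably be slower on large frames.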