I need to count the occurrences of a certain value (let’s assume it’s 3) across a range of columns for each case (row). To do so, I wrote the script below:
    import pandas as pd
    import numpy as np

    objsourcedf = pd.DataFrame({"a": [1, 2, 2], "b": [3, 1, 1],
                                "c": [3, 2, 1], "d": [4, 3, 8]})
    print(objsourcedf)

    objauxdf = objsourcedf.transpose()
    objauxdf.loc["counts"] = np.sum(objauxdf == 3)

    objsourcedf = objsourcedf.assign(counts=list(objauxdf.loc["counts"]))
    print(objsourcedf)
The first print is:

       a  b  c  d
    0  1  3  3  4
    1  2  1  2  3
    2  2  1  1  8
The second:

       a  b  c  d  counts
    0  1  3  3  4       2
    1  2  1  2  3       1
    2  2  1  1  8       0
Even though it works fine, I am pretty sure there is a more Pythonic way to do this. By ‘Pythonic’ I mean using a native, concise pandas feature with no looping through columns or rows. For example, SPSS has a simple count command, so for this objsourcedf the line would be:
    count counts = a b c d (3).
    execute.
Sadly, as a beginner in Python and pandas, I couldn’t find anything, so I’m asking whether there is a simpler way to count the occurrences.
Answer
I hope this qualifies as “Pythonic”:
    objsourcedf['count'] = objsourcedf.eq(3).sum(axis=1)
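To make the one-liner easier to follow, here is a minimal sketch that splits it into its two steps and also restricts the count to a range of columns, which the question mentions. The column range "b" to "c" and the names mask, counts_all, counts_bc and counts_b_to_c are illustrative assumptions, not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 2], "b": [3, 1, 1],
                   "c": [3, 2, 1], "d": [4, 3, 8]})

# Step 1: boolean mask, True wherever a cell equals 3.
mask = df.eq(3)

# Step 2: summing along axis=1 counts the True values in each row.
counts_all = mask.sum(axis=1)

# To count within a range of columns only, slice by label first.
# Label slices with .loc include both endpoints ("b" and "c" here).
counts_bc = df.loc[:, "b":"c"].eq(3).sum(axis=1)

# Hypothetical column names, chosen for illustration.
df = df.assign(counts=counts_all, counts_b_to_c=counts_bc)
print(df)
```

Slicing before calling eq keeps the comparison limited to the columns of interest, so a 3 in any other column does not affect the result.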