Given a Pandas DataFrame that has multiple columns with categorical values (0 or 1), is it possible to conveniently get the value_counts for every column at the same time?
For example, suppose I generate a DataFrame as follows:
    import numpy as np
    import pandas as pd
    np.random.seed(0)
    df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd'))
I can get a DataFrame like this:
       a  b  c  d
    0  0  1  1  0
    1  1  1  1  1
    2  1  1  1  0
    3  0  1  0  0
    4  0  0  0  1
    5  0  1  1  0
    6  0  1  1  1
    7  1  0  1  0
    8  1  0  1  1
    9  0  1  1  0
How do I conveniently get the value counts for every column and obtain the following?
       a  b  c  d
    0  6  3  2  6
    1  4  7  8  4
My current solution is:
    pieces = []
    for col in df.columns:
        tmp_series = df[col].value_counts()
        tmp_series.name = col
        pieces.append(tmp_series)
    df_value_counts = pd.concat(pieces, axis=1)
But there must be a simpler way, such as stacking, pivoting, or groupby. Is there one?
Answer
Just call apply and pass pd.Series.value_counts:
    In [212]:
    df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd'))
    df.apply(pd.Series.value_counts)

    Out[212]:
       a  b  c  d
    0  4  6  4  3
    1  6  4  6  7
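The counts in the answer differ from the table in the question because the answer's DataFrame was generated without a seed. As a minimal sketch, the snippet below rebuilds the question's DataFrame from the values it prints and applies the same call, which should reproduce the counts the question asks for. The extra column `e` and the names `counts` and `counts_filled` are hypothetical, added only to illustrate one caveat: when a value never occurs in a column, its count comes back as NaN, and one possible way to handle that is `fillna(0)` followed by a cast to int.

```python
import numpy as np
import pandas as pd

# Rebuild the question's DataFrame from the values it prints, so the result
# does not depend on the random number generator.
df = pd.DataFrame(
    {
        "a": [0, 1, 1, 0, 0, 0, 0, 1, 1, 0],
        "b": [1, 1, 1, 1, 0, 1, 1, 0, 0, 1],
        "c": [1, 1, 1, 0, 0, 1, 1, 1, 1, 1],
        "d": [0, 1, 0, 0, 1, 0, 1, 0, 1, 0],
    }
)

# Apply value_counts to each column; the per-column results are aligned
# on the distinct values (0 and 1), which become the row index.
counts = df.apply(pd.Series.value_counts).sort_index()
print(counts)

# Hypothetical extra column "e" that never contains 0: its count for 0 is
# missing (NaN), so fill with 0 and cast back to integers.
df["e"] = 1
counts_filled = df.apply(pd.Series.value_counts).fillna(0).astype(int).sort_index()
print(counts_filled)
```

The `sort_index()` call is there only to make the row order explicit; it is not required for the counts themselves.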