I have a df like this one:
JavaScript
x
3
1
df = pd.DataFrame([["coffee","soda","coffee","water","soda","soda"],["paper","glass","glass","paper","paper","glass"], list('smlssm')]).T
2
df.columns = ['item','cup','size']
3
df:
JavaScript
1
8
1
item cup size
2
0 coffee paper s
3
1 soda glass m
4
2 coffee glass l
5
3 water paper s
6
4 soda paper s
7
5 soda glass m
8
I want to transform this into a df that looks like this
JavaScript
1
17
17
1
item cup size freq
2
0 coffee paper s 1
3
1 coffee paper m 0
4
2 coffee paper l 0
5
3 coffee glass s 0
6
4 coffee glass m 0
7
5 coffee glass l 1
8
6 soda paper s 1
9
7 soda paper m 0
10
8 soda paper l 0
11
9 soda glass s 0
12
10 soda glass m 2
13
11 soda glass l 0
14
. . . . .
15
. . . . .
16
. . . . .
17
So for every item i want a row with the possible combinations of cup and size and an additional row with the frequency.
What is the proper way to do this using pandas?
Advertisement
Answer
Let’s try:
Add a frequency column to the dataframe to indicate individual rows are worth 1 each.
groupby sum
to get the current count in the DataFrame.Create a MultiIndex from the
unique
values in each column.Use the new
midx
toreindex
with afill_value=0
so that freq gets filled with 0 when created by the new index.reset_index
to convert the index back into columns.
JavaScript
1
14
14
1
# Columns to Reindex
2
idx_cols = ['item', 'cup', 'size']
3
# Create MultIndex With Unique Values
4
midx = pd.MultiIndex.from_product(
5
[df[c].unique() for c in idx_cols],
6
names=idx_cols
7
)
8
df = (
9
df.assign(freq=1) # Add Freq Column initialzed to 1
10
.groupby(idx_cols)['freq'].sum() # Groupby and Sum freq
11
.reindex(midx, fill_value=0) # reindex
12
.reset_index() # reset_index
13
)
14
df
:
JavaScript
1
20
20
1
item cup size freq
2
0 coffee paper s 1
3
1 coffee paper m 0
4
2 coffee paper l 0
5
3 coffee glass s 0
6
4 coffee glass m 0
7
5 coffee glass l 1
8
6 soda paper s 1
9
7 soda paper m 0
10
8 soda paper l 0
11
9 soda glass s 0
12
10 soda glass m 2
13
11 soda glass l 0
14
12 water paper s 1
15
13 water paper m 0
16
14 water paper l 0
17
15 water glass s 0
18
16 water glass m 0
19
17 water glass l 0
20