Skip to content
Advertisement

Count value pairings from different columns in a DataFrame with Pandas

I have a df like this one:

df = pd.DataFrame([["coffee","soda","coffee","water","soda","soda"],["paper","glass","glass","paper","paper","glass"], list('smlssm')]).T
df.columns = ['item','cup','size']

df:

    item    cup    size
0   coffee  paper   s
1   soda    glass   m
2   coffee  glass   l
3   water   paper   s
4   soda    paper   s
5   soda    glass   m

I want to transform this into a df that looks like this

    item    cup    size  freq
0   coffee  paper   s     1
1   coffee  paper   m     0
2   coffee  paper   l     0
3   coffee  glass   s     0
4   coffee  glass   m     0
5   coffee  glass   l     1
6   soda    paper   s     1
7   soda    paper   m     0
8   soda    paper   l     0
9   soda    glass   s     0
10  soda    glass   m     2
11  soda    glass   l     0
.    .       .      .     .
.    .       .      .     .
.    .       .      .     .

So for every item i want a row with the possible combinations of cup and size and an additional row with the frequency.

What is the proper way to do this using pandas?

Advertisement

Answer

Let’s try:

  1. Add a frequency column to the dataframe to indicate individual rows are worth 1 each.

  2. groupby sum to get the current count in the DataFrame.

  3. Create a MultiIndex from the unique values in each column.

  4. Use the new midx to reindex with a fill_value=0 so that freq gets filled with 0 when created by the new index.

  5. reset_index to convert the index back into columns.

# Columns to Reindex
idx_cols = ['item', 'cup', 'size']
# Create MultIndex With Unique Values
midx = pd.MultiIndex.from_product(
    [df[c].unique() for c in idx_cols],
    names=idx_cols
)
df = (
    df.assign(freq=1)  # Add Freq Column initialzed to 1
        .groupby(idx_cols)['freq'].sum()  # Groupby and Sum freq
        .reindex(midx, fill_value=0)  # reindex
        .reset_index()  # reset_index
)

df:

      item    cup size  freq
0   coffee  paper    s     1
1   coffee  paper    m     0
2   coffee  paper    l     0
3   coffee  glass    s     0
4   coffee  glass    m     0
5   coffee  glass    l     1
6     soda  paper    s     1
7     soda  paper    m     0
8     soda  paper    l     0
9     soda  glass    s     0
10    soda  glass    m     2
11    soda  glass    l     0
12   water  paper    s     1
13   water  paper    m     0
14   water  paper    l     0
15   water  glass    s     0
16   water  glass    m     0
17   water  glass    l     0
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement