Pandas convert dummies to a new column

I have a dataframe that discretizes customers into different Q's, and it looks like this:

    CustomerID_num  Q1  Q2  Q3  Q4  Q5  Country
0   12346           1   0   0   0   0   United Kingdom
2   12347           0   0   0   0   1   Iceland
9   12348           0   1   0   0   0   Finland
13  12349           0   0   0   0   1   Italy
14  12350           0   1   0   0   0   Norway

What I want to do is add a new column, Q, to the dataframe that shows which sector each customer is in, so that it looks like this:

    CustomerID_num  Q1  Q2  Q3  Q4  Q5  Q    Country
0   12346           1   0   0   0   0   1    United Kingdom
2   12347           0   0   0   0   1   5    Iceland
9   12348           0   1   0   0   0   2    Finland
13  12349           0   0   0   0   1   5    Italy
14  12350           0   1   0   0   0   2    Norway

The only way I can think of is a for loop, but that would be messy. Is there another way to do this?

Answer

One option is to drop down into NumPy:

Filter for just the Q columns:

cols = df.filter(like='Q')
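
A caveat worth keeping in mind: `like='Q'` keeps every column whose label contains the substring `Q`, not only `Q1` to `Q5`. The sketch below illustrates this on a small made-up frame; the `Quantity` column is hypothetical and is not part of the question's data. The `regex` pattern shown is one possible stricter selector, under the assumption that the dummy columns are always named `Q` followed by digits.

```python
import pandas as pd

# Hypothetical frame: 'Quantity' is an invented column, added only to show the pitfall.
df = pd.DataFrame(
    {
        "CustomerID_num": [12346, 12347],
        "Q1": [1, 0],
        "Q2": [0, 1],
        "Quantity": [3, 7],
    }
)

# Substring match on column labels: also picks up 'Quantity'.
loose = df.filter(like="Q")

# Regex match: only labels of the form Q<digits> (an assumed naming scheme).
strict = df.filter(regex=r"^Q\d+$")

print(list(loose.columns))
print(list(strict.columns))
```

With the columns shown in the question, both selectors should pick the same five columns, so the stricter form only matters if other labels happen to contain a `Q`.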

Get the column positions of the entries equal to 1 (this assumes each row contains exactly one 1, so the positions line up with the rows):

_, positions = cols.to_numpy().nonzero()
df.assign(Q = positions + 1)
    CustomerID_num  Q1  Q2  Q3  Q4  Q5         Country  Q
0            12346   1   0   0   0   0  United Kingdom  1
2            12347   0   0   0   0   1         Iceland  5
9            12348   0   1   0   0   0         Finland  2
13           12349   0   0   0   0   1           Italy  5
14           12350   0   1   0   0   0          Norway  2
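
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand, applies the NumPy approach above, and checks the one-hot assumption first. The `idxmax` variant is an alternative not taken from the answer; it assumes the dummy columns are named `Q` followed by a number. The names `via_numpy` and `via_idxmax` are illustrative only.

```python
import pandas as pd

# Sample data retyped from the question, including its non-contiguous index.
df = pd.DataFrame(
    {
        "CustomerID_num": [12346, 12347, 12348, 12349, 12350],
        "Q1": [1, 0, 0, 0, 0],
        "Q2": [0, 0, 1, 0, 1],
        "Q3": [0, 0, 0, 0, 0],
        "Q4": [0, 0, 0, 0, 0],
        "Q5": [0, 1, 0, 1, 0],
        "Country": ["United Kingdom", "Iceland", "Finland", "Italy", "Norway"],
    },
    index=[0, 2, 9, 13, 14],
)

cols = df.filter(like="Q")

# Both approaches rely on each row holding exactly one 1.
if not (cols.sum(axis=1) == 1).all():
    raise ValueError("each row must contain exactly one 1")

# NumPy route: nonzero() returns row and column indices in row-major order,
# so with one 1 per row the column positions align with the rows.
_, positions = cols.to_numpy().nonzero()
via_numpy = df.assign(Q=positions + 1)

# Pandas-only alternative: take the label of the row maximum ('Q1'..'Q5')
# and parse the number after the leading 'Q'.
via_idxmax = df.assign(Q=cols.idxmax(axis=1).str[1:].astype(int))

print(via_numpy)
```

Note that `assign` returns a new dataframe and appends `Q` as the last column; assign the result back to `df` if the change should persist.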
User contributions licensed under: CC BY-SA