I have a dataframe that discretizes the customers into different Q's, which looks like:
    CustomerID_num  Q1  Q2  Q3  Q4  Q5         Country
0            12346   1   0   0   0   0  United Kingdom
2            12347   0   0   0   0   1         Iceland
9            12348   0   1   0   0   0         Finland
13           12349   0   0   0   0   1           Italy
14           12350   0   1   0   0   0          Norway
What I want to do is add a new column, Q, to the dataframe that shows which sector each customer is in, so it looks like:
    CustomerID_num  Q1  Q2  Q3  Q4  Q5  Q         Country
0            12346   1   0   0   0   0  1  United Kingdom
2            12347   0   0   0   0   1  5         Iceland
9            12348   0   1   0   0   0  2         Finland
13           12349   0   0   0   0   1  5           Italy
14           12350   0   1   0   0   0  2          Norway
The only way I can think of is a for loop, but that would be messy. Is there another way to do this?
Answer
One option is to drop down into NumPy:
Filter for just the Q columns:
cols = df.filter(like = 'Q')
Get the column position of the 1 in each row, then add 1 to turn the zero-based position into the Q number:
_, positions = cols.to_numpy().nonzero()
df.assign(Q = positions + 1)

    CustomerID_num  Q1  Q2  Q3  Q4  Q5         Country  Q
0            12346   1   0   0   0   0  United Kingdom  1
2            12347   0   0   0   0   1         Iceland  5
9            12348   0   1   0   0   0         Finland  2
13           12349   0   0   0   0   1           Italy  5
14           12350   0   1   0   0   0          Norway  2
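For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same nonzero approach. Two things in it are my own assumptions rather than part of the answer above: the regex filter (used so that only columns named like Q1..Q5 are picked up, not any column whose name merely contains a "Q"), and the sanity check that every row has exactly one 1. The approach also relies on the Q columns being in order (Q1, Q2, ...), since the number comes from the column position.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame(
    {
        "CustomerID_num": [12346, 12347, 12348, 12349, 12350],
        "Q1": [1, 0, 0, 0, 0],
        "Q2": [0, 0, 1, 0, 1],
        "Q3": [0, 0, 0, 0, 0],
        "Q4": [0, 0, 0, 0, 0],
        "Q5": [0, 1, 0, 1, 0],
        "Country": ["United Kingdom", "Iceland", "Finland", "Italy", "Norway"],
    },
    index=[0, 2, 9, 13, 14],
)

# Assumption: restrict the selection to names of the form Q<number>,
# so a column such as an existing "Q" is not picked up by accident.
cols = df.filter(regex=r"^Q\d+$")
values = cols.to_numpy()

# nonzero() yields one entry per non-zero cell, so the result only lines up
# with the rows if each row has exactly one 1. Check that before assigning.
assert (values.sum(axis=1) == 1).all(), "each row must have exactly one flag set"

# Column positions are zero-based; add 1 to get the Q number.
_, positions = values.nonzero()
result = df.assign(Q=positions + 1)
print(result)
```

If some rows could have no flag or several flags set, the lengths would no longer match and the assignment would fail or misalign, which is why the check comes first.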