I have a dataframe that discretizes the customers into different Q’s, which looks like this:
        CustomerID_num  Q1  Q2  Q3  Q4  Q5         Country
    0            12346   1   0   0   0   0  United Kingdom
    2            12347   0   0   0   0   1         Iceland
    9            12348   0   1   0   0   0         Finland
    13           12349   0   0   0   0   1           Italy
    14           12350   0   1   0   0   0          Norway
What I want to do is add a new column, Q, to the dataframe that shows which sector each customer is in, so that it looks like this:
        CustomerID_num  Q1  Q2  Q3  Q4  Q5  Q         Country
    0            12346   1   0   0   0   0  1  United Kingdom
    2            12347   0   0   0   0   1  5         Iceland
    9            12348   0   1   0   0   0  2         Finland
    13           12349   0   0   0   0   1  5           Italy
    14           12350   0   1   0   0   0  2          Norway
The only way I can think of is a for loop, but that would be messy. Is there another way to do this?
Answer
One option is to drop down into NumPy:
Filter for just the Q columns:
    cols = df.filter(like='Q')
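One thing to watch with `like='Q'` is that it is a substring match, so it would also pick up any other column whose name contains a "Q". The sketch below illustrates this with a made-up `Quantity` column (hypothetical, not part of the data in the question) and shows a stricter `regex` filter as one possible alternative:

```python
import pandas as pd

# Small illustrative frame; "Quantity" is a hypothetical extra column
# that is NOT in the original data, added only to show the pitfall.
df = pd.DataFrame({
    "CustomerID_num": [12346, 12347],
    "Q1": [1, 0],
    "Q5": [0, 1],
    "Quantity": [3, 7],
})

# Substring match: keeps every column whose name contains "Q"
loose = df.filter(like="Q")

# Regex match: keeps only names that are "Q" followed by digits
strict = df.filter(regex=r"^Q\d+$")

print(list(loose.columns))
print(list(strict.columns))
```

With the columns shown in the question, both filters select the same five columns, so `like='Q'` is fine there.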
Get the column positions that are equal to 1:
    _, positions = cols.to_numpy().nonzero()
    df.assign(Q = positions + 1)

        CustomerID_num  Q1  Q2  Q3  Q4  Q5         Country  Q
    0            12346   1   0   0   0   0  United Kingdom  1
    2            12347   0   0   0   0   1         Iceland  5
    9            12348   0   1   0   0   0         Finland  2
    13           12349   0   0   0   0   1           Italy  5
    14           12350   0   1   0   0   0          Norway  2
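For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the steps above. It assumes every row has exactly one 1 across the Q columns, since otherwise `nonzero()` would not return one position per row. The names `result`, `alt` and `ordered` are just illustrative. The `idxmax` line is a pandas-only alternative rather than part of the approach above, and the `insert` call is one way to place Q before Country as in the question's desired layout.

```python
import pandas as pd

# Sample data rebuilt from the question, including its non-contiguous index
df = pd.DataFrame(
    {
        "CustomerID_num": [12346, 12347, 12348, 12349, 12350],
        "Q1": [1, 0, 0, 0, 0],
        "Q2": [0, 0, 1, 0, 1],
        "Q3": [0, 0, 0, 0, 0],
        "Q4": [0, 0, 0, 0, 0],
        "Q5": [0, 1, 0, 1, 0],
        "Country": ["United Kingdom", "Iceland", "Finland", "Italy", "Norway"],
    },
    index=[0, 2, 9, 13, 14],
)

cols = df.filter(like="Q")

# Assumption: exactly one 1 per row, so nonzero() yields one hit per row
assert (cols.sum(axis=1) == 1).all()

# nonzero() returns (row_indices, column_indices) in row-major order;
# the column index plus 1 is the Q number
_, positions = cols.to_numpy().nonzero()
result = df.assign(Q=positions + 1)

# Alternative without NumPy: idxmax gives the label of the first maximum
# in each row ("Q1", "Q5", ...); strip the leading "Q" and cast to int
alt = cols.idxmax(axis=1).str[1:].astype(int)

# Optional: put Q just before Country instead of at the end
ordered = df.copy()
ordered.insert(ordered.columns.get_loc("Country"), "Q", positions + 1)

print(result)
print(ordered)
```

`assign` returns a new dataframe and leaves `df` untouched, whereas `insert` modifies in place, which is why the last step works on a copy.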