I have this df:
data = {
    'book': [True, False, False, False, False],
    'apple': [False, False, True, False, False],
    'cat': [False, False, False, False, True],
    'pigeon': [False, True, False, False, False],
    'shirt': [False, False, False, True, False],
}
df = pd.DataFrame(data)
Then I want to create a new column, df['category'], whose value in each row depends on the name of the column that is True in that row.
The value of df['category'] for each True column should be as follows:
book - stationery, apple - fruit, cat - animal, pigeon - bird, shirt - clothes
No row has more than one column with a True value.
Expected output:
>>> df
    book  apple    cat  pigeon  shirt    category
0   True  False  False   False  False  stationery
1  False  False  False    True  False        bird
2  False   True  False   False  False       fruit
3  False  False  False   False   True     clothes
4  False  False   True   False  False      animal
Answer
Use idxmax along axis=1 to get the name of the column that holds the True value in each row, then map that name to the corresponding category:
d = {'book': 'stationery', 'pigeon': 'bird', 'apple': 'fruit', 'shirt': 'clothes', 'cat': 'animal'}
df['category'] = df.idxmax(1).map(d)
    book  apple    cat  pigeon  shirt    category
0   True  False  False   False  False  stationery
1  False  False  False    True  False        bird
2  False   True  False   False  False       fruit
3  False  False  False   False   True     clothes
4  False  False   True   False  False      animal
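For reference, here is the same approach as a self-contained script. It is only a sketch. The names flag_cols and has_true are my own, and the where() guard is an extra that the question does not need, since it guarantees one True per row. I included it because idxmax returns the first column label when a row is all False, which would otherwise silently map to 'stationery'.

```python
import pandas as pd

data = {
    'book': [True, False, False, False, False],
    'apple': [False, False, True, False, False],
    'cat': [False, False, False, False, True],
    'pigeon': [False, True, False, False, False],
    'shirt': [False, False, False, True, False],
}
df = pd.DataFrame(data)

# Mapping from column name to category, as given in the question.
d = {'book': 'stationery', 'pigeon': 'bird', 'apple': 'fruit',
     'shirt': 'clothes', 'cat': 'animal'}

# Hypothetical helper name: restrict to the boolean flag columns so the
# script can be re-run after 'category' has been added.
flag_cols = list(data)

# Optional guard (an assumption beyond the question): remember which rows
# actually contain a True, because idxmax on an all-False row would
# return the first column label.
has_true = df[flag_cols].any(axis=1)

# Name of the True column per row, blanked where no True exists,
# then translated to its category.
df['category'] = df[flag_cols].idxmax(axis=1).where(has_true).map(d)

print(df)
```

Rows without any True end up with NaN in category instead of a wrong label. If your data always has exactly one True per row, the one-liner above is enough.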