I have the following pandas dataframe:
   ID value
0   1     A
1   1     B
2   1     C
3   2     B
4  10     C
5   4     C
6   4     A
I want to create dummy variables from the values in the column 'value', with one row for each distinct value in the column 'ID'. The result should look like this:
   ID  A  B  C
0   1  1  1  1
1   2  0  1  0
2  10  0  0  1
3   4  1  0  1
How can I do this in Python?
Answer
Use pd.crosstab to count each (ID, value) pair, then cap the counts at 1 with DataFrame.clip:
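import pandas as pd

# count occurrences of each value per ID, then cap the counts at 1 to get 0/1 indicators
df1 = (pd.crosstab(df['ID'], df['value'])
         .clip(upper=1)
         .reset_index()
         .rename_axis(None, axis=1))
print(df1)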
   ID  A  B  C
0   1  1  1  1
1   2  0  1  0
2   4  1  0  1
3  10  0  0  1
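Note that crosstab sorts the result by ID (1, 2, 4, 10), while the desired output in the question lists the IDs in order of first appearance (1, 2, 10, 4). If that order matters, one possible alternative is pd.get_dummies followed by a groupby with sort=False. The sketch below rebuilds the sample frame from the question so it can be run on its own; the name df2 is only illustrative, and the dtype=int argument is an assumption that integer 0/1 columns are wanted rather than booleans.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 10, 4, 4],
    "value": ["A", "B", "C", "B", "C", "C", "A"],
})

# Approach from the answer: crosstab counts, capped at 1 (rows sorted by ID)
df1 = (pd.crosstab(df["ID"], df["value"])
         .clip(upper=1)
         .reset_index()
         .rename_axis(None, axis=1))

# Alternative sketch: one-hot encode 'value', then take the max per ID.
# sort=False keeps the IDs in their order of first appearance.
df2 = (pd.get_dummies(df["value"], dtype=int)
         .groupby(df["ID"], sort=False)
         .max()
         .reset_index())

print(df1)
print(df2)
```

Both frames hold the same 0/1 indicators; only the row order differs.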