
Is there a way to make new columns from cell values and fill them with values from another column?

I am trying to find a way to take one column in a pandas DataFrame and turn each of its unique values into a new column, with the corresponding score as the value in that new column. For example:

Index Product Test Score
0 A Protection 5
1 A Comfort 6
2 B Protection 6
3 B Comfort 7

The end result should look something like this:

Index Product Protection Comfort Test_C Test_D
0 A 5 6 2 1
1 B 6 7 3 8

I am doing this to clean my data for machine learning. Test_C and Test_D were added to show that there are more than two types of test, and which tests are carried out differs from product to product.

I have tried to do it using the pandas.get_dummies method, but was wondering if there is a cleaner way to do this.


Answer

Use pivot():

df.pivot(index = 'Product', columns = 'Test', values = 'Score')

Returns:

Product Comfort Protection
A       6       5
B       7       6
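
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies pivot(). The variable names df and wide are only illustrative, and the data is limited to the four rows shown in the question.

```python
import pandas as pd

# Sample data reconstructed from the question
# (long format: one row per Product/Test pair).
df = pd.DataFrame({
    "Product": ["A", "A", "B", "B"],
    "Test": ["Protection", "Comfort", "Protection", "Comfort"],
    "Score": [5, 6, 6, 7],
})

# Each unique value of 'Test' becomes a column; 'Score' fills the cells.
wide = df.pivot(index="Product", columns="Test", values="Score")
print(wide)
```

Note that pivot() raises a ValueError if the same Product/Test pair appears more than once. In that case pivot_table() with an aggregation function (for example aggfunc='mean') is one possible alternative, depending on how duplicate scores should be combined.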

If you want a numerical index, or want to keep 'Product' as a column instead of the index, add reset_index():

df.pivot(index = 'Product', columns = 'Test', values = 'Score').reset_index()

Returns:

    Product Comfort Protection
0   A       6       5
1   B       7       6
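
Since the question mentions that not every product has every test, here is a small sketch of what happens in that case. The extra 'Test_C' row and its score are made-up illustration data, and the rename_axis() step is an optional addition (not part of the answer above) that removes the leftover 'Test' label from the column axis.

```python
import pandas as pd

# Hypothetical data: product B has an extra test ('Test_C') that A lacks.
df = pd.DataFrame({
    "Product": ["A", "A", "B", "B", "B"],
    "Test": ["Protection", "Comfort", "Protection", "Comfort", "Test_C"],
    "Score": [5, 6, 6, 7, 3],
})

flat = (
    df.pivot(index="Product", columns="Test", values="Score")
      .reset_index()               # move 'Product' back into a regular column
      .rename_axis(columns=None)   # drop the leftover 'Test' label on the column axis
)
print(flat)
```

Product/test combinations that do not exist in the data come out as NaN (here, Test_C for product A), so they may need to be filled or dropped before the data is passed to a machine learning model.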
User contributions licensed under: CC BY-SA