Iterating through a column and mapping values

Here is what I am trying to do: I want to substitute the values in this data frame.

For example, Bernard should be replaced with 1, Drake with 2, and so on. How can I iterate through the column and write a function that does this?

Answer

The function already exists – pd.factorize.

It returns a tuple: first, an array of integer codes, one per row, giving the value each item has been mapped to; second, an Index of the unique values, in order of first appearance.

import pandas as pd

df = pd.DataFrame({'name': ['Bernard', 'Bernard', 'Drake', 'Drake', 'Lance']})
pd.factorize(df.name)
(array([0, 0, 1, 1, 2]), Index(['Bernard', 'Drake', 'Lance'], dtype='object'))
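
To make the two parts of that tuple concrete, here is a minimal sketch that unpacks them and shows how they relate. The variable names `codes`, `uniques`, and `restored` are just illustrative choices, not anything pandas requires.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bernard', 'Bernard', 'Drake', 'Drake', 'Lance']})

# Unpack the tuple: one integer code per row, plus the unique values
# that those codes index into.
codes, uniques = pd.factorize(df['name'])

# Indexing the uniques with the codes rebuilds the original column,
# which is a quick way to check that the mapping is lossless.
restored = uniques[codes]

print(codes.tolist())
print(list(uniques))
print(restored.tolist() == df['name'].tolist())
```

The codes start at 0 because they are positions in `uniques`.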

Using that, we’d just assign a new column, adding 1 so the codes start at 1 instead of 0:

df = df.assign(codes=pd.factorize(df.name)[0] + 1)
df
      name  codes
0  Bernard      1
1  Bernard      1
2    Drake      2
3    Drake      2
4    Lance      3
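
If you would rather see the name-to-number mapping spelled out, which is closer to what the question describes, one possible alternative is to build a dictionary from the unique values and apply it with Series.map. This is only a sketch of that idea, not part of the factorize approach above, and the name `mapping` is an arbitrary choice.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bernard', 'Bernard', 'Drake', 'Drake', 'Lance']})

# Build a name -> code lookup, numbering from 1 in order of first appearance
# (Series.unique preserves the order in which values first occur).
mapping = {name: code for code, name in enumerate(df['name'].unique(), start=1)}

# Series.map looks up each value of the column in the dictionary.
df['codes'] = df['name'].map(mapping)

print(mapping)
print(df)
```

For this data the result should match the factorize version, and the dictionary can be reused later to encode other columns with the same numbering.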
User contributions licensed under: CC BY-SA