Create pandas series using a dictionary as mapper

Is there a built-in function to create a pandas.Series column using a dictionary as a mapper together with the index levels of the data frame?

The idea is to create a new column based on values in index levels and a dictionary. For instance:

Suppose we have the following data frame, where id, name and code are different levels of the index:

df

                  col1    col2
id  name  code  
 0    a    x       7       10
           y       8       11
           z       9       12

 1    b    x       13      16
           y       14      17
           z       15      18

and the following dictionary d = {'a': {'y', 'z'}, 'b': {'x'}}
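
For reproducibility, here is a minimal sketch of how this setup could be built. The values and level names come from the table above; building the index with pd.MultiIndex.from_tuples is only one possible way and is an assumption on my part.

```python
import pandas as pd

# Index tuples (id, name, code) read off the table above.
index = pd.MultiIndex.from_tuples(
    [(0, "a", "x"), (0, "a", "y"), (0, "a", "z"),
     (1, "b", "x"), (1, "b", "y"), (1, "b", "z")],
    names=["id", "name", "code"],
)

df = pd.DataFrame(
    {"col1": [7, 8, 9, 13, 14, 15], "col2": [10, 11, 12, 16, 17, 18]},
    index=index,
)

# Maps each name to the set of codes that should be flagged with 1.
d = {"a": {"y", "z"}, "b": {"x"}}
```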

The output of the new column should look like:

                  col1    col2    new
id  name  code  
 0    a    x       7       10      0
           y       8       11      1
           z       9       12      1

 1    b    x       13      16      1
           y       14      17      0
           z       15      18      0

This is the result of a mapping in which new = 1 if the code index value is in the set of values stored in the dictionary under the key given by name, and 0 otherwise.

I was trying to manually make this mapping but I am not sure how to iterate over index levels.

This is my attempt so far:

df['y'] = [1 if i in d[k] else 0 for k, v in d.items() for i
                 in df.index.get_level_values('code')]

But I am getting the following error, which makes me think that I am not iterating over the index levels properly in conjunction with the dictionary:

ValueError: Length of values does not match length of index

Any suggestions?

Answer

Use this for the new column you need:

df['new'] = [1 if j in d[i] else 0 for (i, j) in zip(df.index.get_level_values('name'), df.index.get_level_values('code'))]
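
To see why this works where the attempt in the question does not, here is a self-contained sketch. The frame construction is assumed from the table in the question, and the variable names (names, codes, attempt) are only illustrative. The nested comprehension in the question produces one value per (dictionary key, row) combination, so its length is len(d) * len(df) rather than len(df), which is what triggers the ValueError. zip instead pairs the two index levels row by row.

```python
import pandas as pd

# Frame and dictionary reconstructed from the question (assumed construction).
index = pd.MultiIndex.from_tuples(
    [(0, "a", "x"), (0, "a", "y"), (0, "a", "z"),
     (1, "b", "x"), (1, "b", "y"), (1, "b", "z")],
    names=["id", "name", "code"],
)
df = pd.DataFrame(
    {"col1": [7, 8, 9, 13, 14, 15], "col2": [10, 11, 12, 16, 17, 18]},
    index=index,
)
d = {"a": {"y", "z"}, "b": {"x"}}

names = df.index.get_level_values("name")
codes = df.index.get_level_values("code")

# Nested loops as in the question: every key is combined with every row.
attempt = [1 if c in d[k] else 0 for k in d for c in codes]
print(len(attempt), len(df))  # 12 6

# zip yields exactly one (name, code) pair per row.
df["new"] = [1 if c in d[n] else 0 for n, c in zip(names, codes)]
print(df["new"].tolist())  # [0, 1, 1, 1, 0, 0]
```

Note that d[n] raises a KeyError if a name in the index has no entry in the dictionary; d.get(n, ()) would treat such rows as 0 instead, if that is the behaviour you want.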

Answer