
Process columns based on column names in another column

I would like to select cells for processing based on column names stored in another column. For clarity, the input and expected output are given below. For each row, column ‘a’ holds the name of the column whose value should be set to None. I tried the code below but keep getting errors.

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': ['a1', 'a2', 'a4', 'a1'],
                    'a1': [1, 3, 1, 0],
                    'a2': ['9', '3', '1', '4'],
                    'a3': ['8', '5', '4', '6'],
                    'a4': ['8', '5', '3', '3']})

df2 = df1.apply(lambda x: x['a']=None, axis=1)

Input

    a   a1  a2  a3  a4 
0   a1  1   9   8   8
1   a2  3   3   5   5
2   a4  1   1   4   3
3   a1  0   4   6   3

Output

    a   a1   a2   a3 a4
0   a1  None 9    8  8
1   a2  3    None 5  5
2   a4  1    1    4  None
3   a1  None 4    6  3


Answer

Use mask with NumPy broadcasting:

out = df1.mask(df1.a.values[:,None]==df1.columns.values,'None')
Out[80]: 
    a    a1    a2 a3    a4
0  a1  None     9  8     8
1  a2     3  None  5     5
2  a4     1     1  4  None
3  a1  None     4  6     3
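
To see why this works, it may help to build the Boolean mask on its own. The following is a minimal sketch that rebuilds df1 from the question so it runs by itself; the names `names`, `cols` and `m` are illustrative only, and `to_numpy()` is used here in place of `.values` as an assumption about what is safest across pandas versions.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': ['a1', 'a2', 'a4', 'a1'],
                    'a1': [1, 3, 1, 0],
                    'a2': ['9', '3', '1', '4'],
                    'a3': ['8', '5', '4', '6'],
                    'a4': ['8', '5', '3', '3']})

# Column vector of target names, shape (4, 1): one name per row
names = df1['a'].to_numpy()[:, None]
# Row vector of all column labels, shape (5,)
cols = df1.columns.to_numpy()

# Broadcasting compares every row's target name with every column label,
# giving a (4, 5) Boolean array that is True exactly at the cells to blank out
m = names == cols

# mask replaces the cells where m is True; note 'None' here is a string
out = df1.mask(m, 'None')
```

Each row of `m` contains a single True, in the column named by that row's value in ‘a’.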

Alternatively, build the same Boolean mask with np.equal.outer:

m = np.equal.outer(df1.a.values,df1.columns.values)

out = df1.mask(m,'None')
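
As a rough check, the outer comparison and the broadcast comparison should give the same mask. The sketch below also shows a variant that is not part of the answer: calling mask without a replacement value, which fills the selected cells with NaN instead of the string 'None'. The names `m_outer`, `m_bcast` and `out_nan` are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': ['a1', 'a2', 'a4', 'a1'],
                    'a1': [1, 3, 1, 0],
                    'a2': ['9', '3', '1', '4'],
                    'a3': ['8', '5', '4', '6'],
                    'a4': ['8', '5', '3', '3']})

names = df1['a'].to_numpy()
cols = df1.columns.to_numpy()

# Two ways to compare every row's target name with every column label
m_outer = np.equal.outer(names, cols)
m_bcast = names[:, None] == cols

# Both should mark the same cells
same = np.array_equal(m_outer, m_bcast)

# Variant (an assumption about what may be wanted): with no replacement
# value, mask fills the selected cells with NaN, a real missing value
out_nan = df1.mask(m_bcast)
```

With NaN as the fill value, the selected cells are recognised by `isna()`, which the string 'None' is not.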
User contributions licensed under: CC BY-SA