I am trying to map a column of my df with a dictionary. My dictionary contains tuple as key and I want to update an existing column value based on the key. How can I achieve that ?
sample df
column1 column2 column3 column4 column5 None 123 test 999 42
sample dict
{(123, "test", 999):1}
final df
column1 column2 column3 column4 column5 1 123 test 999 42
Advertisement
Answer
Create Series with MultiIndex
by keys, convert columns in same order like keys by DataFrame.set_index
and then use Series.update
:
d = {(123, "test", 999):1} s = pd.Series(d) s1 = df.set_index(['column2','column3','column4'])['column1'] s1.update(s) df = s1.reset_index().reindex(df.columns, axis=1) print (df) column1 column2 column3 column4 0 1 123 test 999
EDIT: If there is multple columns:
print (df) column1 column2 column3 column4 column5 0 None 123 test 999 42 1 None 123 test 99 42 d = {(123, "test", 999):1} s = pd.Series(d) s1 = df.set_index(['column2','column3','column4'])['column1'] s1.update(s) df['column1'] = s1.reset_index()['column1'] print (df) column1 column2 column3 column4 column5 0 1.0 123 test 999 42 1 None 123 test 99 42
EDIT1: If set name
in Series s
, is possible only set MultiIndex:
d = {(123, "test", 999):1} s = pd.Series(d, name='column1') df1 = df.set_index(['column2','column3','column4']) df1.update(s) df = df1.reset_index().reindex(df.columns, axis=1) print (df) column1 column2 column3 column4 column5 0 1.0 123 test 999 42 1 None 123 test 99 42
This solution overwrite existing values in col1
, if need replace missing values NaN
or None
s use:
#because 7 in first column not overwritten value print (df) column1 column2 column3 column4 0 7 123 test 999 d = {(123, "test", 999):1} s = pd.Series(d) df1 = (df.set_index(['column2','column3','column4'])['column1'] .fillna(s) .reset_index() .reindex(df.columns, axis=1)) print (df1) column1 column2 column3 column4 0 7 123 test 999
First solution set 1
:
s1 = df.set_index(['column2','column3','column4'])['column1'] s1.update(s) df2 = s1.reset_index().reindex(df.columns, axis=1) print (df2) column1 column2 column3 column4 0 1 123 test 999