I am trying to map a column of my df with a dictionary. My dictionary contains tuple as key and I want to update an existing column value based on the key. How can I achieve that ?
sample df
JavaScript
x
3
1
column1 column2 column3 column4 column5
2
None 123 test 999 42
3
sample dict
JavaScript
1
2
1
{(123, "test", 999):1}
2
final df
JavaScript
1
3
1
column1 column2 column3 column4 column5
2
1 123 test 999 42
3
Advertisement
Answer
Create Series with MultiIndex
by keys, convert columns in same order like keys by DataFrame.set_index
and then use Series.update
:
JavaScript
1
11
11
1
d = {(123, "test", 999):1}
2
s = pd.Series(d)
3
4
s1 = df.set_index(['column2','column3','column4'])['column1']
5
s1.update(s)
6
7
df = s1.reset_index().reindex(df.columns, axis=1)
8
print (df)
9
column1 column2 column3 column4
10
0 1 123 test 999
11
EDIT: If there is multple columns:
JavaScript
1
18
18
1
print (df)
2
column1 column2 column3 column4 column5
3
0 None 123 test 999 42
4
1 None 123 test 99 42
5
6
7
d = {(123, "test", 999):1}
8
s = pd.Series(d)
9
10
s1 = df.set_index(['column2','column3','column4'])['column1']
11
s1.update(s)
12
13
df['column1'] = s1.reset_index()['column1']
14
print (df)
15
column1 column2 column3 column4 column5
16
0 1.0 123 test 999 42
17
1 None 123 test 99 42
18
EDIT1: If set name
in Series s
, is possible only set MultiIndex:
JavaScript
1
12
12
1
d = {(123, "test", 999):1}
2
s = pd.Series(d, name='column1')
3
4
df1 = df.set_index(['column2','column3','column4'])
5
df1.update(s)
6
7
df = df1.reset_index().reindex(df.columns, axis=1)
8
print (df)
9
column1 column2 column3 column4 column5
10
0 1.0 123 test 999 42
11
1 None 123 test 99 42
12
This solution overwrite existing values in col1
, if need replace missing values NaN
or None
s use:
JavaScript
1
16
16
1
#because 7 in first column not overwritten value
2
print (df)
3
column1 column2 column3 column4
4
0 7 123 test 999
5
6
d = {(123, "test", 999):1}
7
s = pd.Series(d)
8
9
df1 = (df.set_index(['column2','column3','column4'])['column1']
10
.fillna(s)
11
.reset_index()
12
.reindex(df.columns, axis=1))
13
print (df1)
14
column1 column2 column3 column4
15
0 7 123 test 999
16
First solution set 1
:
JavaScript
1
8
1
s1 = df.set_index(['column2','column3','column4'])['column1']
2
s1.update(s)
3
4
df2 = s1.reset_index().reindex(df.columns, axis=1)
5
print (df2)
6
column1 column2 column3 column4
7
0 1 123 test 999
8