Input dataframe
JavaScript
x
10
10
1
Date Geo Shipment
2
2020-01-01 USA 1000
3
2020-01-01 BRA 5865
4
2020-01-01 CHN 4789
5
2020-01-02 EU1 6541
6
2020-01-02 EU2 3258
7
..
8
9
dict = {"EU1":["ALA", "BEL", "AND", "AUT"] , "EU2": ["AUT", "BEL", "BGR", "HRV", "CZE"] , "EU3": ["EST", "HRV", "FRA", "DEU"]}
10
How to replace value in column Geo
so that there is a one to many mapping and Shipment
values are duplicated?
Output df
JavaScript
1
18
18
1
Date Geo Shipment
2
2020-01-01 ALA 1000
3
2020-01-01 BEL 1000
4
2020-01-01 AND 1000
5
2020-01-01 AUT 1000
6
..
7
2020-01-01 AUT 5865
8
2020-01-01 BEL 5865
9
2020-01-01 HRV 5865
10
2020-01-01 BGR 5865
11
2020-01-01 CZE 5865
12
..
13
2020-01-01 EST 4789
14
2020-01-01 HRV 4789
15
2020-01-01 FRA 4789
16
2020-01-01 DEU 4789
17
..
18
Advertisement
Answer
Use DataFrame
by cosntructor, then outer join by DataFrame.merge
and reassign column Geo
by DataFrame.pop
:
JavaScript
1
11
11
1
d = {"EU1":["ALA", "BEL", "AND", "AUT"] ,
2
"EU2": ["AUT", "BEL", "BGR", "HRV", "CZE"] ,
3
"EU3": ["EST", "HRV", "FRA", "DEU"]}
4
5
df1 = pd.DataFrame(((k, x) for k, v in d.items() for x in v), columns=['Geo','New'])
6
7
8
df = df.merge(df1, on='Geo', how='outer')
9
df['Geo'] = df.pop('New')
10
df = df.sort_values('Date', ignore_index=True)
11
JavaScript
1
25
25
1
print (df)
2
Date Geo Shipment
3
0 2020-01-01 ALA 1000
4
1 2020-01-01 HRV 4789
5
2 2020-01-01 EST 4789
6
3 2020-01-01 CZE 5865
7
4 2020-01-01 HRV 5865
8
5 2020-01-01 FRA 4789
9
6 2020-01-01 BEL 5865
10
7 2020-01-01 AUT 5865
11
8 2020-01-01 BGR 5865
12
9 2020-01-01 AUT 1000
13
10 2020-01-01 AND 1000
14
11 2020-01-01 BEL 1000
15
12 2020-01-01 DEU 4789
16
13 2020-01-02 AND 6541
17
14 2020-01-02 BEL 6541
18
15 2020-01-02 ALA 6541
19
16 2020-01-02 AUT 3258
20
17 2020-01-02 BEL 3258
21
18 2020-01-02 BGR 3258
22
19 2020-01-02 HRV 3258
23
20 2020-01-02 CZE 3258
24
21 2020-01-02 AUT 6541
25
Solution with new data:
JavaScript
1
23
23
1
d = {"EU1":["ALA", "BEL", "AND", "AUT"] ,
2
"EU2": ["AUT", "BEL", "BGR", "HRV", "CZE"] ,
3
"EU3": ["EST", "HRV", "FRA", "DEU"]}
4
5
df1 = pd.DataFrame(((k, x) for k, v in d.items() for x in v), columns=['Geo','New'])
6
7
df = df.merge(df1, on='Geo', how='left')
8
df['Geo'] = df.pop('New').fillna(df['Geo'])
9
print (df)
10
Date Geo Shipment
11
0 2020-01-01 USA 1000
12
1 2020-01-01 BRA 5865
13
2 2020-01-01 CHN 4789
14
3 2020-01-02 ALA 6541
15
4 2020-01-02 BEL 6541
16
5 2020-01-02 AND 6541
17
6 2020-01-02 AUT 6541
18
7 2020-01-02 AUT 3258
19
8 2020-01-02 BEL 3258
20
9 2020-01-02 BGR 3258
21
10 2020-01-02 HRV 3258
22
11 2020-01-02 CZE 3258
23