Consider the following data frames in Python Pandas:
DataframeA
ColA | ColB | ColC |
---|---|---|
1 | dog | 439 |
1 | cat | 932 |
1 | frog | 932 |
2 | dog | 2122 |
2 | cat | 454 |
2 | frog | 773 |
3 | dog | 9223 |
3 | cat | 3012 |
3 | frog | 898 |
DataframeB
ColD | ColE |
---|---|
1 | 101 |
2 | 314 |
3 | 124 |
To note, ColB just repeats it’s string values as ColA iterates upwards. ColC and ColE are random. ColA and ColD correspond. ColD values will never have repeats (like a map).
I want to divide ColC by ColE wherever ColA == ColD and ideally put the resulting value in a new column in DataframeA (or just have it overwrite ColC). The resulting value should be able to have decimals.
How can I do this in Python Pandas?
Advertisement
Answer
you can use .map
if you have multiple keys to join on then merge
would be more useful as demonstrated by @anky
JavaScript
x
13
13
1
df1['ColF'] = df1['ColC'] / df1['ColA'].map(df2.set_index(['ColD'])['ColE'])
2
3
ColA ColB ColC ColF
4
0 1 dog 439 4.346535
5
1 1 cat 932 9.227723
6
2 1 frog 932 9.227723
7
3 2 dog 2122 6.757962
8
4 2 cat 454 1.445860
9
5 2 frog 773 2.461783
10
6 3 dog 9223 74.379032
11
7 3 cat 3012 24.290323
12
8 3 frog 898 7.241935
13