Consider the following data frames in Python Pandas:
DataframeA
ColA | ColB | ColC |
---|---|---|
1 | dog | 439 |
1 | cat | 932 |
1 | frog | 932 |
2 | dog | 2122 |
2 | cat | 454 |
2 | frog | 773 |
3 | dog | 9223 |
3 | cat | 3012 |
3 | frog | 898 |
DataframeB
ColD | ColE |
---|---|
1 | 101 |
2 | 314 |
3 | 124 |
To note, ColB just repeats it’s string values as ColA iterates upwards. ColC and ColE are random. ColA and ColD correspond. ColD values will never have repeats (like a map).
I want to divide ColC by ColE wherever ColA == ColD and ideally put the resulting value in a new column in DataframeA (or just have it overwrite ColC). The resulting value should be able to have decimals.
How can I do this in Python Pandas?
Advertisement
Answer
you can use .map
if you have multiple keys to join on then merge
would be more useful as demonstrated by @anky
df1['ColF'] = df1['ColC'] / df1['ColA'].map(df2.set_index(['ColD'])['ColE']) ColA ColB ColC ColF 0 1 dog 439 4.346535 1 1 cat 932 9.227723 2 1 frog 932 9.227723 3 2 dog 2122 6.757962 4 2 cat 454 1.445860 5 2 frog 773 2.461783 6 3 dog 9223 74.379032 7 3 cat 3012 24.290323 8 3 frog 898 7.241935