Consider the following data frames in Python Pandas:
DataframeA
| ColA | ColB | ColC | 
|---|---|---|
| 1 | dog | 439 | 
| 1 | cat | 932 | 
| 1 | frog | 932 | 
| 2 | dog | 2122 | 
| 2 | cat | 454 | 
| 2 | frog | 773 | 
| 3 | dog | 9223 | 
| 3 | cat | 3012 | 
| 3 | frog | 898 | 
DataframeB
| ColD | ColE | 
|---|---|
| 1 | 101 | 
| 2 | 314 | 
| 3 | 124 | 
To note, ColB just repeats it’s string values as ColA iterates upwards. ColC and ColE are random. ColA and ColD correspond. ColD values will never have repeats (like a map).
I want to divide ColC by ColE wherever ColA == ColD and ideally put the resulting value in a new column in DataframeA (or just have it overwrite ColC). The resulting value should be able to have decimals.
How can I do this in Python Pandas?
Advertisement
Answer
you can use .map
if you have multiple keys to join on then merge would be more useful as demonstrated by @anky
df1['ColF'] = df1['ColC'] / df1['ColA'].map(df2.set_index(['ColD'])['ColE']) ColA ColB ColC ColF 0 1 dog 439 4.346535 1 1 cat 932 9.227723 2 1 frog 932 9.227723 3 2 dog 2122 6.757962 4 2 cat 454 1.445860 5 2 frog 773 2.461783 6 3 dog 9223 74.379032 7 3 cat 3012 24.290323 8 3 frog 898 7.241935
