Iterating row-wise over 2 pandas dataframes and passing these vectors as args to function

Question

I'd like to iterate row-wise over 2 identically-shaped dataframes, passing the rows from each as vectors to a function without using loops. Essentially something similar to R's mapply. I've investigated a little and the best that I've seen uses map in a list comprehension, but I'm not doing it correctly. Even if we get this to work, though, it seems

Accepted Answer

You can just call helper_func(df1, df2), with helper_func returning stats.norm.pdf(x, loc=y, scale=sd_array).prod(axis=1). Be aware that your scale is so small that the values returned are almost always 0. Using scale=100*sd_array in the PDF will at least show some non-zero values.

In fact, you don't need a dataframe in this example:

    import numpy as np
    from scipy import stats

    np.random.seed(1)
    data1 = np.random.randn(3, 3)
    data2 = np.random.randn(3, 3)
    sd_array = np.array([0.02, 0.015, 0.2])
    C = 100  # for demonstration purposes

    def helper_func(x, y):
        return stats.norm.pdf(x, loc=y, scale=C*sd_array).prod(axis=1)

    res = helper_func(data1, data2)
    print(res)

yields

    array([0.0002616 , 0.00068695, 0.00035566])

But when using a dataframe instead of data1 or data2, NumPy/Pandas/SciPy are flexible enough to recognize the 2D array of values and use it as such.
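To see how this relates to an mapply-style call, here is a minimal sketch that compares the vectorized version against an explicit row-by-row pairing. The names df1, df2, row_func and the column labels are made up for illustration, the frames are filled with random data, and the conversion with .to_numpy() is written out explicitly rather than left to SciPy. The list comprehension is there only as a cross-check, not as the recommended approach.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical example data: two identically shaped frames of random values.
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal((3, 3)), columns=["a", "b", "c"])
df2 = pd.DataFrame(rng.standard_normal((3, 3)), columns=["a", "b", "c"])
sd_array = np.array([0.02, 0.015, 0.2])
C = 100  # same demonstration scaling as in the answer


def row_func(x_row, y_row):
    # Product of the per-column normal densities for one pair of rows.
    return stats.norm.pdf(x_row, loc=y_row, scale=C * sd_array).prod()


# mapply-like: pair up the rows of both frames and call row_func on each pair.
rowwise = np.array(
    [row_func(x, y) for x, y in zip(df1.to_numpy(), df2.to_numpy())]
)

# Vectorized: one call on the full 2D arrays, then a product across columns.
vectorized = stats.norm.pdf(
    df1.to_numpy(), loc=df2.to_numpy(), scale=C * sd_array
).prod(axis=1)

print(np.allclose(rowwise, vectorized))
```

Both paths should give the same three values, one per row. The vectorized call works because sd_array has one entry per column and broadcasts across the rows, so no Python-level loop is needed.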

Answer