I have a DataFrame of shape (4, 3), as follows:
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: x = pd.DataFrame(np.random.randn(4, 3), index=np.arange(4))
In [4]: x
Out[4]:
          0         1         2
0  0.959322  0.099360  1.116337
1 -0.211405 -2.563658 -0.561851
2  0.616312 -1.643927 -0.483673
3  0.235971  0.023823  1.146727
I want to multiply each column of the DataFrame element-wise by a NumPy array of shape (4,):
In [9]: y = np.random.randn(4)
In [10]: y
Out[10]: array([-0.34125522,  1.21567883, -0.12909408,  0.64727577])
In NumPy, the following broadcasting trick works on the underlying array:
In [12]: x.values * y[:, None]
Out[12]:
array([[-0.32737369, -0.03390716, -0.38095588],
       [-0.25700028, -3.11658448, -0.68303043],
       [-0.07956223,  0.21222123,  0.06243928],
       [ 0.15273815,  0.01541983,  0.74224861]])
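For readers unfamiliar with the trick, here is a minimal sketch of what `y[:, None]` does. The seeded generator and the names `a`, `col`, `result`, and `expected` are illustrative assumptions, not part of the session above; `a` merely stands in for `x.values`.

```python
import numpy as np

# Assumption: a fixed seed so the sketch is reproducible; the original
# session used unseeded np.random.randn.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))   # stands in for x.values
y = rng.standard_normal(4)        # shape (4,)

col = y[:, None]                  # adds a trailing axis: shape (4, 1)
result = a * col                  # (4, 3) * (4, 1) broadcasts to (4, 3)

# The same computation spelled out column by column.
expected = np.column_stack([a[:, j] * y for j in range(a.shape[1])])
```

Each column of `a` ends up multiplied element-wise by `y`, which is the behavior wanted for the DataFrame as well.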
However, the same expression does not work when applied directly to the pandas DataFrame; I get the following error:
In [13]: x * y[:, None]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-21d033742c49> in <module>()
----> 1 x * y[:, None]
...
ValueError: Shape of passed values is (1, 4), indices imply (3, 4)
Any suggestions?
Answer
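I found an alternative way to multiply a pandas DataFrame by a NumPy array: call DataFrame.multiply with axis=0, which matches the array against the row index and multiplies every column by it. (The output below does not match Out[12], presumably because x and y came from a different random draw.)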
In [14]: x.multiply(y, axis=0)
Out[14]:
          0         1         2
0  0.195346  0.443061  1.219465
1  0.194664  0.242829  0.180010
2  0.803349  0.091412  0.098843
3  0.365711 -0.388115  0.018941
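As a quick sanity check, the sketch below compares `multiply(..., axis=0)` with the NumPy broadcasting version on the same data. The seeded generator and the variable names `via_pandas`, `via_numpy`, and `via_mul` are assumptions made for illustration; the values will differ from the sessions above.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed for reproducibility; not the data from the question.
rng = np.random.default_rng(0)
x = pd.DataFrame(rng.standard_normal((4, 3)), index=np.arange(4))
y = rng.standard_normal(4)

# axis=0 lines y up with the row index, so each column is scaled by y.
via_pandas = x.multiply(y, axis=0)

# The NumPy broadcasting version from the question, on the raw values.
via_numpy = x.values * y[:, None]

# DataFrame.mul is the same method under a shorter name.
via_mul = x.mul(y, axis=0)

print(np.allclose(via_pandas.values, via_numpy))
```

Unlike the raw NumPy result, `via_pandas` is still a DataFrame, so the index and column labels of `x` are preserved.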