How to set a numpy array in a pandas data frame cell?

I have a pandas DataFrame, and I want to fill some of its cells with NumPy arrays, but I get the following ValueError:

ValueError: could not broadcast input array from shape (10,) into shape (1,)

In real life I will not be filling the cells with a zero array; this is simplified example code that reproduces the error:

import pandas as pd
import numpy as np

# DataFrame.append was removed in pandas 2.0, so the rows are passed to the constructor.
# dtype=object keeps the columns able to hold arbitrary Python objects.
df = pd.DataFrame(
    [
        {'name1': 'aaaa', 'name2': 'bbbb', 'array1': np.nan, 'array2': np.nan},
        {'name1': 'cccc', 'name2': 'dddd', 'array1': np.nan, 'array2': np.nan},
    ],
    columns=['name1', 'name2', 'array1', 'array2'],
    dtype=object,
)

# This assignment raises the ValueError
df.loc[((df['name1']=='aaaa') & (df['name2']=='bbbb')),'array1']=np.zeros((10,1))

print(df)

Answer

One workaround is to select the cell with .loc, as you did, and build the new values with .map():

mask = (df['name1'] == 'aaaa') & (df['name2'] == 'bbbb')
df.loc[mask, 'array1'] = df.loc[mask, 'array1'].map(lambda x: np.zeros((10, 1)))

This works because .map() transforms the selection element by element, so each selected cell receives the whole array as a single object and pandas does not try to broadcast the array across the Series.


print(df)

  name1 name2                                                                  array1 array2
0  aaaa  bbbb  [[0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0]]    NaN
1  cccc  dddd                                                                     NaN    NaN


df.applymap(type)        # check the type stored in each cell (applymap is deprecated since pandas 2.1; use df.map(type) there)

           name1          name2                   array1           array2
0  <class 'str'>  <class 'str'>  <class 'numpy.ndarray'>  <class 'float'>
1  <class 'str'>  <class 'str'>          <class 'float'>  <class 'float'>
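
The same .map() workaround can be wrapped in a small helper if it is needed in several places. This is only a sketch: fill_cells_with_arrays and its make_array parameter are hypothetical names, not part of pandas, and it assumes the target column has dtype=object. Calling make_array once per row gives each cell its own array rather than a shared one.

```python
import numpy as np
import pandas as pd


def fill_cells_with_arrays(df, mask, column, make_array):
    """Store one array per selected cell of `column` (hypothetical helper).

    `make_array` is called once per selected row, so cells do not share an array.
    """
    selected = df.loc[mask, column]
    # Series.map yields an object Series with one ndarray per selected row;
    # assigning that Series aligns on the index, so the array itself is not broadcast.
    df.loc[mask, column] = selected.map(lambda _: make_array())
    return df


# Assumption: dtype=object so the columns can hold ndarrays.
df = pd.DataFrame(
    {
        "name1": ["aaaa", "cccc"],
        "name2": ["bbbb", "dddd"],
        "array1": [np.nan, np.nan],
        "array2": [np.nan, np.nan],
    },
    dtype=object,
)

mask = (df["name1"] == "aaaa") & (df["name2"] == "bbbb")
fill_cells_with_arrays(df, mask, "array1", lambda: np.zeros((10, 1)))

print(type(df.loc[0, "array1"]))
print(df.loc[0, "array1"].shape)
```

Rows that do not match the mask are left untouched, so the second row still holds NaN in array1.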
User contributions licensed under: CC BY-SA