
I'm trying to use multiple nested np.where calls to create a column of a data frame in Python, but I'm getting an error

Consider a data frame with 97 rows and 44 columns that includes columns named "Bostwick" and "mu_yield". I'm trying to create a new column called "Target": if the "Bostwick" value lies between 5.00 and 6.75, or else if the "mu_yield" value lies between 89.00 and 90.00, the "Target" value should be 0; otherwise it should be 1.

I tried the following:

bos['Target'] = np.where(((bos['mu_yield'] < 5.000) | (bos['mu_yield'] > 6.750)), 0, 
                         np.where((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)))

There were no errors, but every value in the "Target" column is 0.

So I tried this instead:

bos['Target'] = np.where((bos['Bostwick'] < 5.000) | (bos['Bostwick'] > 6.750)) or ((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)

This raises the following ValueError:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_35620/4282921525.py in <module>
----> 1 bos['Target'] = np.where((bos['Bostwick'] < 5.000) | (bos['Bostwick'] > 6.750)) or ((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)

~\anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   3610         else:
   3611             # set column
-> 3612             self._set_item(key, value)
   3613 
   3614     def _setitem_slice(self, key: slice, value):

~\anaconda3\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
   3782         ensure homogeneity.
   3783         """
-> 3784         value = self._sanitize_column(value)
   3785 
   3786         if (

~\anaconda3\lib\site-packages\pandas\core\frame.py in _sanitize_column(self, value)
   4507 
   4508         if is_list_like(value):
-> 4509             com.require_length_match(value, self.index)
   4510         return sanitize_array(value, self.index, copy=True, allow_2d=True)
   4511 

~\anaconda3\lib\site-packages\pandas\core\common.py in require_length_match(data, index)
    529     """
    530     if len(data) != len(index):
--> 531         raise ValueError(
    532             "Length of values "
    533             f"({len(data)}) "

ValueError: Length of values (1) does not match length of index (94)

Could someone help me with this?


Answer

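Put all the conditions inside a single np.where call. Use & (element-wise AND) to test that a value lies inside each range, and | (element-wise OR) to combine the two range checks; Python's or keyword does not work element-wise on a Series: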

bos['Target'] = np.where(((bos['Bostwick'] > 5.000) & (bos['Bostwick'] < 6.750)) |
                         ((bos['mu_yield'] > 89.00) & (bos['mu_yield'] < 90.00)), 0, 1)
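
To see why the two attempts in the question behave as they do, here is a minimal sketch on made-up data (the four rows are hypothetical, not the asker's data set). With a single argument, np.where returns a tuple of index arrays, so the or expression evaluates to that one-element tuple, which would explain the "Length of values (1)" error. The first attempt tests mu_yield against the Bostwick limits, so with yields near 89 its first condition is true for every row and every Target becomes 0.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only.
bos = pd.DataFrame({
    "Bostwick": [4.0, 5.5, 7.0, 6.0],
    "mu_yield": [89.5, 88.0, 91.0, 89.5],
})

# Range checks built with element-wise AND.
in_bostwick = (bos["Bostwick"] > 5.000) & (bos["Bostwick"] < 6.750)
in_yield = (bos["mu_yield"] > 89.00) & (bos["mu_yield"] < 90.00)

# 0 where either value falls inside its range, 1 otherwise.
bos["Target"] = np.where(in_bostwick | in_yield, 0, 1)

# With one argument, np.where returns a tuple of index arrays.
# A non-empty tuple is truthy, so `np.where(cond) or (...)` yields
# that tuple and the right-hand side is never used.
single_arg = np.where(~in_bostwick)
or_result = single_arg or (in_yield, 0, 1)

# The first attempt compared mu_yield with the Bostwick limits,
# which is true for every row when yields are far above 6.75.
first_cond = (bos["mu_yield"] < 5.000) | (bos["mu_yield"] > 6.750)

print(bos)
print(len(or_result), bool(first_cond.all()))
```

Naming the two range checks first (in_bostwick and in_yield are names chosen here for readability) also makes it harder to mix up the columns.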

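Alternative with Series.between (pandas 1.3 and later take inclusive='neither' to exclude both end points; older versions used inclusive=False):

bos['Target1'] = np.where(bos['Bostwick'].between(5.000, 6.750, inclusive='neither') |
                          bos['mu_yield'].between(89.000, 90.00, inclusive='neither'), 0, 1)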
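
As a quick check that the two forms agree, the sketch below builds both columns on hypothetical values that sit on and around the range boundaries. It uses the string form inclusive="neither", which pandas 1.3 and later expect in place of the boolean; on older versions the boolean form would be needed instead.

```python
import numpy as np
import pandas as pd

# Hypothetical values chosen to hit the range boundaries.
bos = pd.DataFrame({
    "Bostwick": [5.0, 5.5, 6.75, 7.0],
    "mu_yield": [88.0, 95.0, 89.0, 89.5],
})

# Explicit strict comparisons combined with & and |.
bos["Target"] = np.where(
    ((bos["Bostwick"] > 5.000) & (bos["Bostwick"] < 6.750))
    | ((bos["mu_yield"] > 89.00) & (bos["mu_yield"] < 90.00)),
    0, 1,
)

# Same logic with Series.between; "neither" excludes both end points.
bos["Target1"] = np.where(
    bos["Bostwick"].between(5.000, 6.750, inclusive="neither")
    | bos["mu_yield"].between(89.00, 90.00, inclusive="neither"),
    0, 1,
)

print(bos)
```

If the end points should count as inside the range, inclusive="both" (the default) would be the choice, with >= and <= in the explicit version.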
User contributions licensed under: CC BY-SA