Consider a data frame with 97 rows and 44 columns, including columns named “Bostwick” and “mu_yield”. I’m trying to create a new column called “Target”: if the “Bostwick” value lies between 5.00 and 6.75, or the “mu_yield” value lies between 89.00 and 90.00, “Target” should be 0; otherwise it should be 1.
I tried the following:
bos['Target'] = np.where(((bos['mu_yield'] < 5.000) | (bos['mu_yield'] > 6.750)), 0, np.where((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)))
There were no errors, but every value in the “Target” column is 0.
So I tried the following instead:
bos['Target'] = np.where((bos['Bostwick'] < 5.000) | (bos['Bostwick'] > 6.750)) or ((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)
This raises the following ValueError:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~AppDataLocalTemp/ipykernel_35620/4282921525.py in <module>
----> 1 bos['Target'] = np.where((bos['Bostwick'] < 5.000) | (bos['Bostwick'] > 6.750)) or ((bos['mu_yield'] < 89.00) | (bos['mu_yield'] > 90.00), 0, 1)

~anaconda3libsite-packagespandascoreframe.py in __setitem__(self, key, value)
   3610         else:
   3611             # set column
-> 3612             self._set_item(key, value)
   3613
   3614     def _setitem_slice(self, key: slice, value):

~anaconda3libsite-packagespandascoreframe.py in _set_item(self, key, value)
   3782         ensure homogeneity.
   3783         """
-> 3784         value = self._sanitize_column(value)
   3785
   3786         if (

~anaconda3libsite-packagespandascoreframe.py in _sanitize_column(self, value)
   4507
   4508         if is_list_like(value):
-> 4509             com.require_length_match(value, self.index)
   4510             return sanitize_array(value, self.index, copy=True, allow_2d=True)
   4511

~anaconda3libsite-packagespandascorecommon.py in require_length_match(data, index)
    529     """
    530     if len(data) != len(index):
--> 531         raise ValueError(
    532             "Length of values "
    533             f"({len(data)}) "

ValueError: Length of values (1) does not match length of index (94)
Could someone help me with this?
Answer
Use | for bitwise OR to combine the two range conditions, and within each range use & for bitwise AND where the original used |:
bos['Target'] = np.where(((bos['Bostwick'] > 5.000) & (bos['Bostwick'] < 6.750)) | ((bos['mu_yield'] > 89.00) & (bos['mu_yield'] < 90.00)), 0, 1)
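As a rough illustration of why the `or` version failed and what the corrected expression returns, here is a minimal sketch on a four-row frame. The sample values and the helper names `in_bostwick`, `in_yield`, `one_arg` and `leaked` are made up for the example; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; only the column names come from the question.
bos = pd.DataFrame({
    "Bostwick": [5.50, 4.00, 7.00, 6.00],
    "mu_yield": [88.00, 89.50, 91.00, 89.20],
})

# Each range check needs & (both bounds must hold at once).
in_bostwick = (bos["Bostwick"] > 5.000) & (bos["Bostwick"] < 6.750)
in_yield = (bos["mu_yield"] > 89.00) & (bos["mu_yield"] < 90.00)

# In the failing attempt, np.where was closed after its first argument.
# Called with only a condition, np.where returns a tuple of index arrays.
one_arg = np.where((bos["Bostwick"] < 5.000) | (bos["Bostwick"] > 6.750))

# A non-empty tuple is truthy, so `one_arg or (...)` evaluates to one_arg,
# and pandas is handed a value of length 1.
leaked = one_arg or (~in_yield, 0, 1)
print(type(leaked).__name__, len(leaked))

# Corrected: a single np.where call with condition, value-if-true, value-if-false.
bos["Target"] = np.where(in_bostwick | in_yield, 0, 1)
print(bos)
```

Here the third row is the only one outside both ranges, so it is the only row that gets 1.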
Alternative with Series.between:
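bos['Target1'] = np.where(bos['Bostwick'].between(5.000, 6.750, inclusive='neither') | bos['mu_yield'].between(89.000, 90.00, inclusive='neither'), 0, 1)
In pandas 1.3 and later, inclusive takes a string ('both', 'neither', 'left' or 'right'); the boolean form inclusive=False is deprecated there and was removed in pandas 2.0.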
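A quick check that the `between` form matches the explicit comparisons, including at the boundary values. This is only a sketch: the sample values are made up, the names `explicit` and `with_between` are hypothetical, and it assumes pandas 1.3 or later, where `inclusive='neither'` requests strict bounds.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values chosen to hit the boundaries 5.00, 6.75 and 90.00.
bos = pd.DataFrame({
    "Bostwick": [5.00, 5.50, 6.75, 7.00],
    "mu_yield": [88.00, 88.00, 90.00, 89.50],
})

# Explicit strict comparisons, as in the first form of the answer.
explicit = np.where(
    ((bos["Bostwick"] > 5.000) & (bos["Bostwick"] < 6.750))
    | ((bos["mu_yield"] > 89.00) & (bos["mu_yield"] < 90.00)),
    0, 1,
)

# Same logic with Series.between; 'neither' excludes both endpoints.
with_between = np.where(
    bos["Bostwick"].between(5.000, 6.750, inclusive="neither")
    | bos["mu_yield"].between(89.00, 90.00, inclusive="neither"),
    0, 1,
)

print(explicit.tolist())
print(with_between.tolist())
```

If the endpoints should count as inside the range, `inclusive='both'` (the default) would be the setting to use instead.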