In the DataFrame
below
df = pd.DataFrame([('Ve_Paper', 'Buy', '-','Canada',np.NaN), ('Ve_Gasoline', 'Sell', 'Done','Britain',np.NaN), ('Ve_Water', 'Sell','-','Canada,np.NaN), ('Ve_Plant', 'Buy', 'Good','China',np.NaN), ('Ve_Soda', 'Sell', 'Process','Germany',np.NaN)], columns=['Name', 'Action','Status','Country','Value'])
I am trying to update the Value
column based on the following conditions, if Action
is Sell
check if the Status
is not -
if both the conditions are true then the first two characters of the Country
needs to be updated as the Value
column else if Status
column is -
and the Action column is Sell
the Value
column needs to be updated with the Name
column without the characters Ve_
, if Action
is not Sell
leave the Value
column as np.NaN
But the output
I am expecting is
Name Action Status Country Value Ve_Paper Buy - Canada np.NaN # Because Action is not Sell Ve_Gasoline Sell Done Britain Br # The first two characters of Country Since Action is sell and Status is not "-" Ve_Water Sell - Canada Water # The Name value without 'Ve_' since Action is Sell and the Status is '-' Ve_Plant Buy Good China np.NaN Ve_Soda Sell Process Germany Ge
I have tried with np.where
and df.loc
both didn’t work. Please do help me because I am out of options now
What I have tried so far is
import numpy as np df['Value'] = np.where(df['Action']== 'Sell',df['Country'].str[:2] if df['Status'].str != '-' else df['Name'].str[3:],df['Value'])
but I am getting the output as <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>
wherever Iam trying to extract substrings
so the output looks like this
Name Action Status Country Value Ve_Paper Buy - Canada np.NaN Ve_Gasoline Sell Done Britain <pandas.core.strings.StringMethods object at 662B0> Ve_Water Sell - Canada <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0> Ve_Plant Buy Good China np.NaN Ve_Soda Sell Process Germany <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>
Advertisement
Answer
You have two conditions ,we can do it with np.select
conda = df.Action.eq('Sell') condb = df.Status.eq('-') df['value'] = np.select([conda&condb, conda&~condb], [df.Name.str.split('_').str[1],df.Country.str[:2]], default = np.nan) df Out[343]: Name Action Status Country Value value 0 Ve_Paper Buy - Canada NaN NaN 1 Ve_Gasoline Sell Done Britain NaN Br 2 Ve_Water Sell - Canada NaN Water 3 Ve_Plant Buy Good China NaN NaN 4 Ve_Soda Sell Process Germany NaN Ge