I'm quite new to this.
I have a DataFrame that looks like this:
currentMilestone            m2          SLA_M6          latedeliverydate            SLA_M3          earlypickupdate
            m2      2020-02-21      2020-02-18              2020-03-14          2020-02-09              2020-02-08
            m2      2020-02-21      2020-02-18              2020-02-14          2020-02-09              2020-02-08
            m2      2020-02-21      2020-02-18              2020-02-14          2020-02-09              2020-02-08
            m2      2020-02-21      2020-02-18              2020-02-14          2020-02-09              2020-02-08
            m1             NaT      2020-03-24              2020-02-14          2020-03-13              2020-03-18
I have written a function that looks like this:
def flag(data):
    while data.currentMilestone== 'm1'is True:
        if data.SLA_M6  > data.latedeliverydate:
            return 'R'
        elif (data.SLA_M3 != data.earlypickupdate) & (data.latedeliverydate <= data.SLA_M6):
            return 'A'
        elif (data.SLA_M3 == data.earlypickupdate) & (data.latedeliverydate >= data.earlypickupdate):
            return 'G'
        else:
            return None
The expected output is:
currentMilestone            m2          SLA_M6          latedeliverydate            SLA_M3          earlypickupdate         flag
            m2      2020-02-21      2020-02-18              2020-03-14          2020-02-09              2020-02-08          None    
            m2      2020-02-21      2020-02-18              2020-02-14          2020-02-09              2020-02-08          None
            m2      2020-02-21      2020-02-18              2020-02-14          2020-02-09              2020-02-08          None
            m2      2020-02-21      2020-02-18              2020-02-14          2020-02-09              2020-02-08          None
            m1             NaT      2020-03-24              2020-02-14          2020-03-13              2020-03-18            R
When I run my function I don't get the expected result: the flag is not working properly and every row is set to None.
What is wrong here?
Answer
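A likely reason every row comes back as None, assuming the function was applied row by row (for example with data.apply(flag, axis=1)): `==` and `is` are both comparison operators, so Python chains them. The condition `data.currentMilestone == 'm1' is True` is read as `(data.currentMilestone == 'm1') and ('m1' is True)`, and the second half is always False. The while body therefore never runs and the function falls through to an implicit None. The sketch below demonstrates the chaining and shows a row-wise variant with a plain `if` guard; `flag_row`, `target` and `last_row` are illustrative names, not part of the original code.

```python
import pandas as pd

milestone, target = "m1", "m1"

# `a == b is True` is a chained comparison: (a == b) and (b is True).
chained = milestone == target is True      # True and ("m1" is True)
explicit = (milestone == target) is True   # what was probably intended
print(chained, explicit)  # False True


def flag_row(row):
    """Row-wise variant of the question's function (hypothetical name)."""
    # Rows that are not at milestone m1 get no flag.
    if row["currentMilestone"] != "m1":
        return None
    if row["SLA_M6"] > row["latedeliverydate"]:
        return "R"
    if row["SLA_M3"] != row["earlypickupdate"] and row["latedeliverydate"] <= row["SLA_M6"]:
        return "A"
    if row["SLA_M3"] == row["earlypickupdate"] and row["latedeliverydate"] >= row["earlypickupdate"]:
        return "G"
    return None


# The last row of the sample data from the question.
last_row = pd.Series({
    "currentMilestone": "m1",
    "SLA_M6": pd.Timestamp("2020-03-24"),
    "latedeliverydate": pd.Timestamp("2020-02-14"),
    "SLA_M3": pd.Timestamp("2020-03-13"),
    "earlypickupdate": pd.Timestamp("2020-03-18"),
})
print(flag_row(last_row))  # R
```

This could be used as data['flag'] = data.apply(flag_row, axis=1), although it still calls a Python function once per row.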
Use numpy.select for this. It evaluates each condition on whole columns at once, whereas apply calls a Python function for every row and is much slower:
import numpy as np
cond1 = data['currentMilestone'] == 'm1'
condlist = [
    (data['SLA_M6'] > data['latedeliverydate']) & cond1,
    (data['SLA_M3'] != data['earlypickupdate']) & (data['latedeliverydate'] <= data['SLA_M6']) & cond1,
    (data['SLA_M3'] == data['earlypickupdate']) & (data['latedeliverydate'] >= data['earlypickupdate']) & cond1
]
choicelist = ['R', 'A', 'G']
data['flag'] = np.select(condlist, choicelist, default=None)
[out]
   currentMilestone          m2      SLA_M6  latedeliverydate      SLA_M3  earlypickupdate  flag
0                m2  2020-02-21  2020-02-18        2020-03-14  2020-02-09       2020-02-08  None
1                m2  2020-02-21  2020-02-18        2020-02-14  2020-02-09       2020-02-08  None
2                m2  2020-02-21  2020-02-18        2020-02-14  2020-02-09       2020-02-08  None
3                m2  2020-02-21  2020-02-18        2020-02-14  2020-02-09       2020-02-08  None
4                m1         NaT  2020-03-24        2020-02-14  2020-03-13       2020-03-18     R
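To try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same numpy.select logic. How the frame is constructed (string dates converted with pd.to_datetime, the `date_cols` helper list) is an assumption for illustration; the original post does not show how the data was loaded.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the construction itself is an assumption.
data = pd.DataFrame({
    "currentMilestone": ["m2", "m2", "m2", "m2", "m1"],
    "m2": ["2020-02-21"] * 4 + [None],
    "SLA_M6": ["2020-02-18"] * 4 + ["2020-03-24"],
    "latedeliverydate": ["2020-03-14"] + ["2020-02-14"] * 4,
    "SLA_M3": ["2020-02-09"] * 4 + ["2020-03-13"],
    "earlypickupdate": ["2020-02-08"] * 4 + ["2020-03-18"],
})

# Convert the date columns so comparisons are done on datetimes (None becomes NaT).
date_cols = ["m2", "SLA_M6", "latedeliverydate", "SLA_M3", "earlypickupdate"]
for col in date_cols:
    data[col] = pd.to_datetime(data[col])

# Only rows at milestone m1 are eligible for a flag.
cond1 = data["currentMilestone"] == "m1"
condlist = [
    (data["SLA_M6"] > data["latedeliverydate"]) & cond1,
    (data["SLA_M3"] != data["earlypickupdate"])
    & (data["latedeliverydate"] <= data["SLA_M6"]) & cond1,
    (data["SLA_M3"] == data["earlypickupdate"])
    & (data["latedeliverydate"] >= data["earlypickupdate"]) & cond1,
]
choicelist = ["R", "A", "G"]

# The first matching condition wins; rows matching none get the default.
data["flag"] = np.select(condlist, choicelist, default=None)
print(data["flag"].tolist())  # [None, None, None, None, 'R']
```

Because numpy.select picks the first condition that is True for each row, the order of condlist plays the same role as the if/elif order in the original function.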
