I have a dataset as below:
import pandas as pd
from workdays import workday, networkdays

path = r'C:UsersuserDocumentsGitHublearningabc1test_labtatlab.xlsx'
df = pd.read_excel(path)

    start date    End date        HT     D
0   2022-02-08         NaT  indirect    BL
1   2022-01-20         NaT    direct  None
2   2022-01-23         NaT    direct  None
3   2022-01-23         NaT    direct  None
4   2022-02-07         NaT    direct  None
5   2022-02-07         NaT    direct  None
6   2022-02-09         NaT    direct  None
7   2022-02-09         NaT    direct  None
8   2022-02-10         NaT    direct  None
9   2022-02-11  2022-02-13    direct  None
10  2022-02-16         NaT    direct  None
11  2022-02-16         NaT    direct  None
12  2022-02-16         NaT    direct  None
13  2022-01-15  2022-01-21    direct  None
14  2022-01-17  2022-01-17    direct  None
I wrote the code below to calculate networkdays for the rows that have a date value in the 'End date' column:

# If column 'D' is 'BL', skip that value; apply only to the remaining cells in 'D'
# where 'End date' and 'start date' are not null.
df.loc[df['D']=='BL', 'D'] = df.apply(
    lambda x: networkdays(x['start date'], x['End date'])
    if not pd.isnull(x['End date']) else x['End date'],
    axis=1)
However, I got the error below, and I don't know what causes it. Could you please help take a look?
My expected output is as below:
    start date    End date        HT     D
0   2022-02-08         NaT  indirect    BL
1   2022-01-20         NaT    direct  None
2   2022-01-23         NaT    direct  None
3   2022-01-23         NaT    direct  None
4   2022-02-07         NaT    direct  None
5   2022-02-07         NaT    direct  None
6   2022-02-09         NaT    direct  None
7   2022-02-09         NaT    direct  None
8   2022-02-10         NaT    direct  None
9   2022-02-11  2022-02-13    direct     3
10  2022-02-16         NaT    direct  None
11  2022-02-16         NaT    direct  None
12  2022-02-16         NaT    direct  None
13  2022-01-15  2022-01-21    direct     5
14  2022-01-17  2022-01-17    direct     1
Answer
I believe the problem comes from how you call the apply function.
By default, apply works on columns [1], but you can change that using the axis parameter.
Something like this might give you the expected result:
df['days'] = df.apply(
    lambda x: networkdays(x['start date'], x['End date'])
    if not pd.isnull(x['End date']) else "can not call",
    axis=1  # use axis=1 to work with rows instead of columns
)
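If the goal is to fill 'D' only where it is not 'BL' and 'End date' is present, a boolean mask may be another option. Note that df['D']=='BL' in the question selects the 'BL' rows rather than skipping them. The snippet below is only a sketch under some assumptions: it builds a small inline sample instead of reading the Excel file, the name mask is made up for illustration, and it swaps in numpy's busday_count for networkdays, assuming a Monday-to-Friday week with no holidays. One day is added to the end date because busday_count excludes the end date, while networkdays counts both ends.

```python
import numpy as np
import pandas as pd

# Small inline sample standing in for the Excel file (assumption, not the real data source).
df = pd.DataFrame({
    "start date": pd.to_datetime(
        ["2022-02-08", "2022-01-20", "2022-02-11", "2022-01-15", "2022-01-17"]),
    "End date": pd.to_datetime(
        [None, None, "2022-02-13", "2022-01-21", "2022-01-17"]),
    "HT": ["indirect", "direct", "direct", "direct", "direct"],
    "D": pd.Series(["BL", None, None, None, None], dtype="object"),
})

# Rows to fill: 'D' is not 'BL' and 'End date' is present.
mask = df["D"].ne("BL") & df["End date"].notna()

# busday_count works on day-precision dates and excludes the end date,
# so shift the end by one day to count both ends.
start = df.loc[mask, "start date"].to_numpy().astype("datetime64[D]")
end = df.loc[mask, "End date"].to_numpy().astype("datetime64[D]") + np.timedelta64(1, "D")

df.loc[mask, "D"] = np.busday_count(start, end)
print(df)
```

With a Monday-to-Friday week, the 2022-02-11 to 2022-02-13 row comes out as 1 (Friday only), not the 3 shown in the expected output, so the expected value there may rely on a different weekend definition.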