I have a dataset as below:
import pandas as pd
from workdays import workday, networkdays
path = r'C:UsersuserDocumentsGitHublearningabc1test_labtatlab.xlsx'
df = pd.read_excel(path)

    start date    End date        HT     D
0   2022-02-08         NaT  indirect    BL
1   2022-01-20         NaT    direct  None
2   2022-01-23         NaT    direct  None
3   2022-01-23         NaT    direct  None
4   2022-02-07         NaT    direct  None
5   2022-02-07         NaT    direct  None
6   2022-02-09         NaT    direct  None
7   2022-02-09         NaT    direct  None
8   2022-02-10         NaT    direct  None
9   2022-02-11  2022-02-13    direct  None
10  2022-02-16         NaT    direct  None
11  2022-02-16         NaT    direct  None
12  2022-02-16         NaT    direct  None
13  2022-01-15  2022-01-21    direct  None
14  2022-01-17  2022-01-17    direct  None
I wrote the following code to calculate networkdays for the rows that have a date value in the 'End date' column:
# If column 'D' equals 'BL', skip that row; apply only to the remaining cells in 'D' where 'End date' and 'start date' are not null
df.loc[df['D']=='BL', 'D'] = df.apply(lambda x: networkdays(x['start date'],x['End date']) if not pd.isnull(x['End date']) else x['End date'],axis=1)
However, I got an error and I don't know what caused it. Could you please help take a look?
My expected output is as below:
    start date    End date        HT     D
0   2022-02-08         NaT  indirect    BL
1   2022-01-20         NaT    direct  None
2   2022-01-23         NaT    direct  None
3   2022-01-23         NaT    direct  None
4   2022-02-07         NaT    direct  None
5   2022-02-07         NaT    direct  None
6   2022-02-09         NaT    direct  None
7   2022-02-09         NaT    direct  None
8   2022-02-10         NaT    direct  None
9   2022-02-11  2022-02-13    direct     3
10  2022-02-16         NaT    direct  None
11  2022-02-16         NaT    direct  None
12  2022-02-16         NaT    direct  None
13  2022-01-15  2022-01-21    direct     5
14  2022-01-17  2022-01-17    direct     1
Answer
I believe the problem comes from how you call the apply function. By default, apply works on columns [1], but you can change that using the axis parameter.
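To make the difference concrete, here is a small sketch on a toy frame. The column names `a` and `b` are made up for illustration and are not from the question's data. It shows what the lambda receives with the default `axis=0` versus `axis=1`:

```python
import pandas as pd

# Toy frame; the column names are illustrative only
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 (the default): the function receives each column as a Series
per_column = df.apply(lambda col: col.sum())

# axis=1: the function receives each row as a Series,
# so values can be looked up by column name
per_row = df.apply(lambda row: row["a"] + row["b"], axis=1)

print(per_column.tolist())  # [6, 60]
print(per_row.tolist())     # [11, 22, 33]
```

With `axis=0`, a lookup such as `row["a"]` would fail, because the Series passed in is indexed by row labels rather than by column names.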
Something like this might give you the expected result:
df['days'] = df.apply(
    lambda x:
        networkdays(x['start date'], x['End date'])
        if not pd.isnull(x['End date'])
        else "can not call",
    axis=1  # use axis=1 to work with rows instead of columns
)
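If the goal is to leave the 'BL' rows untouched and fill 'D' only where both dates are present, one possible approach is to build a boolean mask first and assign through it. The following is only a sketch under assumptions: it uses a few rows copied from the question instead of the Excel file, and `inclusive_weekdays` is a hypothetical stand-in for `workdays.networkdays` (assumed to count Monday to Friday inclusively, ignoring holidays) so that the example runs without the third-party package.

```python
from datetime import timedelta

import pandas as pd


def inclusive_weekdays(start, end):
    """Hypothetical stand-in for networkdays: count Mon-Fri days
    from start to end, both inclusive, with no holiday handling."""
    days = (end - start).days + 1
    return sum((start + timedelta(n)).weekday() < 5 for n in range(days))


# A few rows copied from the question's data
df = pd.DataFrame({
    "start date": pd.to_datetime(
        ["2022-02-08", "2022-01-20", "2022-01-15", "2022-01-17"]),
    "End date": pd.to_datetime([None, None, "2022-01-21", "2022-01-17"]),
    "HT": ["indirect", "direct", "direct", "direct"],
    "D": pd.Series(["BL", None, None, None], dtype=object),
})

# Rows to fill: 'D' is not 'BL' and both dates are present
mask = df["D"].ne("BL") & df["start date"].notna() & df["End date"].notna()

# Compute only for the selected rows; assignment aligns on the index
df.loc[mask, "D"] = df.loc[mask].apply(
    lambda row: inclusive_weekdays(row["start date"], row["End date"]),
    axis=1,
)
print(df)
```

Selecting rows with `df.loc[mask]` on both sides keeps the 'BL' row and the rows without an end date unchanged. Swapping `inclusive_weekdays` for `networkdays` should work the same way, assuming it accepts the values stored in the date columns.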