Passing datetime64[ns] from pandas’ data frame as an argument to a function

Tags: , , , ,

I’m trying to create an additional column in a data frame to show the number of network days (excluding custom holidays) between two dates. I’m using a function to which I’m trying to pass dates from df‘s columns as arguments, but I can’t make it work.

Below is my code (I’m using two made-up holidays in the given set):

from networkdays import networkdays
import datetime as dt
import numpy as np
import pandas as pd

public_holidays_list = [, 1, 6),, 1, 7)]
public_holidays = set(public_holidays_list)

def working_days(start, end, holidays):
    days = networkdays.Networkdays(start, end, holidays)
    working_days = len(days.networkdays())
    return working_days

The formula itself works fine:

print(working_days(, 1, 4),, 1, 8), public_holidays))


Minimal data frame with the dtypes I’m working on:

d = {'Name': ['A', 'B'], 'Start_Date': [, 1, 4),, 1, 11)], 'End_Date': [, 1, 8),, 1, 15)]}
df = pd.DataFrame(data = d)
df['Start_Date'] = pd.to_datetime(df['Start_Date'])
df['End_Date'] = pd.to_datetime(df['End_Date'])

When I’m trying the below way…

df['Working_Days'] = working_days(df['Start_Date'], df['End_Date'])

…I’m getting an error:

AttributeError: ‘Series’ object has no attribute ‘days’

I’ve also tried to use numpy:

df['Working_Days'] = np.vectorize(working_days)(df['Start_Date'], df['End_Date'])

I got an error as well:

AttributeError: ‘numpy.timedelta64’ object has no attribute ‘days’

Could you point me in the right direction?

EDIT: The correct answer to my problem is @Kris’s last comment.

IMPORTANT! Although the lambda doesn’t return any errors, it takes public_holidays into consideration correctly in 2 scenarios:

A) The elements of public_holidays are of class and df‘s dates are of class object (I got this by removing pd.to_datetime() lines from the code).

B) The public_holidays is of type list (created from an Excel table via public_holidays = df_ph['Date'].tolist()), its elements are of class Timestamp and pd.to_datetime() lines are not removed from the code above (making dates in df datetime64[ns]).


As per my comment, use .apply:

df['Working_Days'] = df.apply(lambda x: working_days(x.Start_Date, x.End_Date, public_holidays), axis=1)

Source: stackoverflow