Is it possible to pass an extra argument to lambda function in pandas read_csv

I use the read_csv() function from pandas with a custom (lambda) date_parser function quite often, and I am wondering whether it is possible to pass an extra argument to this lambda function.

This is a minimal example where I set the format_string:

import pandas as pd

def date_parser_1(value, format_string='%Y.%m.%d %H:%M:%S'):
    return pd.to_datetime(value, format=format_string)

df = pd.read_csv(file,
    parse_dates=[1],
    date_parser=date_parser_1  # args('%Y-%m-%d %H:%M:%S')
)
print(df)

I do know that pandas has an infer_datetime_format flag, but this question is only about a self-defined date_parser.

Answer

Welcome to the magic of closures and partial functions.

def outer(outer_arg):
    def inner(inner_arg):
        return outer_arg * inner_arg    
    return inner

fn = outer(5)
print(fn(3))

Basically, you define your function inside another function and return the inner function as the result. Here I call outer(5), so fn now holds a function that I can call as many times as I like. Each call runs inner with outer_arg (5) remembered in the closure, so fn(3) returns 15.
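
To make the closure idea a little more concrete, here is a minimal sketch (the names double and triple are made up for illustration). Each call to outer returns an independent function that remembers its own outer_arg:

```python
def outer(outer_arg):
    def inner(inner_arg):
        # outer_arg is captured from the enclosing call to outer()
        return outer_arg * inner_arg
    return inner

# Hypothetical names, used only for this illustration
double = outer(2)
triple = outer(3)

print(double(10))  # 20
print(triple(10))  # 30
```

The two functions do not interfere with each other, which is why the same trick can bind a different format string for each CSV file you read.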

So in your case:

def dp1_wrapper(format_string):
    def date_parser_1(value):
        return pd.to_datetime(value, format=format_string)
    return date_parser_1


df = pd.read_csv(file,
    parse_dates=[1],
    date_parser=dp1_wrapper('%Y.%m.%d %H:%M:%S')
)

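Once you know how this works, there is a shortcut utility, functools.partial, which binds the keyword argument of your original two-argument date_parser_1 for you: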

from functools import partial 

df = pd.read_csv(file,
    parse_dates=[1],
    date_parser=partial(date_parser_1, format_string='%Y.%m.%d %H:%M:%S')
)
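
If you want to convince yourself that the closure and partial do the same job, here is a small sketch you can run without a CSV file. It is only an illustration under some assumptions: datetime.strptime stands in for pd.to_datetime so that nothing beyond the standard library is needed, and the names parse_date and make_parser are made up for this example.

```python
from datetime import datetime
from functools import partial

def parse_date(value, format_string='%Y.%m.%d %H:%M:%S'):
    # Stand-in for the question's date_parser_1 (assumption: strptime
    # is used here instead of pd.to_datetime to avoid needing pandas)
    return datetime.strptime(value, format_string)

def make_parser(format_string):
    # Closure version: format_string is remembered by the inner function
    def parser(value):
        return parse_date(value, format_string=format_string)
    return parser

closure_parser = make_parser('%Y-%m-%d %H:%M:%S')

# partial version: the keyword name must match the parameter name
# of the wrapped function (format_string, not format)
partial_parser = partial(parse_date, format_string='%Y-%m-%d %H:%M:%S')

sample = '2021-03-04 05:06:07'
print(closure_parser(sample) == partial_parser(sample))  # True
```

Both callables take a single value, which is the shape read_csv expects from a date_parser.
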
User contributions licensed under: CC BY-SA