Multiple dates in a pandas column




I am trying to make the dates in a Pandas DataFrame all of the same format. Currently I have the DataFrame storing the dates in two formats.

"6/08/2017 2:15:00 AM" and 2016-01-01T00:05:00

The column name which these dates are stored under is INTERVAL_END. As you can see, one of the dates is a string, and the other is a formatted date.

I have tried to parse each date according to its current format using this:

df['format'] = 1
df.loc[df.INTERVAL_END.str.contains('/', na=False), 'format'] = 2

df.loc[df.format == 1, 'time'] = pd.to_datetime(df.loc[df.format == 1, 'INTERVAL_END'], format = '%Y-%m-%d %H:%M:%S').dt.strftime('%Y-%m-%d')
df.loc[df.format == 2, 'time'] = pd.to_datetime(df.loc[df.format == 2, 'INTERVAL_END'], format = '%d/%m/%Y %H:%M:%S %p').dt.strftime('%Y-%m-%d')

Any help with this would be great, thanks!

Answer

You should be able to just strip the literal quotes with .str.strip('"') before converting the column with pd.to_datetime().

Toy example:

df = pd.DataFrame({'INTERVAL_END': ['"6/08/2017 2:15:00 AM"', '2016-01-01T00:05:00']})

#                INTERVAL_END
# 0    "6/08/2017 2:15:00 AM"
# 1       2016-01-01T00:05:00

Then convert the stripped column to_datetime():

df.INTERVAL_END = pd.to_datetime(df.INTERVAL_END.str.strip('"'))

#                INTERVAL_END
# 0 2017-06-08 02:15:00+00:00
# 1 2016-01-01 00:05:00+00:00
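
Note that without an explicit format, to_datetime() has to guess the field order, and in the output above 6/08/2017 was read as June 8. If the slash dates are actually day-first (as the '%d/%m/%Y' in the question suggests, which is an assumption here), a minimal sketch that parses each format explicitly could look like the following. It uses %I rather than %H for the 12-hour clock, since %p only takes effect together with %I, and a literal T in the ISO format string. The variable names raw, slash and iso are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "INTERVAL_END": ['"6/08/2017 2:15:00 AM"', "2016-01-01T00:05:00"],
})

# Strip any literal double quotes around the values.
raw = df["INTERVAL_END"].str.strip('"')

# Try each format on the whole column; values that do not match become NaT.
# Assumption: slash dates are day/month/year with a 12-hour clock.
slash = pd.to_datetime(raw, format="%d/%m/%Y %I:%M:%S %p", errors="coerce")
iso = pd.to_datetime(raw, format="%Y-%m-%dT%H:%M:%S", errors="coerce")

# Take the slash-format result where it parsed, otherwise the ISO result.
df["time"] = slash.fillna(iso)
```

Any value matching neither format is left as NaT, so df["time"].isna() can be used to find rows that still need attention.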


Source: stackoverflow