I am trying to convert all the dates in a Pandas DataFrame to the same format. Currently the DataFrame stores the dates in two formats:
“6/08/2017 2:15:00 AM” & 2016-01-01T00:05:00
These dates are stored in a column named INTERVAL_END. As you can see, one of the dates is a string, and the other is a formatted date.
I have tried to parse each date according to its current format, using this:
df['format'] = 1
df.loc[df.INTERVAL_END.str.contains('/', na=False), 'format'] = 2
df.loc[df.format == 1, 'time'] = pd.to_datetime(df.loc[df.format == 1, 'INTERVAL_END'], format = '%Y-%m-%d %H:%M:%S').dt.strftime('%Y-%m-%d')
df.loc[df.format == 2, 'time'] = pd.to_datetime(df.loc[df.format == 2, 'INTERVAL_END'], format = '%d/%m/%Y %H:%M:%S %p').dt.strftime('%Y-%m-%d')
Any help with this would be great, thanks!
Answer
You should be able to just strip the quotes with .str.strip('"') before converting the column with pd.to_datetime().
Toy example:
import pandas as pd
df = pd.DataFrame({'INTERVAL_END': ['"6/08/2017 2:15:00 AM"', '2016-01-01T00:05:00']})
#                INTERVAL_END
# 0    "6/08/2017 2:15:00 AM"
# 1       2016-01-01T00:05:00
Then convert the stripped column to_datetime():
df.INTERVAL_END = pd.to_datetime(df.INTERVAL_END.str.strip('"'))
#                INTERVAL_END
# 0 2017-06-08 02:15:00+00:00
# 1 2016-01-01 00:05:00+00:00
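One caveat: letting pandas infer the format reads 6/08/2017 month-first (8 June, as in the output above), whereas the '%d/%m/%Y' in the question suggests the dates are day-first (6 August). As far as I know, pandas 2.0 and later may also refuse to infer mixed formats in one call unless you pass format='mixed'. If either point matters, here is a minimal sketch that parses each format explicitly and combines the results. It assumes the slash dates are day-first with a 12-hour clock (hence %I rather than %H alongside %p) and that the ISO dates use the 'T' separator; the names raw, slash_dates and iso_dates are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({'INTERVAL_END': ['"6/08/2017 2:15:00 AM"', '2016-01-01T00:05:00']})

# Remove any literal double quotes around the values
raw = df['INTERVAL_END'].str.strip('"')

# Parse with each explicit format; rows that do not match become NaT
# Assumption: slash dates are day/month/year with a 12-hour clock and AM/PM
slash_dates = pd.to_datetime(raw, format='%d/%m/%Y %I:%M:%S %p', errors='coerce')
# Assumption: the remaining dates are ISO 8601 with a 'T' separator
iso_dates = pd.to_datetime(raw, format='%Y-%m-%dT%H:%M:%S', errors='coerce')

# Take the slash-format result where it parsed, otherwise the ISO result
df['time'] = slash_dates.fillna(iso_dates)

# Same string format the question was aiming for
df['date_str'] = df['time'].dt.strftime('%Y-%m-%d')
```

Any row that matches neither format ends up as NaT in time, so df['time'].isna() is a quick way to spot values that need a third format.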