I am trying to convert all the dates in a pandas DataFrame to the same format. Currently the DataFrame stores the dates in two formats:
“6/08/2017 2:15:00 AM” & 2016-01-01T00:05:00
The column name which these dates are stored under is INTERVAL_END. As you can see, one of the dates is a string, and the other is a formatted date.
I have tried to format the dates based on what they currently are using this:
df['format'] = 1
df.loc[df.INTERVAL_END.str.contains('/', na=False), 'format'] = 2
df.loc[df.format == 1, 'time'] = pd.to_datetime(df.loc[df.format == 1, 'INTERVAL_END'], format = '%Y-%m-%d %H:%M:%S').dt.strftime('%Y-%m-%d')
df.loc[df.format == 2, 'time'] = pd.to_datetime(df.loc[df.format == 2, 'INTERVAL_END'], format = '%d/%m/%Y %H:%M:%S %p').dt.strftime('%Y-%m-%d')
Any help with this would be great, thanks!
Answer
You should be able to just .str.strip('"') the column before converting it with to_datetime().
Toy example:
df = pd.DataFrame({'INTERVAL_END': ['"6/08/2017 2:15:00 AM"', '2016-01-01T00:05:00']})

#              INTERVAL_END
# 0  "6/08/2017 2:15:00 AM"
# 1     2016-01-01T00:05:00
Then convert the stripped column with to_datetime():
df.INTERVAL_END = pd.to_datetime(df.INTERVAL_END.str.strip('"'))

#                INTERVAL_END
# 0 2017-06-08 02:15:00+00:00
# 1 2016-01-01 00:05:00+00:00
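Note that the output above reads 6/08/2017 as June 8 (month first), while the '%d/%m/%Y' format in the question suggests the dates are day first (6 August). If that is the case, one option is to parse each format explicitly. The following is only a sketch under that assumption. The helper names raw and is_slash are made up for illustration, and %I is used instead of %H because %p only takes effect together with the 12-hour %I directive:

```python
import pandas as pd

df = pd.DataFrame({'INTERVAL_END': ['"6/08/2017 2:15:00 AM"', '2016-01-01T00:05:00']})

# Remove any surrounding double quotes before parsing.
raw = df['INTERVAL_END'].str.strip('"')

# Rows containing '/' use the slash format; the rest are treated as ISO 8601.
is_slash = raw.str.contains('/', na=False)

# Parse each group with its own explicit format.
# Assumption: the slash dates are day first, as in the question's format string.
slash_dates = pd.to_datetime(raw[is_slash], format='%d/%m/%Y %I:%M:%S %p')
iso_dates = pd.to_datetime(raw[~is_slash], format='%Y-%m-%dT%H:%M:%S')

# Recombine in the original row order and overwrite the column.
df['INTERVAL_END'] = pd.concat([slash_dates, iso_dates]).sort_index()
```

If the slash dates are actually month first, swapping %d and %m in the first format string should be the only change needed. From there, df['INTERVAL_END'].dt.strftime('%Y-%m-%d') gives the date-only strings the question was building.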