Skip to content
Advertisement

CET timezone strings to datetime

date_cet                           col1
---------------------------------------
2021-10-31 02:00:00+02:00           7.0
2021-10-31 02:00:00+02:00           7.0
2021-10-31 02:00:00+02:00           8.0
2021-10-31 02:00:00+01:00          10.0
2021-10-31 02:00:00+01:00          11.0

I have a data frame that has columns looking similar to this. This data is imported from SQL into a Pandas data frame, and when I print out the dtypes I can see that the date_cet column is object. Since I need it further on, I want to convert it to a datetime object. However, the stuff I’ve tried just doesn’t work, and I think it might have something to do with 1) the timezone difference and 2) the fact that this date is where DST changes (i.e. the +01:00 and +02:00).

I’ve tried to do stuff like this:

import datetime as dt

df["new_date"] = [dt.datetime.strptime(str(x), "%Y-%m-%d %H:%M:%S %z") for x in df["date_cet"]]

df['new_date']= pd.to_datetime(df['date_cet'])

and a hand full of other stuff.

The first gives an error of:

ValueError: time data '2021-10-31 02:00:00+02:00' does not match format '%Y-%m-%d %H:%M:%S %z'

And the last:

ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

Basically, I have no idea how to fix this. I just need this column to become a datetime[ns, Europe/Copenhagen] type of column, but everything I’ve done so far doesn’t work.

Advertisement

Answer

In the datetime string (‘2021-10-31 02:00:00+02:00’) there is no space between %S and %z

try to change to this format – "%Y-%m-%d %H:%M:%S%z"

import datetime as dt

df["new_date"] = [dt.datetime.strptime(str(x), "%Y-%m-%d %H:%M:%S%z") for x in df["date_cet"]]

df['new_date']= pd.to_datetime(df['date_cet'])

Update:

to fix the error try adding – utc=True:

import datetime as dt

df["new_date"] = [dt.datetime.strptime(str(x), "%Y-%m-%d %H:%M:%S%z") for x in df["date_cet"]]

df['new_date']= pd.to_datetime(df['date_cet'], utc=True)

you can do this by one line:

df['new_date']= pd.to_datetime(df['date_cet'], format="%Y-%m-%d %H:%M:%S%z", utc=True)
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement