Why are Python dates such a mess, and what can I do about it?

A common source of errors in my Python codebase is dates: specifically, the different implementations of dates and datetimes, and how comparisons between them are handled.

These are the date types in my codebase, named x1 to x7 (their classes are listed in the comparison table below).

Is there a canonical date representation in Python? I suppose x7: datetime.date is probably closest…

Comparisons are also a nightmare. Here is a table of the results of xi == xj:

xi   type of xi                                 x1     x2     x3     x4     x5     x6     x7
x1   pandas._libs.tslibs.timestamps.Timestamp   True   True   ERROR  True   False  True   True
x2   datetime.datetime                          True   True   False  True   False  False  False
x3   numpy.datetime64                           True   False  True   True   False  True   True
x4   numpy.datetime64                           True   True   True   True   False  False  False
x5   pendulum.datetime.DateTime                 False  False  False  False  True   False  False
x6   pendulum.date.Date                         True   True   True   False  False  True   True
x7   datetime.date                              True   False  True   False  False  True   True

ERROR (x1 == x3): Only resolutions 's', 'ms', 'us', 'ns' are supported.

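Also note that the result is not even symmetric. For example, the x2 row has False under x6, while the x6 row has True under x2.

[Image: asymmetry of the xi == xj results]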

Ordering comparisons are even stranger. Here is xi >= xj, where red represents an ERROR:

[Image: results of xi >= xj]
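
The two standard-library types alone reproduce both failure modes. A minimal sketch, assuming a datetime.datetime and a datetime.date that name the same calendar day (the values are illustrative assumptions, not the ones behind the tables above):

```python
import datetime

as_datetime = datetime.datetime(2022, 1, 1)  # midnight, no tzinfo (illustrative value)
as_date = datetime.date(2022, 1, 1)          # same calendar day (illustrative value)

# Equality never raises, but a date and a datetime are never considered equal.
print(as_datetime == as_date)  # False
print(as_date == as_datetime)  # False

# Ordering comparisons between the two raise instead of returning a value.
try:
    as_datetime >= as_date
    ordering_failed = False
except TypeError as exc:
    ordering_failed = True
    print("ERROR:", exc)

# Narrowing both sides to the same type makes the comparison well defined.
print(as_datetime.date() == as_date)  # True
print(as_datetime.date() >= as_date)  # True
```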

As you can imagine, there is an ever-growing amount of glue code to keep this under control. Is there any advice on how to handle date and datetime types in Python?

For simplicity:

  • I never need timezone data; everything should always be UTC.
  • Sometimes dates are passed around as strings for convenience (e.g. parsed from JSON).
  • I need at most second resolution, but 99% of my work uses only dates.

Answer

All of the listed types can be converted to numpy's datetime64. If you don't need more than second resolution, you can optionally set the unit to 's'.
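
A minimal sketch of such conversions, using only numpy and the standard library. The sample values are assumptions chosen for illustration, and casting with astype is one way of setting the unit, not necessarily the only one:

```python
import datetime

import numpy as np

# Illustrative inputs that all name 2022-01-01 (values are assumptions).
from_date = np.datetime64(datetime.date(2022, 1, 1))          # unit 'D'
from_datetime = np.datetime64(datetime.datetime(2022, 1, 1))  # unit 'us'
from_string = np.datetime64("2022-01-01")                     # unit 'D'

# Casting to a common unit ('s' here) puts every value on the same footing.
values = [v.astype("datetime64[s]") for v in (from_date, from_datetime, from_string)]

print(values[0])
print(all(v == values[0] for v in values))
```

Once everything shares one dtype, == and >= compare plain integer offsets, so the result no longer depends on which operand comes first.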

Since numpy's datetime64 carries no time zone information (values are treated as UTC), make sure to remove any tzinfo set on datetime.datetime and pendulum DateTime objects, for example with replace(tzinfo=None).

You could now put all of this into a single converter function that is essentially a big switch-case. Use it with caution on big datasets, however: convenience rarely comes for free.
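
A sketch of what such a converter might look like. The name to_datetime64 and its unit parameter are hypothetical, and only standard-library, numpy and string inputs are exercised here. pandas Timestamp and pendulum DateTime subclass datetime.datetime, and pendulum Date subclasses datetime.date, so they are expected to fall into the same branches, but that is an assumption this sketch does not test:

```python
import datetime

import numpy as np


def to_datetime64(value, unit="s"):
    """Convert a supported date-like value to a timezone-naive np.datetime64.

    Hypothetical helper for illustration; not part of numpy or any library.
    """
    if isinstance(value, np.datetime64):
        result = value
    elif isinstance(value, datetime.datetime):
        # Must be checked before datetime.date: datetime is a subclass of date.
        if value.tzinfo is not None:
            # Shift aware values to UTC before the offset is discarded.
            value = value.astimezone(datetime.timezone.utc)
        # Rebuild a plain naive datetime so subclasses reduce to the stdlib type.
        naive = datetime.datetime(
            value.year, value.month, value.day,
            value.hour, value.minute, value.second, value.microsecond,
        )
        result = np.datetime64(naive)
    elif isinstance(value, datetime.date):
        result = np.datetime64(datetime.date(value.year, value.month, value.day))
    elif isinstance(value, str):
        # Assumes a naive ISO 8601 string such as "2022-01-01".
        result = np.datetime64(value)
    else:
        raise TypeError(f"unsupported date type: {type(value)!r}")
    return result.astype(f"datetime64[{unit}]")


# Illustrative inputs that all name midnight UTC on 2022-01-01.
samples = [
    datetime.date(2022, 1, 1),
    datetime.datetime(2022, 1, 1),
    datetime.datetime(2022, 1, 1, 1, tzinfo=datetime.timezone(datetime.timedelta(hours=1))),
    np.datetime64("2022-01-01", "D"),
    "2022-01-01",
]
converted = [to_datetime64(s) for s in samples]
print(all(c == converted[0] for c in converted))
```

The isinstance chain runs once per value, which is where the cost on big datasets comes from; for whole columns a vectorised conversion is likely the cheaper route.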
User contributions licensed under: CC BY-SA