Converting a column with various string formats in pandas

I have been working on a dataframe where one of the columns (flight_time) contains flight durations. The strings come in 3 different formats, for example:

  • 07 h 05 m
  • 13h 55m
  • 2h 23m

I would like to change them all to HH:MM format and finally change the data type from object to time.

Can somebody tell me how to do this?

Answer

pandas has no time dtype. You can have a datetime64 (pd.DatetimeIndex) or a timedelta64 (pd.TimedeltaIndex). In your case the values are durations, so I think it’s better to have timedelta64 values, which you can get with the pd.to_timedelta function.
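
A minimal sketch of that conversion, assuming every value matches "<hours> h <minutes> m" with optional spaces. The regex, the total-minutes step, and the flight_time_hhmm display column are illustrative assumptions, not part of the original answer; pd.to_timedelta may also accept some of these strings directly, but normalising first avoids relying on that.

```python
import pandas as pd

# Sample data in the three formats from the question
df = pd.DataFrame({"flight_time": ["07 h 05 m", "13h 55m", "2h 23m"]})

# Assumed pattern: digits, optional spaces, "h", digits, optional spaces, "m"
parts = (
    df["flight_time"]
    .str.extract(r"(?P<h>\d+)\s*h\s*(?P<m>\d+)\s*m")
    .astype(int)
)

# Build the durations from the total number of minutes
df["flight_time"] = pd.to_timedelta(parts["h"] * 60 + parts["m"], unit="min")

# Optional zero-padded HH:MM strings for display (hypothetical column name)
df["flight_time_hhmm"] = (
    parts["h"].astype(str).str.zfill(2) + ":" + parts["m"].astype(str).str.zfill(2)
)

print(df)
```

Rows that do not match the pattern would produce missing values and make the astype(int) step fail, so real data may need a check before converting.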

If you want individual datetime.time values instead, you can build them from the timedelta column and store them in a second column (flight_time2).
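
One possible way to get datetime.time values, sketched under the assumption that flight_time already holds timedelta64 values below 24 hours. Adding the durations to an arbitrary midnight anchor is an illustrative choice, not necessarily the approach the answer used.

```python
import pandas as pd

# Durations of 7h05, 13h55 and 2h23, expressed in minutes
df = pd.DataFrame({"flight_time": pd.to_timedelta([425, 835, 143], unit="min")})

# Add each duration to an arbitrary midnight anchor, then keep only the time part
anchor = pd.Timestamp("1970-01-01")
df["flight_time2"] = (anchor + df["flight_time"]).dt.time

print(df["flight_time2"].tolist())
```

A duration of 24 hours or more would wrap around to the next day here, which is one reason a timedelta fits flight durations better.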

In this case, flight_time2 still has object dtype.
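
This can be checked with a small stand-in Series (the values below are just the sample durations, built by hand for illustration):

```python
import datetime
import pandas as pd

# Stand-in for the flight_time2 column
flight_time2 = pd.Series(
    [datetime.time(7, 5), datetime.time(13, 55), datetime.time(2, 23)],
    name="flight_time2",
)

# pandas stores plain Python objects in an object-dtype Series
print(flight_time2.dtype)
```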

But each value is an instance of datetime.time.
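
A short check along those lines, again using a hand-built stand-in for the column:

```python
import datetime
import pandas as pd

# Stand-in for the flight_time2 column
flight_time2 = pd.Series(
    [datetime.time(7, 5), datetime.time(13, 55), datetime.time(2, 23)],
    name="flight_time2",
)

# Inspect a single element, then verify every element
first = flight_time2.iloc[0]
all_times = flight_time2.map(lambda v: isinstance(v, datetime.time)).all()

print(type(first), all_times)
```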

In the first case you can use vectorized methods, while in the second this is not possible. Furthermore, you lose the dt accessor.
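
To illustrate the difference, here is a small sketch comparing the two representations. The variable names are hypothetical and the values are the sample durations from the question.

```python
import datetime
import pandas as pd

# timedelta64 column: 7h05, 13h55 and 2h23 in minutes
durations = pd.Series(pd.to_timedelta([425, 835, 143], unit="min"))

# object column holding datetime.time values
times = pd.Series(
    [datetime.time(7, 5), datetime.time(13, 55), datetime.time(2, 23)]
)

# Vectorized operations work on the timedelta column
total = durations.sum()
hours = durations.dt.total_seconds() / 3600
long_flights = durations > pd.Timedelta(hours=3)

# The object column exposes no dt accessor
has_dt = hasattr(times, "dt")

print(total, hours.round(2).tolist(), long_flights.tolist(), has_dt)
```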

User contributions licensed under: CC BY-SA