How to change the datetime format in a column that also contains strings

I have a data frame with a column (Start Shift) that holds mixed data types (datetime and string). I need to change the datetime values to a time-only format and leave the strings unchanged. I used the code below, but I can't find a way to apply the change to the data frame: when I inspect the data frame afterwards, nothing has changed.

The code I used:

import pandas as pd

df = pd.read_excel(r"C:\Users\Mahmoud.Bader\Desktop\FP Attendance V1.6 Apr 22.xlsx", "Attendance").fillna("")

for i in df['Start Shift']:
    try:
        if i.isalpha():
            i
    except:
            i.strftime('%H:%M %p')

The data frame is:

Department     Start Shift
Accommodation  Annual
Accommodation  OFF Day
Accommodation  2022-04-01 12:00:00
Accommodation  2022-04-01 09:00:00
Accommodation  2022-04-01 10:00:00
Complaints     OFF Day
Complaints     2022-04-29 07:00:00
Complaints     2022-04-29 08:00:00
Complaints     2022-04-30 07:00:00
Complaints     2022-04-30 08:00:00

The data frame I expected to get:

Department     Start Shift
Accommodation  Annual
Accommodation  OFF Day
Accommodation  12:00 PM
Accommodation  09:00 AM
Accommodation  10:00 AM
Complaints     OFF Day
Complaints     07:00 AM
Complaints     08:00 AM
Complaints     07:00 AM
Complaints     08:00 AM

Answer

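You can try converting the column to datetime first. With errors='coerce', entries that do not match the format (such as "Annual" or "OFF Day") become NaT, so only the real dates get formatted and written back over the original values: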

# cast column 'Start Shift' to datetime
df['Start Shift Formatted'] = pd.to_datetime(df['Start Shift'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
# convert into requested time format
df['Start Shift Formatted'] = df['Start Shift Formatted'].dt.strftime('%I:%M %p')
# overwrite old time format
df['Start Shift'] = df['Start Shift'].mask(df['Start Shift Formatted'].notna(), df['Start Shift Formatted'])
# drop column 'Start Shift Formatted' as it is no longer needed
df.drop('Start Shift Formatted', axis=1, inplace=True)
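
To check the steps above without the Excel file, here is a minimal self-contained sketch. The sample rows are copied from the question; the intermediate names `parsed` and `formatted` are my own and stand in for the temporary column. The AM/PM text from `%p` depends on the locale, so the result may differ on a non-English system.

```python
import pandas as pd

# Sample data mirroring the question: strings mixed with Timestamp objects.
df = pd.DataFrame({
    "Department": ["Accommodation", "Accommodation", "Accommodation", "Complaints"],
    "Start Shift": [
        "Annual",
        "OFF Day",
        pd.Timestamp("2022-04-01 12:00:00"),
        pd.Timestamp("2022-04-29 07:00:00"),
    ],
})

# Non-date entries become NaT instead of raising an error.
parsed = pd.to_datetime(df["Start Shift"], format="%Y-%m-%d %H:%M:%S", errors="coerce")

# Format the valid dates as 12-hour times; NaT rows turn into missing values.
formatted = parsed.dt.strftime("%I:%M %p")

# Overwrite only the rows where a formatted time exists.
df["Start Shift"] = df["Start Shift"].mask(formatted.notna(), formatted)

print(df)
```

Using plain variables instead of a helper column avoids the final drop step, but the logic is the same as in the snippet above.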

Update: This can be applied to multiple columns using the following function:

def format_time(x):
    x_temp = pd.to_datetime(x, format='%Y-%m-%d %H:%M:%S', errors='coerce')
    x_temp = x_temp.dt.strftime('%I:%M %p')
    x = x.mask(x_temp.notna(), x_temp)
    
    return x

df.loc[:, ['Start Shift','End Shift','Login Time','Logout time']] = df[
    ['Start Shift','End Shift','Login Time','Logout time']
].apply(format_time)
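
As a side note on why the loop in the question changed nothing: it calls strftime on each value but never stores the result back in the data frame. An element-wise alternative that stays close to that loop could look like the sketch below. The helper name `to_time_label` and the three sample values are hypothetical and only serve as illustration; this is one possible variant, not a replacement for the approach above.

```python
import datetime as dt

import pandas as pd


def to_time_label(value):
    """Format datetime-like values as 'HH:MM AM/PM'; return anything else unchanged."""
    # Leave missing values (NaN/NaT) alone, since NaT cannot be formatted.
    if pd.isna(value):
        return value
    # pd.Timestamp is a subclass of datetime.datetime, so this covers both.
    if isinstance(value, (dt.datetime, dt.time)):
        return value.strftime("%I:%M %p")
    return value


# Hypothetical sample column with mixed strings and timestamps.
df = pd.DataFrame(
    {"Start Shift": ["Annual", pd.Timestamp("2022-04-01 09:00:00"), "OFF Day"]}
)

# The key difference from the original loop: assign the result back to the column.
df["Start Shift"] = df["Start Shift"].map(to_time_label)

print(df)
```

This is usually slower than the vectorized version on large frames, but it is easy to read and does not depend on the exact string format of the dates.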
User contributions licensed under: CC BY-SA