I tried to convert a column of dates to datetime
using pd.to_datetime(df, format='%Y-%m-%d_%H-%M-%S')
but I received the error ValueError: unconverted data remains: .1
I ran:
JavaScript
x
2
1
data.loc[pd.to_datetime(data.date, format='%Y-%m-%d_%H-%M-%S', errors='coerce').isnull(), 'date']
2
to identify the problem. 119/1037808 dates in the date
column have an extra “.1” at the end of them. Other than the “.1”, the dates are fine. How can I remove the “.1” from the end of those dates only and then convert the column values to datetime?
Here is an example dataframe that recreates the issue:
JavaScript
1
5
1
import pandas as pd
2
3
data = pd.DataFrame({"date" : ["2022-01-15_08-11-00.1","2022-01-15_08-11-30","2022-01-15_08-12-00.1", "2022-01-15_08-12-30"],
4
"value" : [1,2,3,4]})
5
I have tried:
JavaScript
1
2
1
data.date = data.date.replace(".1", "")
2
and
JavaScript
1
2
1
data = data.replace(".1", "")
2
but these did not remove the “.1”. The final result should look like this:
JavaScript
1
3
1
data = pd.DataFrame({"date" : ["2022-01-15_08-11-00","2022-01-15_08-11-30","2022-01-15_08-12-00", "2022-01-15_08-12-30"],
2
"value" : [1,2,3,4]})
3
Advertisement
Answer
You can use pandas.Series.replace
to get rid of the extra dot/number :
JavaScript
1
4
1
data["date"]= pd.to_datetime(data["date"].replace(r".d+", "",
2
regex=True),
3
format="%Y-%m-%d_%H-%M-%S")
4
# Output :
JavaScript
1
12
12
1
print(data)
2
print(data.dtypes)
3
4
date value
5
0 2022-01-15 08:11:00 1
6
1 2022-01-15 08:11:30 2
7
2 2022-01-15 08:12:00 3
8
3 2022-01-15 08:12:30 4
9
date datetime64[ns]
10
value int64
11
dtype: object
12
If you don’t want a datetime format, use just data["date"].replace(r".d+", "", regex=True)