I am trying to read some sql data using pandas library and one of the column “customer_date” has values like “0001-11-29 13:00:00 BC”. My query fails with error
ValueError: year 0 is out of range
Please suggest a way to parse such date/timestamps.
Here is my code.
import pandas as pd
from datetime import datetime, timedelta
session = ah.get_boto_session()
db = get_connection() **### Custom method**
pd.set_option('display.max_rows', None)
query = "SELECT customer_date FROM customer"
df = pd.read_sql(query, db, parse_dates=["customer_date"])
print(df.head())
db.dispose()
Error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-d4e334c3f39c> in <module>()
 22 pd.set_option('display.max_rows', None)
 23 query = "SELECT customer_date FROM customer"
---> 24 df = pd.read_sql(query, db, parse_dates=["customer_date"])
 25 print(df.head())
ValueError: year 0 is out of range
Advertisement
Answer
This is what worked:
import pandas as pd
from datetime import datetime, timedelta
session = ah.get_boto_session()
db = get_connection() **### Custom method**
pd.set_option('display.max_rows', None)
query = "SELECT case when customer_date >= '1970-01-01' then customer_date else '1970-01-01' end AS customer_date FROM customer"
df = pd.read_sql(query, db)
print(df.head())
db.dispose()
