I am trying to read some sql data using pandas library and one of the column “customer_date” has values like “0001-11-29 13:00:00 BC”. My query fails with error
ValueError: year 0 is out of range
Please suggest a way to parse such date/timestamps.
Here is my code.
import pandas as pd from datetime import datetime, timedelta session = ah.get_boto_session() db = get_connection() **### Custom method** pd.set_option('display.max_rows', None) query = "SELECT customer_date FROM customer" df = pd.read_sql(query, db, parse_dates=["customer_date"]) print(df.head()) db.dispose()
Error:
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-2-d4e334c3f39c> in <module>() 22 pd.set_option('display.max_rows', None) 23 query = "SELECT customer_date FROM customer" ---> 24 df = pd.read_sql(query, db, parse_dates=["customer_date"]) 25 print(df.head()) ValueError: year 0 is out of range
Advertisement
Answer
This is what worked:
import pandas as pd from datetime import datetime, timedelta session = ah.get_boto_session() db = get_connection() **### Custom method** pd.set_option('display.max_rows', None) query = "SELECT case when customer_date >= '1970-01-01' then customer_date else '1970-01-01' end AS customer_date FROM customer" df = pd.read_sql(query, db) print(df.head()) db.dispose()