Suppose you have
df=pl.DataFrame( { "date":["2022-01-01", "2022-01-02"], "hroff":[5,2], "minoff":[1,2] }).with_column(pl.col('date').str.strptime(pl.Date,"%Y-%m-%d"))
and you want to make a new column that adds the hour and min offsets to the date column. The only thing I saw was the dt.offset_by
method. I made an extra column
df=df.with_column((pl.col('hroff')+"h"+pl.col('minoff')+"m").alias('offset'))
and then tried
df.with_column(pl.col('date') .cast(pl.Datetime).dt.with_time_zone('UTC') .dt.offset_by(pl.col('offset')).alias('newdate'))
but that doesn’t work because dt.offset_by
only takes a fixed string, not another column.
What’s the best way to do that?
Advertisement
Answer
Use pl.duration
:
import polars as pl df = pl.DataFrame({ "date": pl.Series(["2022-01-01", "2022-01-02"]).str.strptime(pl.Datetime(time_zone="UTC"), "%Y-%m-%d"), "hroff": [5, 2], "minoff": [1, 2] }) print(df.select( pl.col("date") + pl.duration(hours=pl.col("hroff"), minutes=pl.col("minoff")) ))
shape: (2, 1) ┌─────────────────────┐ │ date │ │ --- │ │ datetime[μs] │ ╞═════════════════════╡ │ 2022-01-01 05:01:00 │ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 2022-01-02 02:02:00 │ └─────────────────────┘