Skip to content
Advertisement

How to use a polars column with offset string to add to another date column

Suppose you have

df=pl.DataFrame(
{
    "date":["2022-01-01", "2022-01-02"],
    "hroff":[5,2],
    "minoff":[1,2]
 }).with_column(pl.col('date').str.strptime(pl.Date,"%Y-%m-%d"))

and you want to make a new column that adds the hour and min offsets to the date column. The only thing I saw was the dt.offset_by method. I made an extra column

df=df.with_column((pl.col('hroff')+"h"+pl.col('minoff')+"m").alias('offset'))

and then tried

df.with_column(pl.col('date') 
               .cast(pl.Datetime).dt.with_time_zone('UTC') 
               .dt.offset_by(pl.col('offset')).alias('newdate'))

but that doesn’t work because dt.offset_by only takes a fixed string, not another column.

What’s the best way to do that?

Advertisement

Answer

Use pl.duration:

import polars as pl

df = pl.DataFrame({
    "date": pl.Series(["2022-01-01", "2022-01-02"]).str.strptime(pl.Datetime(time_zone="UTC"), "%Y-%m-%d"),
    "hroff": [5, 2],
    "minoff": [1, 2]
})

print(df.select(
    pl.col("date") + pl.duration(hours=pl.col("hroff"), minutes=pl.col("minoff"))
))
shape: (2, 1)
┌─────────────────────┐
│ date                │
│ ---                 │
│ datetime[μs]        │
╞═════════════════════╡
│ 2022-01-01 05:01:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2022-01-02 02:02:00 │
└─────────────────────┘
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement