Skip to content
Advertisement

Add timedelta to a date column above weeks

How would I add 1 year to a column?

I’ve tried using map and apply but I failed miserably.

I also wonder why pl.date() accepts integers while it advertises that it only accepts str or pli.Expr.

A small hack workaround is:

col = pl.col('date').dt
df = df.with_column(pl.when(pl.col(column).is_not_null())
                    .then(pl.date(col.year() + 1, col.month(), col.day()))
                    .otherwise(pl.date(col.year() + 1,col.month(), col.day()))
                    .alias("date"))

but this won’t work for months or days. I can’t just add a number or I’ll get a:

> thread 'thread '<unnamed>' panicked at 'invalid or out-of-range date<unnamed>',
         ' panicked at '/github/home/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rsinvalid or out-of-range date:', 173:/github/home/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rs51
:note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Most likely because day and month cycle while year goes to infinity.

I could also do this:

df = df.with_column(
        pl.when(col.month() == 1)
        .then(pl.date(col.year(), 2, col.day()))
        .when(col.month() == 2)
        .then(pl.date(col.year(), 3, col.day()))
        .when(col.month() == 3)
        .then(pl.date(col.year(), 4, col.day()))
        .when(col.month() == 4)
        .then(pl.date(col.year(), 5, col.day()))
        .when(col.month() == 5)
        .then(pl.date(col.year(), 6, col.day()))
        .when(col.month() == 6)
        .then(pl.date(col.year(), 7, col.day()))
        .when(col.month() == 7)
        .then(pl.date(col.year(), 8, col.day()))
        .when(col.month() == 8)
        .then(pl.date(col.year(), 9, col.day()))
        .when(col.month() == 9)
        .then(pl.date(col.year(), 10, col.day()))
        .when(col.month() == 10)
        .then(pl.date(col.year(), 11, col.day()))
        .when(col.month() == 11)
        .then(pl.date(col.year(), 12, col.day()))
        .otherwise(pl.date(col.year() + 1, 1, 1))
        .alias("valid_from")
    )

Advertisement

Answer

Polars allows to do addition and subtraction with python’s timedelta objects. However above week units things get a bit more complicated as we have to take different days of the month and leap years into account.

For this polars has offset_by under the dt namespace.

(pl.DataFrame({
    "dates": pl.date_range(datetime(2000, 1, 1), datetime(2026, 1, 1), "1y")
}).with_columns([
    pl.col("dates").dt.offset_by("1y").alias("dates_and_1_yr")
]))
shape: (27, 2)
┌─────────────────────┬─────────────────────┐
│ dates               ┆ dates_and_1_yr      │
│ ---                 ┆ ---                 │
│ datetime[ns]        ┆ datetime[ns]        │
╞═════════════════════╪═════════════════════╡
│ 2000-01-01 00:00:00 ┆ 2001-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2001-01-01 00:00:00 ┆ 2002-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2002-01-01 00:00:00 ┆ 2003-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2003-01-01 00:00:00 ┆ 2004-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ...                 ┆ ...                 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2023-01-01 00:00:00 ┆ 2024-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2024-01-01 00:00:00 ┆ 2025-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2025-01-01 00:00:00 ┆ 2026-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2026-01-01 00:00:00 ┆ 2027-01-01 00:00:00 │
└─────────────────────┴─────────────────────┘

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement