I have a table with a structure that more or less (I’ve simplified it for the question) looks like the following:
| id (P.K.) | creation_ts | some_field |
|---|---|---|
| 1 | 2021-08-19 | foo |
| 2 | 2021-08-18 | foo |
| 3 | 2021-08-17 | foo |
| 4 | NULL | bar |
| 5 | 2021-01-01 | bar |
| 6 | 2021-01-02 | bar |
And I’m trying build a query to show a calculated column that per each row with the same value for "some_field" (erm… grouped by the value of "some_field" if you may?) would add an extra column is_newest showing which row is the newest.
| id (P.K.) | creation_ts | some_field | is_newest |
|---|---|---|---|
| 1 | 2021-08-19 | foo | TRUE |
| 2 | 2021-08-18 | foo | FALSE |
| 3 | 2021-08-17 | foo | FALSE |
| 4 | NULL | bar | FALSE |
| 5 | 2021-01-01 | bar | FALSE |
| 6 | 2021-01-02 | bar | TRUE |
The intention of this is to create a SqlAlchemy Hybrid property so we could quickly query things like “get me the newest record WHERE some_field = 'foo'“
I imagine this must be some kind of CASE statement (at least that’s what I gather from this other S.O. answer, which looks pretty promising) but the best thing I can come with is something like:
@is_newest.expression
def is_newest(cls):
subq = sa.sql.exists().where(
sa.and_(
cls.id != cls.id,
cls.some_field == cls.some_field,
# Dirty trick: If there aren't newer records than this one,
# then this must be the newest
cls.creation_ts > cls.creation_ts,
)
)
return sa.case([(subq, False)], else_=True).label("is_newest")
but no: that seems pretty wrong to me (and it isn’t working, since some unit tests are failing), because it produces a query like…
SELECT table.id, table.creation_ts, table.some_field
FROM table WHERE
CASE WHEN (EXISTS (
SELECT * FROM table WHERE
table.id != table.id
AND table.some_field = table.some_field
AND table.creation_ts > table.creation_ts)
) THEN False
ELSE True END IS true
AND table.some_field = 'foo'
… which doesn’t look right. Though to be honest, I’m not very sure what would “look right” (I’m quite a newbie with Postgres)
Any hint would be appreciated. Thank you in advance
Advertisement
Answer
You may use this window function query as native:
select the_table.*, coalesce(creation_ts = max(creation_ts) over (partition by some_field), false) is_newest from the_table;
or (better) make a view out of it and then use the view instead of the table. Result:
id|creation_ts|some_field|is_newest| --+-----------+----------+---------+ 1|2021-08-19 |foo |true | 2|2021-08-18 |foo |false | 3|2021-08-17 |foo |false | 4| |bar |false | 5|2021-01-01 |bar |false | 6|2021-01-02 |bar |true |
Edit
Added coalesce to is_newest expression.