Update column based on grouped date values

Edited/reposted with correct sample output.

I have a dataframe that looks like the following:

JavaScript

This dataframe is split into groups by ID.

I would like to create an updated combined column. A row should be updated only if df['bool'] == True AND there is another 'finished' row in the same group with a LATER (not the same) year.

Sample output:

JavaScript

We are not updating the first group because it has no 'finished' value in a LATER year, and we are updating the second group because it does have a 'finished' value in a later year. Thank you!

Answer

This uses temporary columns and avoids apply, which is generally slow:

JavaScript

This solution assumes that the data is sorted by ID and Year in ascending order.
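
As an illustration of the temporary-column idea, here is a minimal sketch that is not necessarily the exact code from the answer. It compares each row's Year against the latest 'finished' year in its ID group, so this particular variant does not depend on the sort order. The sample data, the column names `status`, `combined` and `combined_updated`, and the replacement value `"finished"` are all assumptions made up for the example; only `ID`, `Year` and `bool` come from the question.

```python
import pandas as pd

# Hypothetical sample data: `status`, `combined` and all values are assumed.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "Year": [2019, 2020, 2020, 2018, 2019, 2021],
    "status": ["finished", "open", "finished", "open", "open", "finished"],
    "bool": [False, True, False, True, False, False],
    "combined": ["a", "b", "c", "d", "e", "f"],
})

# Temporary column: the Year of each 'finished' row, NaN everywhere else.
df["_fin_year"] = df["Year"].where(df["status"].eq("finished"))

# Latest 'finished' year per ID, broadcast back onto every row of the group.
# Groups without any 'finished' row get NaN, which compares as False below.
df["_last_fin"] = df.groupby("ID")["_fin_year"].transform("max")

# A row qualifies when bool is True and a 'finished' row exists
# in a strictly later year within the same ID.
mask = df["bool"] & (df["_last_fin"] > df["Year"])

# Assumed update rule: overwrite the value with "finished" where the mask holds.
df["combined_updated"] = df["combined"].where(~mask, "finished")

# Drop the temporary helper columns.
df = df.drop(columns=["_fin_year", "_last_fin"])
```

With this data, the ID 1 row with bool == True is left alone because its only other 'finished' rows are in the same or an earlier year, while the ID 2 row is updated because a 'finished' row exists in 2021.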

User contributions licensed under: CC BY-SA