pandas group by and fill in the missing time interval sequence

I have a data frame as shown below:

JavaScript

What I would like to do is

a) Fill in the missing time values by generating a sequence of numbers (e.g. 1, 2, 3, 4) and copy the values of all other columns from the previous row.

I was trying something like below

JavaScript

But this doesn’t help me get the expected output

I expect my output to look like the one shown below (a sample for one subject is shown):

[Image: expected output for one subject]

Answer

Let's set the time column as the index of the dataframe, then group the dataframe by person_id. For each group, reindex it so that its index covers the full range of values spanned by the time column, and finally concat all the groups to get the desired dataframe:

JavaScript
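
A minimal sketch of the set-index, group, reindex and concat steps described above. The sample data, the `value` column and the helper `fill_group` are hypothetical; only `person_id` and `time` come from the question. It assumes `time` holds integers and that each person's range runs from their smallest to their largest observed time. The forward fill reflects the question's requirement to copy values from the previous row.

```python
import pandas as pd

# Hypothetical sample data: only person_id and time are named in the question.
df = pd.DataFrame({
    "person_id": [11, 11, 11, 12, 12],
    "time": [1, 4, 5, 2, 4],
    "value": [10, 20, 30, 40, 50],
})

def fill_group(g):
    """Reindex one person's rows to a gap-free time range and forward-fill."""
    full_range = pd.RangeIndex(int(g.index.min()), int(g.index.max()) + 1, name="time")
    filled = g.reindex(full_range).ffill()
    # Reindexing introduces NaN, which turns integer columns into floats;
    # restore the original dtypes once the gaps are filled.
    return filled.astype(g.dtypes.to_dict())

# Group on person_id with time as the index, fill each group, then stitch together.
groups = [fill_group(g) for _, g in df.set_index("time").groupby("person_id")]
out = pd.concat(groups).reset_index()
print(out)
```

If the range should instead start at a fixed value (for example 1) for every person, only the bounds passed to `pd.RangeIndex` would need to change.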

Alternatively, you can first create tuple pairs of each person_id and every value in the corresponding range of the time column, then reindex the dataframe with those pairs:

JavaScript
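
The tuple-pair variant might look like the sketch below, under the same assumptions as before: hypothetical sample data with an invented `value` column, integer `time`, and a range from each person's minimum to maximum observed time. The names `bounds`, `pairs` and `full_index` are illustrative only.

```python
import pandas as pd

# Hypothetical sample data: only person_id and time are named in the question.
df = pd.DataFrame({
    "person_id": [11, 11, 11, 12, 12],
    "time": [1, 4, 5, 2, 4],
    "value": [10, 20, 30, 40, 50],
})

# Smallest and largest observed time per person.
bounds = df.groupby("person_id")["time"].agg(["min", "max"])

# One (person_id, time) tuple for every step in each person's range.
pairs = [
    (pid, t)
    for pid, lo, hi in bounds.itertuples()
    for t in range(lo, hi + 1)
]
full_index = pd.MultiIndex.from_tuples(pairs, names=["person_id", "time"])

out = (
    df.set_index(["person_id", "time"])
      .reindex(full_index)          # missing pairs become NaN rows
      .groupby(level="person_id")   # fill within each person only
      .ffill()
      .reset_index()
)
print(out)
```

Here the filled columns stay as floats because reindexing introduces NaN; cast them back with `astype` if integer dtypes matter.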

Result (for person_id 11):

JavaScript
User contributions licensed under: CC BY-SA