keep the same name until value = true in another pandas column

I have a dataframe with 3 columns: session_id, name, reset_flag.

I need to create a new column, new_name, that takes the name from the row where reset_flag=True and keeps that name WITHIN that session until the next row where reset_flag=True.

I'm not sure of the best way to approach this.

EDIT: I thought of a way to do this with df.iterrows(), storing the values in a list and then appending them, but it seems very bulky. Is there a more efficient ‘pandas’ way?

Sample expected output

session_id                    name         reset_flag  new_name
06c97a-bc7-6cc-29f-65978ee8d  some_name_1  TRUE        some_name_1
06c97a-bc7-6cc-29f-65978ee8d  some_name_1              some_name_1
06c97a-bc7-6cc-29f-65978ee8d  some_name_1              some_name_1
06c97a-bc7-6cc-29f-65978ee8d  some_name_2  TRUE        some_name_2
06c97a-bc7-6cc-29f-65978ee8d  some_name_2              some_name_2
06c97a-bc7-6cc-29f-65978ee8d  some_name_2              some_name_2
06c97a-bc7-6cc-29f-65978ee8d  some_name_3              some_name_2
06c97a-bc7-6cc-29f-65978ee8d  some_name_3              some_name_2
06c97a-bc7-6cc-29f-65978ee8d  some_name_4              some_name_2
06c97a-bc7-6cc-29f-65978ee8d  some_name_4              some_name_2
06c97a-bc7-6cc-29f-65978ee8d  some_name_4              some_name_2
06c97a-bc7-6cc-29f-65978ee8d  some_name_5  TRUE        some_name_5
3943d5-e1e-63e-6c4-aa1899bd9  some_name_1  TRUE        some_name_1
3943d5-e1e-63e-6c4-aa1899bd9  some_name_1              some_name_1
3943d5-e1e-63e-6c4-aa1899bd9  some_name_1              some_name_1
3943d5-e1e-63e-6c4-aa1899bd9  some_name_2              some_name_1
3943d5-e1e-63e-6c4-aa1899bd9  some_name_2              some_name_1
3943d5-e1e-63e-6c4-aa1899bd9  some_name_2              some_name_1
3943d5-e1e-63e-6c4-aa1899bd9  some_name_3  TRUE        some_name_3
3943d5-e1e-63e-6c4-aa1899bd9  some_name_3              some_name_3
3943d5-e1e-63e-6c4-aa1899bd9  some_name_4              some_name_3
3943d5-e1e-63e-6c4-aa1899bd9  some_name_4              some_name_3
3943d5-e1e-63e-6c4-aa1899bd9  some_name_4              some_name_3
3943d5-e1e-63e-6c4-aa1899bd9  some_name_5  TRUE        some_name_5
3943d5-e1e-63e-6c4-aa1899bd9  some_name_6              some_name_5

Answer

An efficient way to go about this would be to use cumsum on the “reset_flag” column: this will give you a column of numbers that increases by one every time a True is encountered, so all rows between two resets share the same number.
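To make the cumsum step concrete, here is a minimal sketch on a made-up boolean Series (the values and the name block_id are illustrative, not taken from the question):

```python
import pandas as pd

# Hypothetical toy flags, not the asker's data.
flags = pd.Series([True, False, False, True, False, True])

# True counts as 1 and False as 0, so the running total
# steps up by one at every True and stays flat otherwise.
block_id = flags.cumsum()

print(block_id.tolist())  # [1, 1, 1, 2, 2, 3]
```

Each distinct value of block_id marks one run of rows that begins at a reset.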

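You can then simply group by this column and take the first name in each group to get the desired result (I’m assuming your “reset_flag” column is boolean; note that this grouping ignores session_id, so it relies on every session starting with a reset row, as in your sample):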

df["new_name"] = df.groupby(df["reset_flag"].cumsum())["name"].transform("first")
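If the flags are not actually boolean, or a session might begin without a reset row, a session-aware variant along these lines may be safer. This is only a sketch under stated assumptions: the miniature data is invented, the flags are assumed to arrive as the text "TRUE" or missing, and leaving new_name empty before a session's first reset is my guess at the desired behaviour.

```python
import pandas as pd

# Hypothetical miniature of the sample data; session "b"
# deliberately starts without a reset row.
df = pd.DataFrame({
    "session_id": ["a", "a", "a", "a", "b", "b", "b"],
    "name": ["n1", "n1", "n2", "n3", "n4", "n5", "n5"],
    "reset_flag": ["TRUE", None, "TRUE", None, None, "TRUE", None],
})

# Assumption: flags are the text "TRUE" or missing, so build a real boolean mask.
is_reset = df["reset_flag"].eq("TRUE")

# Count resets separately inside each session, so numbering restarts per session.
block = is_reset.astype(int).groupby(df["session_id"]).cumsum()

# First name within each (session, block) pair.
first_name = df.groupby([df["session_id"], block])["name"].transform("first")

# Block 0 means no reset has been seen yet in that session: leave it missing.
df["new_name"] = first_name.where(block.gt(0))

# Result: session a -> n1, n1, n2, n2; session b -> missing, n5, n5
```

Grouping on both keys keeps a name from leaking across a session boundary, at the cost of one extra groupby compared with the one-liner above.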
User contributions licensed under: CC BY-SA