Slice a DataFrame into sub-DataFrames when a specific string is found in a column

Question

Assume I have the dataframe df and I want to slice it into multiple dataframes and store each one in a list (list_of_dfs). Each sub-dataframe should contain only the "Result" rows. A new sub-dataframe starts whenever column "Point" has the value "P1" and column "X_Y" has the value "X". I tried this by first finding the indices of each "P1"

Accepted Answer

    sub    = df.query("Step == 'Result'")
    pivots = sub[["Point", "X_Y"]].eq(["P1", "X"]).all(axis=1)
    out    = [fr for _, fr in sub.groupby(pivots.cumsum())]

- get the subset of the frame where Step is equal to "Result"
- check which rows hold the pair "P1", "X" in the Point and X_Y columns; that gives a True/False series
- the cumulative sum of that series determines the groups: the "pivoting" (turning) points are True and each one increments the running total, while the remaining rows leave it unchanged since False == 0 in a numeric context
- iterating over a GroupBy object yields "group_label, sub_frame" pairs, out of which we pull the sub_frames

to get

    >>> out
    [      Step Point X_Y  Value A  Value B
     10  Result    P1   X    70.00    70.00
     11  Result    P2   X    68.00    68.00
     12  Result    P2   Y    66.75    66.75
     13  Result    P3   X    68.08    68.08
     14  Result    P3   Y    66.72    66.72,
           Step Point X_Y  Value A  Value B
     25  Result    P1   X    70.00    70.00
     26  Result    P2   X    68.00    68.00
     27  Result    P2   Y    66.75    66.75
     28  Result    P3   X    68.08    68.08
     29  Result    P3   Y    66.72    66.72]

where the intermediaries were

    >>> sub
          Step Point X_Y  Value A  Value B
    10  Result    P1   X    70.00    70.00
    11  Result    P2   X    68.00    68.00
    12  Result    P2   Y    66.75    66.75
    13  Result    P3   X    68.08    68.08
    14  Result    P3   Y    66.72    66.72
    25  Result    P1   X    70.00    70.00
    26  Result    P2   X    68.00    68.00
    27  Result    P2   Y    66.75    66.75
    28  Result    P3   X    68.08    68.08
    29  Result    P3   Y    66.72    66.72

    >>> pivots
    10     True
    11    False
    12    False
    13    False
    14    False
    25     True
    26    False
    27    False
    28    False
    29    False
    dtype: bool

    # groups
    >>> pivots.cumsum()
    10    1
    11    1
    12    1
    13    1
    14    1
    25    2
    26    2
    27    2
    28    2
    29    2
    dtype: int32
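For anyone who wants to try the approach end to end, here is a minimal self-contained sketch. The question's full frame is not shown, so the sample data (including the "Setup" rows and the single "Value" column) is made up for illustration, and the helper name split_on_pivot and its parameters are hypothetical. It uses boolean indexing instead of query, which should select the same rows.

```python
import pandas as pd

# Hypothetical input: the "Setup" rows and all values are invented for this sketch.
df = pd.DataFrame(
    {
        "Step":  ["Setup", "Result", "Result", "Result", "Setup", "Result", "Result"],
        "Point": ["P1",    "P1",     "P2",     "P2",     "P1",    "P1",     "P2"],
        "X_Y":   ["X",     "X",      "X",      "Y",      "X",     "X",      "X"],
        "Value": [0.0,     70.0,     68.0,     66.75,    0.0,     70.0,     68.0],
    }
)


def split_on_pivot(frame, step="Result", point="P1", x_y="X"):
    """Keep the `step` rows and split them at every (point, x_y) row."""
    # Keep only the rows of interest (same effect as the query in the answer).
    sub = frame[frame["Step"] == step]
    # True where a new block starts, False everywhere else.
    pivots = sub[["Point", "X_Y"]].eq([point, x_y]).all(axis=1)
    # cumsum() goes up by one at each pivot row, so it labels the blocks 1, 2, ...
    return [part for _, part in sub.groupby(pivots.cumsum())]


list_of_dfs = split_on_pivot(df)
for part in list_of_dfs:
    print(part, end="\n\n")
```

One thing to keep in mind: if some "Result" rows appear before the first "P1"/"X" row, they end up together in a leading group labelled 0, so you may want to drop that first element in that case.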

Answer