python – using an index in one series to find values in a separate dataframe with matching index

Question

I have a for loop that takes a subsample of my original dataset, makes a prediction with a previously fit model, and then I need to match the target value from the original dataframe to the prediction in order to calculate a different value. So my "target" needs to find the index of each top_200

Accepted Answer

When you apply a condition to a Series, the result is a boolean Series.

>>> import pandas as pd
>>> s = pd.Series(range(10))
>>> s
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
dtype: int64
>>> q = s % 2 == 0
>>> q
0     True
1    False
2     True
3    False
4     True
5    False
6     True
7    False
8     True
9    False
dtype: bool

You can then use that boolean Series to filter the original.

>>> s[q]
0    0
2    2
4    4
6    6
8    8
dtype: int64

You can obtain the indices of the True values and use them to select from a like-indexed Series.

>>> q[q].index
Int64Index([0, 2, 4, 6, 8], dtype='int64')
>>> indices = q[q].index
>>> s[indices]
0    0
2    2
4    4
6    6
8    8
dtype: int64

The same approach works with a DataFrame and a non-integer index.

>>> df = pd.DataFrame({'ex':range(10),'wye':list('zyxwvutsrq')}, index=list('abcdefghij'))
>>> df
   ex wye
a   0   z
b   1   y
c   2   x
d   3   w
e   4   v
f   5   u
g   6   t
h   7   s
i   8   r
j   9   q
>>> m = df.ex.isin([2,4,6,8])
>>> m
a    False
b    False
c     True
d    False
e     True
f    False
g     True
h    False
i     True
j    False
Name: ex, dtype: bool
>>> df.loc[m,'wye']
c    x
e    v
g    t
i    r
Name: wye, dtype: object
>>> m[m].index
Index(['c', 'e', 'g', 'i'], dtype='object')
>>> r = m[m].index
>>> df.loc[r,:]
   ex wye
c   2   x
e   4   v
g   6   t
i   8   r

"the rows are 303,203,21,296 and 391. i now want to get the value in the column product from the subsample dataframe for the rows 303,203,21,296 and 391"

In my example, the rows that meet the condition have the indices Index(['c', 'e', 'g', 'i'], dtype='object'), and those indices can be used to select the same rows of the 'wye' column.

>>> df.loc[r,'wye']
c    x
e    v
g    t
i    r
Name: wye, dtype: object

The indices were obtained by filtering the boolean Series for all the True values and accessing the index attribute of the result.

>>> m[m].index
Index(['c', 'e', 'g', 'i'], dtype='object')
>>> r = m[m].index

Pandas User Guide: Selection; Indexing and selecting data
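As a rough sketch of how index-based selection might carry over to the situation in the question, the snippet below picks the highest predictions and looks up the matching rows in the original dataframe by index label. The data, the column names 'product' and 'target', and the variable names are all made up for illustration, since the question's own data and code are not shown; `nlargest` is just one way to pick the top rows and is not necessarily what the asker used.

```python
import pandas as pd

# Hypothetical stand-in for the original dataframe; the column names
# and index labels are assumptions, not the asker's actual data.
original = pd.DataFrame(
    {
        "product": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
        "target": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
    },
    index=[21, 203, 296, 303, 391, 400],
)

# Hypothetical predictions for a subsample; they share index labels
# with the original dataframe.
predictions = pd.Series([0.9, 0.1, 0.7, 0.8], index=[303, 21, 296, 391])

# Keep the index labels of the highest predictions
# (top 2 here; the question mentions a top 200).
top = predictions.nlargest(2)
top_index = top.index

# Use those labels to select the matching rows of one column
# from the original dataframe.
matched_target = original.loc[top_index, "target"]
print(matched_target)
```

Because `.loc` selects by label, this only works when the subsample keeps the original dataframe's index rather than being reset to 0, 1, 2, and so on.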


Answer