
python – using an index in one series to find values in a separate dataframe with matching index

I have a for loop that takes a subsample of my original dataset, makes predictions with a previously fitted model, and then needs to match each prediction to the target value from the original dataframe so I can calculate a different value.

20 lines from original subsample:

JavaScript

code:

JavaScript

So my “target” needs to take the index of each top_200 entry and look up the corresponding entry in the ['product'] column of the original subsample.

I am striking out on how to take the index labels from the series top_200 and find the corresponding product values in the original dataset.

I feel like I am missing something obvious, but searches like “matching an index from a series to a value in a dataframe” turn up results about a single dataframe, not a series matched to a dataframe.

If I were looking up data I'd use .query(), but I don't know how to do that from one index to another.

Any input would be greatly appreciated!

Edit to help clarify (hopefully):

My series top_200 holds predictions made from the subsample dataframe, so the index of the series should be the same as the index of the subsample dataframe. For each row in the series, I want to use its index label to look up the value in the product column of the subsample dataframe.

Here is an example output for that series:

JavaScript

The rows are 303, 203, 21, 296 and 391. I now want to get the value in the product column of the subsample dataframe for those same rows.

Answer

When you apply a condition to a Series, the result is a boolean Series with the same index.

JavaScript
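As a minimal sketch of that first step, assuming a made-up Series named `ex` with labels 'a' through 'j' (the values are hypothetical and only chosen so that rows 'c', 'e', 'g' and 'i' satisfy the condition):

```python
import pandas as pd

# Hypothetical data: ten rows labelled 'a' through 'j'.
ex = pd.Series([1, 4, 9, 3, 8, 2, 7, 5, 6, 0], index=list("abcdefghij"))

# Comparing a Series to a scalar is element-wise; the result is a boolean
# Series that carries the same index as the original.
mask = ex > 5
print(mask)
```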

You can then use that boolean Series to filter the original.

JavaScript
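A short sketch of the filtering step, again with a hypothetical Series `ex` (the name and values are assumptions, not the original data):

```python
import pandas as pd

# Hypothetical data: ten rows labelled 'a' through 'j'.
ex = pd.Series([1, 4, 9, 3, 8, 2, 7, 5, 6, 0], index=list("abcdefghij"))

# Boolean mask: True where the condition holds.
mask = ex > 5

# Indexing with the mask keeps only the rows where it is True,
# and those rows keep their original index labels.
filtered = ex[mask]
print(filtered)
```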

You can obtain the index labels of the True values and use them to select from a like-indexed Series.

JavaScript

JavaScript
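One way this could look, assuming two hypothetical Series `ex` and `wye` that share the same index (both are invented here for illustration):

```python
import pandas as pd

labels = list("abcdefghij")

# Two hypothetical Series that share the same index.
ex = pd.Series([1, 4, 9, 3, 8, 2, 7, 5, 6, 0], index=labels)
wye = pd.Series(range(10, 20), index=labels)

mask = ex > 5

# mask[mask] keeps only the True entries; .index gives their labels.
idx = mask[mask].index

# .loc selects by label, so the same rows are pulled from the other Series.
selected = wye.loc[idx]
print(selected)
```

Because `.loc` looks rows up by label rather than by position, this only works as intended when both objects really do share index labels.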

“the rows are 303,203,21,296 and 391. i now want to get the value in the column product from the subsample dataframe for the rows 303,203,21,296 and 391”

In my example, the rows that meet the condition have the indices Index(['c', 'e', 'g', 'i'], dtype='object'), which can be used to select the same rows of the 'wye' column.

JavaScript
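A sketch of that column lookup on a DataFrame; the column names 'ex' and 'wye' follow the example above, but the values are hypothetical:

```python
import pandas as pd

# Hypothetical DataFrame with an 'ex' column to test and a 'wye' column to read.
df = pd.DataFrame(
    {
        "ex": [1, 4, 9, 3, 8, 2, 7, 5, 6, 0],
        "wye": list("KLMNOPQRST"),
    },
    index=list("abcdefghij"),
)

# Index labels of the rows that meet the condition.
idx = df.loc[df["ex"] > 5].index

# .loc[row_labels, column_label] returns the 'wye' values for those rows.
result = df.loc[idx, "wye"]
print(result)
```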

The indices were obtained by filtering the boolean Series for all the True values and accessing the index attribute of the result.

JavaScript
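Carried over to the names used in the question, the same idea might look like the sketch below. The data is invented, `head(5)` stands in for the top 200, and it assumes the predictions Series was built with the subsample's index (unique labels, not reset along the way):

```python
import pandas as pd

# Hypothetical subsample with a 'product' column.
subsample = pd.DataFrame(
    {"product": [10.5, 42.0, 7.3, 88.1, 63.4, 55.9]},
    index=[21, 150, 203, 296, 303, 391],
)

# Hypothetical predictions, indexed exactly like the subsample.
predictions = pd.Series(
    [80.0, 12.0, 91.0, 75.0, 95.0, 70.0], index=subsample.index
)

# Largest predictions first; head(5) is a stand-in for the top 200.
top_200 = predictions.sort_values(ascending=False).head(5)

# Use the Series' index labels to pull the matching 'product' values.
target = subsample.loc[top_200.index, "product"]
print(target)
```

The result keeps the order of `top_200`, so predictions and targets stay aligned row by row.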

Pandas User Guide: Selection
Indexing and selecting data

User contributions licensed under: CC BY-SA