Pandas best way to iterate over rows quickly

Question

I need to compare each value of a list to each value of a df column, and if there is a match take the value of another column. I have a couple of loops working with iterrows but the code is taking a long time to run. I was wondering if there is a more efficient way to do this?

Accepted Answer

Pandas is built to apply operations to whole columns of data at once. iterrows is comparatively slow and is best kept as a fallback for when no such group operation is available. In your case, isin will select the rows you want, and then you can grab the other column.

This can be written as

    import pandas as pd

    df = pd.DataFrame({"other_view": [1, 2, 3, 4, 5],
                       "other_column": ["a", "b", "c", "d", "e"]})
    joined_views = [1, 4, 100, 900, 1000]

    # Keep the rows whose other_view appears in joined_views,
    # then take the matching other_column values.
    listy = df[df.other_view.isin(joined_views)].other_column
    print(listy)

or, if you prefer to name the columns as strings

    df[df["other_view"].isin(joined_views)]["other_column"]

In words: select the df rows where other_view is in joined_views, then take the other_column values.
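To check that the isin selection really replaces the loops, here is a minimal sketch that runs both on the same sample data. The nested iterrows loop is only a guess at what the original code looked like (the question does not show it), and the names loop_matches, mask and isin_matches are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "other_view": [1, 2, 3, 4, 5],
    "other_column": ["a", "b", "c", "d", "e"],
})
joined_views = [1, 4, 100, 900, 1000]

# Assumed shape of the slow approach: a Python-level loop over rows
# with an inner loop over the list.
loop_matches = []
for _, row in df.iterrows():
    for view in joined_views:
        if row["other_view"] == view:
            loop_matches.append(row["other_column"])

# Vectorized version: build a boolean mask once, then select the column.
mask = df["other_view"].isin(joined_views)
isin_matches = df.loc[mask, "other_column"].tolist()

print(loop_matches)   # ['a', 'd']
print(isin_matches)   # ['a', 'd']
```

One difference to keep in mind: if joined_views contained the same value twice, the loop above would append the match twice, while isin returns each matching row once.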

Answer