I have two pandas Index instances that come from different functions, each obtained through a fairly complicated mask. I would now like to combine them, i.e., define a 'combined index' that holds all labels contained in either of the two. My friend pd.concat() cannot be applied to two Index instances. What's the best way to combine them? Here is roughly what I would like to do, but it fails:
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['foo', 'bar'])
setA = df[df['foo'] == 1].index
setB = df[df['bar'] == 4].index
pd.concat([setA, setB])
What I would like to get is the equivalent of the expression below, but without combining the masks into a single expression. I want to keep retrieving the two Index instances setA and setB separately, as I already have them, and combine their output, not their definitions.
df[(df['foo'] == 1) | (df['bar'] == 4)].index
Int64Index([0, 1], dtype='int64')
Is the only way really to get both, drop duplicates, and re-obtain the index? That looks super cumbersome…
pd.concat([df.loc[setA], df.loc[setB]]).index.drop_duplicates()
Surely there must be a better way than brute-force fetching an unnecessary amount of data, dropping what is not needed, and then getting the index (which I had in the first place anyway).
Answer
I think you are looking for Index.union:
setA.union(setB)
Int64Index([0, 1], dtype='int64')
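
For completeness, here is a minimal end-to-end sketch using the DataFrame from the question. It checks that the union of the two separately obtained indexes matches the index of the combined mask, and then uses the result to select rows. The names combined, expected and subset are only illustrative. Note that newer pandas versions print the result as Index([0, 1], dtype='int64') rather than Int64Index.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['foo', 'bar'])

# The two index instances, obtained separately as in the question
setA = df[df['foo'] == 1].index  # row label 0
setB = df[df['bar'] == 4].index  # row label 1

# Union keeps every label found in either index, without duplicates
combined = setA.union(setB)
print(combined.tolist())  # [0, 1]

# Same labels as the index of the combined mask from the question
expected = df[(df['foo'] == 1) | (df['bar'] == 4)].index
print(combined.equals(expected))  # True

# The combined index can be used directly for label-based selection
subset = df.loc[combined]
```

Because union works on the labels alone, no rows of df have to be fetched just to build the combined index.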