Let's say I have the following pandas DataFrame:
import pandas as pd
d = [0.0, 1.0, 2.0]
e = pd.Series(d, index = ['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})
Now I have another list of index labels:
select = ['c', 'a', 'x']
Clearly, the label 'x' is not in the index of my original df. How can I select rows of df based on select, keeping only the labels that exist and without raising an error? In this case, I want only the rows for 'c' and 'a', in that order.
Any pointers would be very helpful.
Answer
You could use reindex + dropna. reindex adds a row filled with NaN/NaT for the missing label 'x', and dropna then removes it:
out = df.reindex(select).dropna()
You could also filter select before calling reindex, so the placeholder row is never created and rows that already contain NaN are left alone:
out = df.reindex([i for i in select if i in df.index])
Output:
     A    B          C
c  1.0  2.0 2013-01-02
a  1.0  0.0 2013-01-02
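One difference between the two approaches above is worth seeing in code: dropna removes any row containing a missing value, not just the rows that reindex added. The sketch below is only an illustration under assumptions. The names present, df_nan, via_dropna and via_filter are made up for this example, and the NaN placed in row 'a' is hypothetical data that is not part of the question. It also shows an equivalent way to filter the labels with Index.isin, which keeps the order of select.

```python
import numpy as np
import pandas as pd

# Same frame and label list as in the question
e = pd.Series([0.0, 1.0, 2.0], index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})
select = ['c', 'a', 'x']

# Keep only the labels that exist in df.index, in the order given by select
idx = pd.Index(select)
present = idx[idx.isin(df.index)]
out = df.loc[present]

# Hypothetical variation: row 'a' already has a missing value in column 'B'
df_nan = df.copy()
df_nan.loc['a', 'B'] = np.nan

# reindex + dropna drops row 'a' too, because dropna removes every row with a NaN
via_dropna = df_nan.reindex(select).dropna()

# Filtering the labels first keeps row 'a' with its NaN intact
via_filter = df_nan.loc[present]

print(list(out.index))
print(list(via_dropna.index))
print(list(via_filter.index))
```

If the frame can contain legitimate missing values, filtering the labels first is probably the safer choice of the two.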