Let's say I have the following pandas DataFrame:
import pandas as pd
d = [0.0, 1.0, 2.0]
e = pd.Series(d, index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})
Now I have another list of labels:
select = ['c', 'a', 'x']
Clearly, the element 'x' is not in the index of my original df. How can I select rows of df based on select, keeping only the labels that exist, without raising an error? In this case, I want only the rows for 'c' and 'a', in that order.
Any pointers would be very helpful.
Answer
You could use reindex + dropna:
out = df.reindex(select).dropna()
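One thing to watch with this approach: reindex adds an all-missing row for every label it cannot find, and a plain dropna() removes any row with at least one missing value, so rows that already had a NaN in the original data are dropped as well. The sketch below assumes a variant of the question's frame in which row 'a' has a NaN in B (that NaN is my assumption, not part of the original data) and shows how dropna(how='all') may be a safer choice in that situation:

```python
import numpy as np
import pandas as pd

# Variant of the question's frame: row 'a' has a genuine missing value in B
# (assumed here purely for illustration).
e = pd.Series([np.nan, 1.0, 2.0], index=['a', 'b', 'c'])
df = pd.DataFrame({'A': 1., 'B': e, 'C': pd.Timestamp('20130102')})
select = ['c', 'a', 'x']

# reindex adds an all-missing row for 'x'. dropna() removes it,
# but it also removes 'a' because B is NaN there.
out_dropna = df.reindex(select).dropna()

# how='all' only removes rows in which every column is missing,
# so 'a' survives and only the row for 'x' is dropped.
out_how_all = df.reindex(select).dropna(how='all')

print(list(out_dropna.index))   # ['c']
print(list(out_how_all.index))  # ['c', 'a']
```

Note that how='all' would still drop an existing row that happens to be missing in every column.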
You could also filter select before calling reindex:
out = df.reindex([i for i in select if i in df.index])
Output:
     A    B          C
c  1.0  2.0 2013-01-02
a  1.0  0.0 2013-01-02
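Filtering the labels first has a side benefit: no placeholder rows are ever created, so integer columns are not upcast to float. Here is a minimal sketch of the same idea wrapped in a function. The name select_existing and the integer column n are hypothetical, chosen only for this example, and it uses Index.isin with .loc instead of the list comprehension:

```python
import pandas as pd

def select_existing(df, labels):
    """Hypothetical helper: rows of df for the labels that exist, in the order given."""
    labels = pd.Index(labels)
    # isin gives a boolean mask over labels; masking keeps their original order
    return df.loc[labels[labels.isin(df.index)]]

# Example frame with an integer column (assumed for illustration)
df = pd.DataFrame({'n': [10, 20, 30]}, index=['a', 'b', 'c'])
select = ['c', 'a', 'x']

out = select_existing(df, select)
print(list(out.index))    # ['c', 'a']
print(out['n'].dtype)     # still an integer dtype

# For comparison: reindex with a missing label introduces NaN,
# which forces the column to float, and dropna does not undo that.
upcast = df.reindex(select).dropna()
print(upcast['n'].dtype)  # a float dtype
```

If select contains a label more than once, the matching row is returned once per occurrence.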