I have a pandas DataFrame with a column of integers. I want the rows containing numbers greater than 10. I am able to evaluate True or False but not the actual value, by doing:
df['ints'] = df['ints'] > 10
I don’t use Python very often so I’m going round in circles with this.
I’ve spent 20 minutes Googling but haven’t been able to find what I need….
Edit:
observationID recordKey gridReference siteKey siteName featureKey startDate endDate pTaxonVersionKey taxonName authority commonName ints 0 463166539 1767 SM90 NaN NaN 150161 12/02/2006 12/02/2006 NBNSYS0100004720 Pipistrellus pygmaeus (Leach, 1825) Soprano Pipistrelle 2006 1 463166623 4325 TL65 NaN NaN 168651 21/12/2008 21/12/2008 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2008 2 463166624 4326 TL65 NaN NaN 168651 18/01/2009 18/01/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009 3 463166625 4327 TL65 NaN NaN 168651 15/02/2009 15/02/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009 4 463166626 4328 TL65 NaN NaN 168651 19/12/2009 19/12/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009
Advertisement
Answer
Sample DF:
In [79]: df = pd.DataFrame(np.random.randint(5, 15, (10, 3)), columns=list('abc')) In [80]: df Out[80]: a b c 0 6 11 11 1 14 7 8 2 13 5 11 3 13 7 11 4 13 5 9 5 5 11 9 6 9 8 6 7 5 11 10 8 8 10 14 9 7 14 13
present only those rows where b > 10
In [81]: df[df.b > 10] Out[81]: a b c 0 6 11 11 5 5 11 9 7 5 11 10 9 7 14 13
Minimums (for all columns) for the rows satisfying b > 10
condition
In [82]: df[df.b > 10].min() Out[82]: a 5 b 11 c 9 dtype: int32
Minimum (for the b
column) for the rows satisfying b > 10
condition
In [84]: df.loc[df.b > 10, 'b'].min() Out[84]: 11
UPDATE: starting from Pandas 0.20.1 the .ix indexer is deprecated, in favor of the more strict .iloc and .loc indexers.