I have a pandas DataFrame with a column of integers, and I want only the rows where that value is greater than 10. So far I can get True or False for each row, but not the rows themselves, by doing:
df['ints'] = df['ints'] > 10
I don’t use Python very often so I’m going round in circles with this.
I’ve spent 20 minutes Googling but haven’t been able to find what I need.
Edit:
observationID recordKey gridReference siteKey siteName featureKey startDate endDate pTaxonVersionKey taxonName authority commonName ints
0 463166539 1767 SM90 NaN NaN 150161 12/02/2006 12/02/2006 NBNSYS0100004720 Pipistrellus pygmaeus (Leach, 1825) Soprano Pipistrelle 2006
1 463166623 4325 TL65 NaN NaN 168651 21/12/2008 21/12/2008 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2008
2 463166624 4326 TL65 NaN NaN 168651 18/01/2009 18/01/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009
3 463166625 4327 TL65 NaN NaN 168651 15/02/2009 15/02/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009
4 463166626 4328 TL65 NaN NaN 168651 19/12/2009 19/12/2009 NHMSYS0020001355 Pipistrellus pipistrellus sensu stricto (Schreber, 1774) Common Pipistrelle 2009
Answer
Sample DataFrame (with numpy imported as np and pandas imported as pd):
In [79]: df = pd.DataFrame(np.random.randint(5, 15, (10, 3)), columns=list('abc'))

In [80]: df
Out[80]:
    a   b   c
0   6  11  11
1  14   7   8
2  13   5  11
3  13   7  11
4  13   5   9
5   5  11   9
6   9   8   6
7   5  11  10
8   8  10  14
9   7  14  13
Show only the rows where b > 10:
In [81]: df[df.b > 10]
Out[81]:
   a   b   c
0  6  11  11
5  5  11   9
7  5  11  10
9  7  14  13
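Applied to the question's ints column, the same idea might look like the sketch below. The data and the name column are made up for illustration; only the ints column name comes from the question. The key difference from the question's attempt is that the comparison result is used to index the frame instead of being assigned back to the column.

```python
import pandas as pd

# Hypothetical data standing in for the question's DataFrame;
# only the column name "ints" is taken from the question.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "ints": [3, 12, 10, 25]})

# The comparison on its own yields a boolean Series (the mask).
mask = df["ints"] > 10

# Indexing the frame with the mask keeps the rows where it is True.
filtered = df[mask]

# Nothing was assigned back, so the original column still holds integers.
print(filtered)
```

Assigning the comparison back to df['ints'], as in the question, overwrites the integers with booleans, which is why the actual values seemed to disappear.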
Minimums (for all columns) over the rows satisfying the b > 10 condition:
In [82]: df[df.b > 10].min()
Out[82]:
a     5
b    11
c     9
dtype: int32
Minimum of the b column over the rows satisfying the b > 10 condition:
In [84]: df.loc[df.b > 10, 'b'].min()
Out[84]: 11
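As a rough illustration of what .loc does with a row mask and a column selector, here is a small self-contained sketch. The four-row frame is a hand-written stand-in, not the random sample above.

```python
import pandas as pd

# Small hand-written frame (an assumption for illustration).
df = pd.DataFrame({"a": [6, 14, 13, 5], "b": [11, 7, 5, 14], "c": [11, 8, 11, 13]})

# Row mask plus a single column label returns a Series.
b_over_10 = df.loc[df["b"] > 10, "b"]
smallest_b = b_over_10.min()

# Row mask plus a list of column labels returns a DataFrame.
subset = df.loc[df["b"] > 10, ["a", "c"]]

print(smallest_b)
print(subset)
```

Selecting rows and columns in one .loc call avoids chaining two indexing steps such as df[mask]['b'].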
UPDATE: starting from pandas 0.20.1 the .ix indexer is deprecated in favor of the stricter .iloc and .loc indexers (it was removed entirely in pandas 1.0), so the last example uses .loc.
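For readers unsure which of the two indexers to use, the sketch below contrasts them on a tiny frame. The frame and its non-default index are assumptions chosen to make the difference visible.

```python
import pandas as pd

# Hypothetical frame with a non-default index so labels and positions differ.
df = pd.DataFrame({"b": [11, 7, 14]}, index=[10, 20, 30])

# .loc is label-based: row label 20, column label "b".
by_label = df.loc[20, "b"]

# .iloc is position-based: second row, first column.
by_position = df.iloc[1, 0]

# Boolean masks are passed to .loc (or plain [] indexing).
rows_over_10 = df.loc[df["b"] > 10]

print(by_label, by_position)
print(rows_over_10)
```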