I'm loading a local CSV file and trying to find the smallest float in a column that contains a mix of NaN and numbers.
I tried NumPy's np.nanmin, but it throws:
    "TypeError: '<=' not supported between instances of 'str' and 'float'"
My code:

    import numpy as np
    import pandas as pd

    database = pd.read_csv('database.csv', quotechar='"', skipinitialspace=True, delimiter=',')

    coun_weight = database[['Country of Operator/Owner', 'Launch Mass (Kilograms)']]
    print(coun_weight)

    lightest = np.nanmin(coun_weight['Launch Mass (Kilograms)'])
Any suggestions as to why nanmin might not work?
A link to the entire csv file: http://www.sharecsv.com/s/5aea6381d1debf75723a45aacd40abf8/database.csv
Here is a sample of my coun_weight:
         Country of Operator/Owner  Launch Mass (Kilograms)
    1390                     China                      NaN
    1391                     China                     1040
    1392                     China                     1040
    1393                     China                     2700
    1394                     China                     2700
    1395                     China                     1800
    1396                     China                     2700
    1397                     China                      NaN
    1398                     China                      NaN
    1399                     China                      NaN
    1400                     China                      NaN
    1401                     India                       92
    1402                    Russia                       45
    1403              South Africa                        1
    1404                     China                      NaN
    1405                     China                        4
    1406                     China                        4
    1407                     China                       12
Answer
I tested it, and these are all the problematic values:
    coun_weight = pd.read_csv('database.csv')

    # Values that are present but cannot be parsed as numbers
    mass = coun_weight['Launch Mass (Kilograms)']
    unparseable = pd.to_numeric(mass, errors='coerce').isnull()
    print(mass[unparseable].dropna())

    1091    5,000+
    1092    5,000+
    1093    5,000+
    1094    5,000+
    1096    5,000+
    Name: Launch Mass (Kilograms), dtype: object
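To see why np.nanmin fails here, it can help to reproduce the situation on a tiny Series. The sketch below uses made-up values rather than the real CSV (the variable names `mass`, `parsed` and `bad` are only illustrative). A single unparseable string such as '5,000+' makes pandas keep the whole column as object dtype, so the minimum ends up comparing strings with floats. On the NumPy versions I have seen this raises the TypeError from the question, but the exact message may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV column: strings plus one missing value.
mass = pd.Series(['1040', '5,000+', np.nan, '45'], dtype=object)

# Present but unparseable: coercion yields NaN where the original was not NaN.
parsed = pd.to_numeric(mass, errors='coerce')
bad = mass[parsed.isna() & mass.notna()]
print(bad)

# Taking the minimum of mixed str/float objects is expected to fail.
try:
    np.nanmin(mass.to_numpy())
except TypeError as exc:
    print('np.nanmin failed:', exc)
```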
And the solution is:
    # Replace the malformed entries, then convert the whole column to float
    coun_weight['Launch Mass (Kilograms)'] = (
        coun_weight['Launch Mass (Kilograms)'].replace('5,000+', 5000).astype(float)
    )

    print(coun_weight['Launch Mass (Kilograms)'].iloc[1091:1098])

    1091    5000.0
    1092    5000.0
    1093    5000.0
    1094    5000.0
    1095       NaN
    1096    5000.0
    1097    6500.0
    Name: Launch Mass (Kilograms), dtype: float64
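The replace above targets the one literal string found in this file. If other entries also used thousands separators or a trailing plus sign, a pattern-based cleanup might be more robust. This is only a sketch under that assumption, with made-up values; like the replace above, it treats '5,000+' as exactly 5000, which may or may not be what you want.

```python
import numpy as np
import pandas as pd

# Hypothetical raw column as read from a CSV (object dtype).
raw = pd.Series(['1040', '5,000+', np.nan, '6500'], dtype=object)

# Strip commas and plus signs, then parse; anything still unparseable becomes NaN.
stripped = raw.str.replace(r'[,+]', '', regex=True)
cleaned = pd.to_numeric(stripped, errors='coerce')
print(cleaned)
```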
Then, to find the minimum of a column that contains NaNs, use Series.min, which skips NaNs:
    print(coun_weight['Launch Mass (Kilograms)'].min())

    0.0
Checking whether there are any 0 values in the column:
    a = coun_weight['Launch Mass (Kilograms)']
    print(a[a == 0])

    912    0.0
    Name: Launch Mass (Kilograms), dtype: float64
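Once the column is numeric, you may also want to know which row holds the minimum, and what the minimum is if a 0 is a placeholder rather than a real mass. Whether 0 is a placeholder is an assumption to check against the data source. A small sketch on a made-up frame (the rows below are not from the real file):

```python
import numpy as np
import pandas as pd

col = 'Launch Mass (Kilograms)'
df = pd.DataFrame({
    'Country of Operator/Owner': ['China', 'India', 'Russia', 'South Africa'],
    col: [np.nan, 92.0, 0.0, 1.0],
})

# Series.min skips NaN by default.
lightest = df[col].min()

# idxmin returns the index label of the minimum, also skipping NaN.
lightest_row = df.loc[df[col].idxmin()]

# Minimum over strictly positive masses only.
lightest_positive = df.loc[df[col] > 0, col].min()

print(lightest, lightest_row['Country of Operator/Owner'], lightest_positive)
```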
Another possible solution is to replace these values with NaNs:
    # Anything that cannot be parsed as a number becomes NaN
    coun_weight['Launch Mass (Kilograms)'] = (
        pd.to_numeric(coun_weight['Launch Mass (Kilograms)'], errors='coerce')
    )

    print(coun_weight['Launch Mass (Kilograms)'].iloc[1091:1098])

    1091       NaN
    1092       NaN
    1093       NaN
    1094       NaN
    1095       NaN
    1096       NaN
    1097    6500.0
    Name: Launch Mass (Kilograms), dtype: float64
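After coercion the column is float64, so np.nanmin from the question should work as well and agree with Series.min. A minimal sketch with made-up values, assuming it is acceptable to drop the unparseable entries from the comparison:

```python
import numpy as np
import pandas as pd

# Hypothetical raw column with one malformed entry and one missing value.
raw = pd.Series(['1040', '5,000+', np.nan, '45'], dtype=object)

# Coerce to float64; '5,000+' becomes NaN.
numeric = pd.to_numeric(raw, errors='coerce')

# Both approaches ignore NaN and return the same minimum.
lightest_np = np.nanmin(numeric.to_numpy())
lightest_pd = numeric.min()
print(lightest_np, lightest_pd)
```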