I'm loading a local CSV file that contains data. I'm trying to find the smallest float in a column that contains a mix of NaN and numbers.
I have tried using the NumPy function np.nanmin, but it throws:
"TypeError: '<=' not supported between instances of 'str' and 'float'"
    import numpy as np
    import pandas as pd

    database = pd.read_csv('database.csv', quotechar='"', skipinitialspace=True, delimiter=',')
    coun_weight = database[['Country of Operator/Owner', 'Launch Mass (Kilograms)']]
    print(coun_weight)
    lightest = np.nanmin(coun_weight['Launch Mass (Kilograms)'])
Any suggestions as to why np.nanmin might not work?
A link to the entire csv file: http://www.sharecsv.com/s/5aea6381d1debf75723a45aacd40abf8/database.csv
Here is a sample of my coun_weight:
         Country of Operator/Owner Launch Mass (Kilograms)
    1390                     China                     NaN
    1391                     China                    1040
    1392                     China                    1040
    1393                     China                    2700
    1394                     China                    2700
    1395                     China                    1800
    1396                     China                    2700
    1397                     China                     NaN
    1398                     China                     NaN
    1399                     China                     NaN
    1400                     China                     NaN
    1401                     India                      92
    1402                    Russia                      45
    1403              South Africa                       1
    1404                     China                     NaN
    1405                     China                       4
    1406                     China                       4
    1407                     China                      12
Answer
I tested it, and these are all the problematic values (non-missing entries that cannot be converted to a number):
    coun_weight = pd.read_csv('database.csv')
    print(coun_weight.loc[pd.to_numeric(coun_weight['Launch Mass (Kilograms)'], errors='coerce').isnull(),
                          'Launch Mass (Kilograms)'].dropna())

    1091    5,000+
    1092    5,000+
    1093    5,000+
    1094    5,000+
    1096    5,000+
    Name: Launch Mass (Kilograms), dtype: object
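For readers without the CSV at hand, the same check can be sketched on a small hand-made Series. The sample values and the names mass, parsed and bad are made up for illustration; only the string '5,000+' comes from the file. This assumes the column was read as strings mixed with NaN, which is what an object dtype column from read_csv typically holds.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: numbers stored as strings, one missing value, one bad entry
mass = pd.Series(['1040', np.nan, '5,000+', '92'], name='Launch Mass (Kilograms)')

# With errors='coerce', anything that cannot be parsed becomes NaN
parsed = pd.to_numeric(mass, errors='coerce')

# Keep rows that failed to parse but were not missing in the first place
bad = mass[parsed.isna() & mass.notna()]
print(bad)
```

The boolean mask does the same job as chaining .loc[...] with .dropna(): it separates values that were already NaN from values that only became NaN during conversion.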
One solution is to replace '5,000+' with 5000 and cast the column to float:
    coun_weight['Launch Mass (Kilograms)'] = coun_weight['Launch Mass (Kilograms)'].replace('5,000+', 5000).astype(float)
    print(coun_weight['Launch Mass (Kilograms)'].iloc[1091:1098])

    1091    5000.0
    1092    5000.0
    1093    5000.0
    1094    5000.0
    1095       NaN
    1096    5000.0
    1097    6500.0
    Name: Launch Mass (Kilograms), dtype: float64
Then, to find the minimum of a column that contains NaNs, use Series.min, which skips NaNs by default:
    print(coun_weight['Launch Mass (Kilograms)'].min())

    0.0
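As a minimal sketch of the NaN handling, here is how Series.min behaves on a small float Series. The Series s and its values are made up for illustration and are not taken from the CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical float column with one missing value
s = pd.Series([1040.0, np.nan, 92.0, 4.0])

smallest = s.min()               # skipna=True is the default, so NaN is ignored
with_nan = s.min(skipna=False)   # NaN propagates when skipping is turned off
via_numpy = np.nanmin(s.to_numpy())  # works here because the dtype is float64

print(smallest, with_nan, via_numpy)
```

Once the column is float64, np.nanmin and Series.min agree; the original error comes from the column still holding strings, not from the NaNs themselves.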
To check whether the column really contains any 0 values:
    a = coun_weight['Launch Mass (Kilograms)']
    print(a[a == 0])

    912    0.0
    Name: Launch Mass (Kilograms), dtype: float64
Another possible solution is to replace these values with NaNs:
    coun_weight['Launch Mass (Kilograms)'] = pd.to_numeric(coun_weight['Launch Mass (Kilograms)'], errors='coerce')
    print(coun_weight['Launch Mass (Kilograms)'].iloc[1091:1098])

    1091       NaN
    1092       NaN
    1093       NaN
    1094       NaN
    1095       NaN
    1096       NaN
    1097    6500.0
    Name: Launch Mass (Kilograms), dtype: float64
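If hard-coding the string '5,000+' feels brittle, a possible variation is to strip thousands separators and a trailing plus sign before converting. This is only a sketch on made-up data: the sample values and the names mass, cleaned and numeric are hypothetical, and treating '5,000+' as 5000 is an assumption (it is really a lower bound), the same one the replace approach makes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: every non-missing entry is a string, as in an object column
mass = pd.Series(['1040', np.nan, '5,000+', '6500'])

# Remove commas and plus signs; NaN entries pass through .str methods unchanged
cleaned = mass.str.replace(r'[,+]', '', regex=True)

# Anything still unparseable would become NaN instead of raising
numeric = pd.to_numeric(cleaned, errors='coerce')

print(numeric)
print(numeric.min())
```

Note that the .str accessor returns NaN for entries that are not strings, so this assumes the column has not already been partly converted to numbers.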