This seems like a fairly simple thing but I haven’t been able to find an answer for it here (yet).
I have a list of dictionaries, and some of the dictionaries in the list have NaN values. I just need to drop any dictionary from the list if it has a NaN value in it.
I’ve tried it a few different ways myself. Here’s one attempt with filter and a lambda function, which got a TypeError (“must be real number, not dict_values,” which makes sense):
import math

def remove_dictionaries_missing_data(list_of_dictionaries):
    # Raises TypeError: math.isnan() receives the whole dict_values view, not a number
    return list(filter(lambda dictionary: not math.isnan(dictionary.values()), list_of_dictionaries))
I also tried it with a couple nested loops and some code I really wasn’t sure about and got the same error:
import math

def remove_dictionaries_missing_data(list_of_dictionaries):
    cleaned_list = []
    for dictionary in list_of_dictionaries:
        # Still fails: math.isnan() is handed a generator expression, not a number
        if not math.isnan(dictionary[value] for value in dictionary.values()):
            cleaned_list.append(dictionary)
    return cleaned_list
… and finally with just a list comprehension (same error):
import math

def remove_movies_missing_data(movies):
    # Same problem: movie.values() is a dict_values view, not a number
    return [movie for movie in movies if not math.isnan(movie.values())]
EDIT:
Here’s a sample of the list I’m working with:
[{'year': 2013, 'imdb': 'tt2005374', 'title': 'The Frozen Ground', 'test': 'nowomen-disagree', 'clean_test': 'nowomen', 'binary': 'FAIL', 'budget': 19200000, 'domgross': nan, 'intgross': nan, 'code': '2013FAIL', 'budget_2013$': 19200000, 'domgross_2013$': nan, 'intgross_2013$': nan, 'period code': 1.0, 'decade code': 1.0}, {'year': 2011, 'imdb': 'tt1422136', 'title': 'A Lonely Place to Die', 'test': 'ok', 'clean_test': 'ok', 'binary': 'PASS', 'budget': 4000000, 'domgross': nan, 'intgross': 442550.0, 'code': '2011PASS', 'budget_2013$': 4142763, 'domgross_2013$': nan, 'intgross_2013$': 458345.0, 'period code': 1.0, 'decade code': 1.0}, ... ]
Answer
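dictionary.values() returns a view of all the values in the dictionary, not a single number, which is why math.isnan() raises the TypeError. You need to call math.isnan() on the individual values. Because some of the values are strings, and math.isnan() only accepts numbers, check that each value is a float before testing it. You can use any() to do this: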
import math

def remove_dictionaries_missing_data(list_of_dictionaries):
    # Keep only the dictionaries in which no value is a float NaN
    return [d for d in list_of_dictionaries
            if not any(isinstance(val, float) and math.isnan(val) for val in d.values())]
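To check the approach on a small input, here is a minimal, self-contained sketch. The helper names has_nan and drop_records_with_nan are made up for illustration, the first two records are trimmed-down versions of the rows in the question, and the third record is invented so that at least one row survives the filter.

```python
import math

def has_nan(record):
    """Return True if any value in the dict is a float NaN."""
    # The isinstance check skips strings and ints, which math.isnan()
    # would either reject (str) or never flag (int).
    return any(isinstance(v, float) and math.isnan(v) for v in record.values())

def drop_records_with_nan(records):
    """Return a new list containing only the dicts without NaN values."""
    return [r for r in records if not has_nan(r)]

nan = float("nan")
movies = [
    {"title": "The Frozen Ground", "budget": 19200000, "domgross": nan},
    {"title": "A Lonely Place to Die", "budget": 4000000, "domgross": nan},
    # Invented record (not from the question) with no missing values
    {"title": "Hypothetical Complete Row", "budget": 1000000, "domgross": 2500000.0},
]

cleaned = drop_records_with_nan(movies)
print([m["title"] for m in cleaned])  # ['Hypothetical Complete Row']
```

Both rows from the question are dropped because each has at least one NaN, and the original list is left unmodified since the comprehension builds a new one.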