I am trying to replace None (the object, not the string 'None') with NaN and then fill those NaNs with the mode of the field, but when I further condense the field, None reappears in the output. What am I missing?
final_df.Current_Housing_Living_Status__c.unique()

Output:
array([nan, 'Living with family', 'Rent', 'Living with friends/others',
       'Homeless', None, 'Emergency Shelter', 'Own', 'Supportive Housing',
       'Transitional Housing', 'MHSA treatment facility'], dtype=object)

final_df.replace(to_replace=[None], value=np.nan, inplace=True)

Output:
array([nan, 'Living with family', 'Rent', 'Living with friends/others',
       'Homeless', 'Emergency Shelter', 'Own', 'Supportive Housing',
       'Transitional Housing', 'MHSA treatment facility'], dtype=object)

final_df['Current_Housing_Living_Status__c'].fillna(final_df['Current_Housing_Living_Status__c'].mode()[0], inplace = True)

#Some Dimension Reduction
Own=['Own']
Rent=['Rent']
Homeless=['Homeless','Emergency Shelter', 'Supportive Housing', 'Transitional Housing']
Live_with_Others=['Living with family', 'Living with friends/others']
Treatment_Facility=['MHSA treatment facility']

def reduce_housing_status(x):
    if x in Own:
        return 'Own'
    elif x in Rent:
        return 'Rent'
    elif x in Homeless:
        'Homeless'
    elif x in Live_with_Others:
        return 'Live_with_Others'
    elif x in Treatment_Facility:
        return 'Treatment_Facility'
    else:
        return x

final_df['Current_Housing_Living_Status__c'] = final_df['Current_Housing_Living_Status__c'].apply(reduce_housing_status)
final_df.Current_Housing_Living_Status__c.unique()

Output:
array(['Rent', 'Live_with_Others', None, 'Own', 'Treatment_Facility'], dtype=object)
None is back. What am I doing wrong here? If I rerun that last section, None disappears, but it's worrisome to see it appear in the output and then disappear on a second run.
Answer
Inside your reduce_housing_status function, you forgot to add a return statement in the x in Homeless branch:
elif x in Homeless: 'Homeless'
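The bare string 'Homeless' is evaluated and then discarded, so the function reaches its end without returning anything and implicitly returns None for every value in the Homeless list. That is also the likely reason a rerun appears to fix it: fillna treats None as missing, so rerunning the fill step overwrites those rows with the mode and the Homeless group ends up mislabeled instead of visibly missing.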
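As a minimal sketch of the fix, here is the function with the missing return added. The category lists are copied from the question; the sample list and the reduced name are made up for illustration, and plain Python values stand in for the DataFrame column so the snippet runs on its own.

```python
# Category lists copied from the question.
Own = ['Own']
Rent = ['Rent']
Homeless = ['Homeless', 'Emergency Shelter', 'Supportive Housing', 'Transitional Housing']
Live_with_Others = ['Living with family', 'Living with friends/others']
Treatment_Facility = ['MHSA treatment facility']


def reduce_housing_status(x):
    """Collapse a detailed housing status into one of five broader groups."""
    if x in Own:
        return 'Own'
    elif x in Rent:
        return 'Rent'
    elif x in Homeless:
        return 'Homeless'  # this return was missing in the question
    elif x in Live_with_Others:
        return 'Live_with_Others'
    elif x in Treatment_Facility:
        return 'Treatment_Facility'
    else:
        # Unknown values pass through unchanged.
        return x


# Hypothetical sample values standing in for the DataFrame column.
sample = ['Rent', 'Emergency Shelter', 'Living with family', 'Homeless']
reduced = [reduce_housing_status(v) for v in sample]
print(reduced)  # ['Rent', 'Homeless', 'Live_with_Others', 'Homeless']
```

With this version, the apply call from the question should produce 'Homeless' where it previously produced None. Since the rows were already overwritten in place, the column would need to be rebuilt from the original data before applying the corrected function.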