I am trying to replace None (which is not a string) with NaN and then fill those NaNs with the mode of the field, but when I condense the field's categories afterwards, None shows up in the output again. What am I missing?
final_df.Current_Housing_Living_Status__c.unique()
Output: array([nan, 'Living with family', 'Rent', 'Living with friends/others',
'Homeless', None, 'Emergency Shelter', 'Own', 'Supportive Housing',
'Transitional Housing', 'MHSA treatment facility'], dtype=object)
final_df.replace(to_replace=[None], value=np.nan, inplace=True)
Output: array([nan, 'Living with family', 'Rent', 'Living with friends/others',
'Homeless', 'Emergency Shelter', 'Own', 'Supportive Housing',
'Transitional Housing', 'MHSA treatment facility'], dtype=object)
final_df['Current_Housing_Living_Status__c'].fillna(final_df['Current_Housing_Living_Status__c'].mode()[0], inplace = True)
#Some Dimension Reduction
Own=['Own']
Rent=['Rent']
Homeless=['Homeless','Emergency Shelter', 'Supportive Housing', 'Transitional Housing']
Live_with_Others=['Living with family', 'Living with friends/others']
Treatment_Facility=['MHSA treatment facility']
def reduce_housing_status(x):
    if x in Own:
        return 'Own'
    elif x in Rent:
        return 'Rent'
    elif x in Homeless:
        'Homeless'
    elif x in Live_with_Others:
        return 'Live_with_Others'
    elif x in Treatment_Facility:
        return 'Treatment_Facility'
    else:
        return x
final_df['Current_Housing_Living_Status__c'] = final_df['Current_Housing_Living_Status__c'].apply(reduce_housing_status)
final_df.Current_Housing_Living_Status__c.unique()
Output: array(['Rent', 'Live_with_Others', None, 'Own', 'Treatment_Facility'],
dtype=object)
None is back… What am I missing or doing wrong here? If I rerun that last section, None disappears, but it's worrisome to see it appear in the output and then disappear on a second run.
Answer
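Inside your reduce_housing_status function, the branch for x in Homeless is missing a return statement:
    elif x in Homeless:
        'Homeless'
A bare 'Homeless' is just an expression whose value is discarded, so the function falls off the end and implicitly returns None for every status in the Homeless list.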
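As a minimal sketch of the fix, here is the function from the question with the missing return added, next to a stripped-down version of the buggy branch for comparison. The group lists are copied from the question; `buggy_branch` and the sample `statuses` list are hypothetical names used only for illustration, and plain Python lists stand in for the DataFrame column.

```python
# Category groups, copied from the question.
Own = ['Own']
Rent = ['Rent']
Homeless = ['Homeless', 'Emergency Shelter', 'Supportive Housing', 'Transitional Housing']
Live_with_Others = ['Living with family', 'Living with friends/others']
Treatment_Facility = ['MHSA treatment facility']


def reduce_housing_status(x):
    """Collapse a detailed housing status into a broader category."""
    if x in Own:
        return 'Own'
    elif x in Rent:
        return 'Rent'
    elif x in Homeless:
        return 'Homeless'  # the return that was missing in the question
    elif x in Live_with_Others:
        return 'Live_with_Others'
    elif x in Treatment_Facility:
        return 'Treatment_Facility'
    else:
        return x  # leave anything unrecognised untouched


def buggy_branch(x):
    """Hypothetical reduction of the original bug, for comparison only."""
    if x in Homeless:
        'Homeless'  # expression statement: evaluated, then discarded
    # no return is reached, so Python returns None implicitly


# Hypothetical sample values standing in for the column.
statuses = ['Rent', 'Homeless', 'Emergency Shelter', 'Own',
            'Living with family', 'MHSA treatment facility']
reduced = [reduce_housing_status(s) for s in statuses]
print(reduced)
print(buggy_branch('Emergency Shelter'))  # prints None
```

The corrected function can be passed to `.apply` on the column exactly as in the question. Since the earlier fillna step has already removed the missing values, no None should come back from the reduction.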