
How to replace irrelevant data with the column mean?

Let’s say I have 600,000 data points in a column for age. The data contains the values 0 and -1, which are not valid ages. How can I replace both 0 and -1 with the column mean using Python?

The code so far:

df6 = df5['Vict Age'].replace([0, -1]).mean())
df6.update(df5)
df6

Answer

You can compute the mean separately, then use the correct replace syntax to swap out the unwanted values:

# Calculate the mean, ignoring the -1 and 0 values
age_mean = df5['Vict Age'][~df5['Vict Age'].isin([-1, 0])].mean()
# Replace the -1 and 0 values with that mean
df5['Vict Age'] = df5['Vict Age'].replace({0: age_mean, -1: age_mean})
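
For a quick check, here is a minimal self-contained sketch of the same idea on a tiny made-up frame. The sample ages are hypothetical, and it uses Series.mask instead of replace, which is an alternative to the code above rather than a correction of it. The point it shows is that the mean has to be taken over the valid ages only, otherwise the 0 and -1 placeholders pull it down.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Vict Age' comes from the question.
df5 = pd.DataFrame({"Vict Age": [25, 0, 40, -1, 31]})

# Boolean mask marking the placeholder values that are not real ages.
invalid = df5["Vict Age"].isin([0, -1])

# Mean of the valid ages only (25, 40 and 31).
age_mean = df5.loc[~invalid, "Vict Age"].mean()

# mask() replaces the entries where the condition is True and keeps the rest.
df5["Vict Age"] = df5["Vict Age"].mask(invalid, age_mean)

print(age_mean)                   # 32.0
print(df5["Vict Age"].tolist())   # [25.0, 32.0, 40.0, 32.0, 31.0]
```

Note that the column becomes float after the replacement, because the mean is generally not an integer.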

PS: Please use Stack Overflow code formatting instead of posting the image in the future. Thanks.

User contributions licensed under: CC BY-SA