I am cleaning a dataset using the z-score with a threshold of 3. Below is the code that I am using. As you can see, I first calculate the mean and the standard deviation. The code then loops over every value, computes its z-score, and checks whether it is greater than 3; if so, the value is treated as an
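The question's own code is not shown in this excerpt, so the following is only a minimal sketch of the approach described (mean and standard deviation first, then a loop over the values). The function name `find_outliers`, the sample data, the use of the sample standard deviation, and the comparison on the absolute z-score are all assumptions made for illustration.

```python
import statistics


def find_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(values)
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return []

    outliers = []
    for value in values:
        z = (value - mean) / std
        # Assumption: flag both unusually high and unusually low values.
        if abs(z) > threshold:
            outliers.append(value)
    return outliers


# Hypothetical data: twenty ordinary readings and one extreme one.
data = [10] * 20 + [100]
outliers = find_outliers(data)
print(outliers)
```

Note that with very few data points no value can reach a z-score of 3, because the extreme value itself inflates the standard deviation; a threshold of 3 only becomes reachable with roughly a dozen values or more.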
Tag: z-score
How to calculate the outliers in a Pandas dataframe while excluding NaN values
I have a pandas DataFrame that should look like this. Some values in this DataFrame are outliers. I came across this method of calculating the outliers in every column using the z-score: My goal is to create a column Is Outlier and put True/False on each row that has/doesn’t have at least one outlier, and NaN for rows
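The method referred to above is not included in this excerpt, so here is a hedged sketch of one way to compute per-column z-scores while ignoring NaN values and to build the Is Outlier column. The function name `flag_outlier_rows`, the example columns `a` and `b`, and the choice of the population standard deviation (`ddof=0`) are assumptions. The excerpt is cut off before it says which rows should receive NaN, so this sketch only produces True/False.

```python
import numpy as np
import pandas as pd


def flag_outlier_rows(df, threshold=3.0):
    """Add an 'Is Outlier' column: True if any numeric cell in the row
    has an absolute z-score above the threshold."""
    numeric = df.select_dtypes(include="number")
    # DataFrame.mean and DataFrame.std skip NaN by default (skipna=True).
    # Assumption: population standard deviation (ddof=0).
    z = (numeric - numeric.mean()) / numeric.std(ddof=0)
    # A comparison against NaN is False, so missing cells never count
    # as outliers.
    is_outlier = (z.abs() > threshold).any(axis=1)

    result = df.copy()
    result["Is Outlier"] = is_outlier
    return result


# Hypothetical data: column 'a' has one extreme value and one NaN.
df = pd.DataFrame({
    "a": [10.0] * 20 + [100.0, np.nan],
    "b": [1.0, 2.0] * 11,
})
result = flag_outlier_rows(df)
print(result.tail(3))
```

Because the mean and standard deviation are computed per column with NaN skipped, a missing value in one column does not distort the z-scores of the remaining values in that column.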