
Missing value imputation based on regression in pandas

I want to impute missing data using multivariate imputation. In the attached data set, column A has some missing values, and columns A and B have a correlation coefficient of 0.70. I want to fit a regression-style relationship between column A and column B and use it to impute the missing values in Python.

N.B.: I can already do this using the mean, median, or mode, but I want to use the relationship with another column to fill in the missing values.

How should I approach this problem? Please suggest a solution.


Answer

Use:
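
The original code block did not survive on this page, so the following is only a minimal sketch of one way to do regression-based imputation, not necessarily the method the answer used. It assumes a DataFrame with columns named `A` and `B`, where `B` is fully observed; the sample values and the `A_pred` column name are made up for illustration. It fits a straight line for `A` as a function of `B` on the complete rows with `np.polyfit` and then predicts `A` for every row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: A has gaps, B is fully observed.
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0, np.nan, 6.0],
    "B": [2.1, 3.9, 6.2, 8.1, 9.8, 12.0],
})

# Fit A = slope * B + intercept using only rows where both columns are present.
known = df.dropna(subset=["A", "B"])
slope, intercept = np.polyfit(known["B"].to_numpy(), known["A"].to_numpy(), deg=1)

# Predict A from B for every row (including rows where A is already known).
df["A_pred"] = slope * df["B"] + intercept

print(df)
```

With more than one predictor column, the same idea carries over to a multiple regression; scikit-learn's `IterativeImputer` is one existing tool built around that approach.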


Note that in the provided data the correlation between the A and B columns is very low (less than 0.05). To fill only the empty cells with the imputed values:
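
As a rough sketch of that step (again an assumption, since the original code is missing here): given a column of regression predictions, `Series.fillna` accepts another Series and fills only the missing cells, aligning on the index. The column names `A_pred` and `A_filled` and the values below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: A has one gap, A_pred holds regression predictions for every row.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "A_pred": [1.1, 2.0, 2.9],
})

# Keep observed values of A; take the prediction only where A is missing.
df["A_filled"] = df["A"].fillna(df["A_pred"])

print(df)
```

Observed values are left untouched, so only the rows that were originally empty carry imputed values.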



User contributions licensed under: CC BY-SA