def func(x,a,b):
    return a*x + b

guess = (0.5,0.5,0.5)
fit_df = df.dropna()
col_params = {}
for col in fit_df.columns:
    x = fit_df.index.astype(float).values
    y = fit_df[col].values
    params = curve_fit(func,x,y,guess)
    col_params[col] = params[0]

for col in df.columns:
    x = df[pd.isnull(df[col])].index.astype(float).values
    df[col][x] = func(x,*col_params[col])

print("Extrapolated data")
print(df)
I’m using the code from another post to extrapolate values. I changed func() so that it is linear rather than cubic, but I get the error “func() takes 3 positional arguments but 4 were given”.
The original code comes from the post “Extrapolate values in Pandas DataFrame”. My question is: how would I change it so that it works with a linear relationship?
Answer
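When using guess = (0.5,0.5) you should be able to make it run.
You have the parameters a, b, while the original example had the parameters a, b, c, d.
The initial guess is for the parameters a, b in your function, not for x. curve_fit passes x plus one argument per guess value, so a three-value guess calls func() with four arguments, which is the error you see.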
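To make the relationship between the guess and the function signature concrete, here is a minimal sketch. It assumes synthetic data on the line y = 2x + 1 (made-up values, not the data from the question) and reuses the name func from the question; popt, pcov and error_message are illustrative names only.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b):
    # Linear model with two free parameters: slope a and intercept b
    return a * x + b

# Synthetic data lying exactly on y = 2x + 1 (illustrative values only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Two starting values, one per free parameter (a, b)
popt, pcov = curve_fit(func, x, y, p0=(0.5, 0.5))
print("fitted a, b:", popt)

# Three starting values make curve_fit call func(x, p1, p2, p3),
# which does not match the signature func(x, a, b)
try:
    curve_fit(func, x, y, p0=(0.5, 0.5, 0.5))
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print("error with three guesses:", error_message)
```

The length of the guess has to equal the number of parameters after x, so switching from the cubic model to the linear one means dropping two of the four starting values.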
Full code used to make your interpolation function run:
import pandas as pd
from io import StringIO
from scipy.optimize import curve_fit

df = pd.read_table(StringIO('''
                neg       neu       pos       avg
    0           NaN       NaN       NaN       NaN
    250    0.508475  0.527027  0.641292  0.558931
    500         NaN       NaN       NaN       NaN
    1000   0.650000  0.571429  0.653983  0.625137
    2000        NaN       NaN       NaN       NaN
    3000   0.619718  0.663158  0.665468  0.649448
    4000        NaN       NaN       NaN       NaN
    6000        NaN       NaN       NaN       NaN
    8000        NaN       NaN       NaN       NaN
    10000       NaN       NaN       NaN       NaN
    20000       NaN       NaN       NaN       NaN
    30000       NaN       NaN       NaN       NaN
    50000       NaN       NaN       NaN       NaN'''), sep=r'\s+')

# Do the original interpolation (fills only the gaps between known rows)
df.interpolate(method='nearest', axis=0, inplace=True)

# Display result
print('Interpolated data:')
print(df)
print()

# Linear model: two parameters, so the initial guess has two values
def func(x, a, b):
    return a*x + b

guess = (0.5, 0.5)

# Fit each column on the rows that have values
fit_df = df.dropna()
col_params = {}
for col in fit_df.columns:
    x = fit_df.index.astype(float).values
    y = fit_df[col].values
    params = curve_fit(func, x, y, guess)
    col_params[col] = params[0]  # params[0] holds the fitted a, b

# Fill the remaining NaN rows with the fitted line
for col in df.columns:
    missing = df[col].isnull()
    x = df.index[missing].astype(float).values
    # .loc avoids the chained assignment df[col][x] = ...
    df.loc[missing, col] = func(x, *col_params[col])

print("Extrapolated data")
print(df)
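For a purely linear model, a degree-1 polynomial fit gives the same kind of line without needing an initial guess. The sketch below is a variation on the loop above, not the method used in this answer: it swaps curve_fit for np.polyfit, and the names fill_linear and demo, as well as the small data set, are hypothetical.

```python
import numpy as np
import pandas as pd

def fill_linear(df):
    """Return a copy of df with NaNs replaced by a per-column straight-line fit."""
    out = df.copy()
    x_all = out.index.to_numpy(dtype=float)
    for col in out.columns:
        known = out[col].notna().to_numpy()
        if known.sum() < 2:
            continue  # a line needs at least two known points
        # Degree-1 least-squares fit on the known rows only
        slope, intercept = np.polyfit(x_all[known], out[col].to_numpy()[known], 1)
        # Write the fitted values into the rows that were missing
        out.loc[~known, col] = slope * x_all[~known] + intercept
    return out

# Hypothetical data: known points lie on v = 0.1 * index
demo = pd.DataFrame({"v": [np.nan, 1.0, 2.0, np.nan]}, index=[0, 10, 20, 30])
filled = fill_linear(demo)
print(filled)
```

Because the function works on a copy, the original frame keeps its NaN values, which makes it easy to compare the result against the curve_fit version.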