First of all, sorry for my poor English, and thanks for clicking on this question.
I already have x and y data sets, and I want to fit a curve to them.
How can I estimate the constants of this model with polyfit?
I know that
np.polyfit(x,y,1)
fits a linear equation (the 1 means degree 1, i.e. linear),
but how can I fit a different equation, such as a square-root model with three or more constants, to my data sets?
Answer
You can use scipy.optimize.curve_fit. Here is an example of how to do this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
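# Model to fit: a square root with three constants (a, b, c)
def func(x,a,b,c):
    return a * np.sqrt(x - b) + c
# Synthetic data: noise-free curve plus Gaussian noise
x = np.linspace(2,20,100)
y = func(x,2,-2,3)
y_true = y + 0.1*np.random.normal(size=len(x))
# Fit the model; popt holds the estimated (a, b, c)
popt, pcov = curve_fit(func,x,y_true)
y_pred = func(x,*popt)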
fig,ax = plt.subplots(figsize=(8,6))
ax.scatter(x,y_true,c='r',label='true',s=6)
ax.plot(x,y_pred,c='g',label='pred')
ax.legend(loc='best')
This will give you a plot of the noisy data points (red) with the fitted curve (green) drawn through them.
The array popt holds the fitted (a,b,c) values.
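The pcov returned alongside popt is the estimated covariance matrix of the parameters, and the square roots of its diagonal give one-standard-deviation uncertainties on (a,b,c). A minimal sketch on the same kind of synthetic data, where the fixed seed, the name y_noisy and the starting guess p0 are assumptions added for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a * np.sqrt(x - b) + c

# Synthetic data as in the example above; the fixed seed is an assumption
rng = np.random.default_rng(0)
x = np.linspace(2, 20, 100)
y_noisy = func(x, 2, -2, 3) + 0.1 * rng.normal(size=len(x))

# p0 is a hypothetical rough starting guess, not part of the original example
popt, pcov = curve_fit(func, x, y_noisy, p0=[1.5, -1.0, 2.5])

# One-standard-deviation uncertainty of each fitted parameter
perr = np.sqrt(np.diag(pcov))
for name, value, err in zip("abc", popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

Large values in perr relative to popt would suggest that the data do not pin that constant down very well.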
UPDATE
After testing curve_fit on the real dataset provided by reaver lover, I was surprised to find that curve_fit can fail on this relatively simple regression task.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x,a,b,c):
    # Print every parameter set that curve_fit tries
    print('%.3f, %.3f, %.3f' % (a,b,c))
    return a * np.sqrt(x - b) + c
x = np.array([5, 11, 15, 44, 60, 70, 75, 100, 120, 200])
y_true = np.array([2.492, 8.330, 11.000, 19.394, 24.466, 27.777, 29.878, 26.952, 35.607, 46.966])
popt, pcov = curve_fit(func,x,y_true)
# Overwrite popt with the last printed (a,b,c) that was not (nan,nan,nan)
popt = [2.252, 5.000, 6.908]
y_pred = func(x,*popt)
fig,ax = plt.subplots(figsize=(8,6))
ax.scatter(x,y_true,c='r',label='true',s=6)
ax.plot(x,y_pred,c='g',label='pred')
ax.legend(loc='best')
If you run this script, you will find that the coefficients (a,b,c) somehow become (nan,nan,nan) near the end of the optimization. However, the last (a,b,c) that curve_fit found before they became (nan,nan,nan) is already good enough, as you can see in the plot.
I really have no idea why curve_fit can fail.
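One plausible explanation, offered here as an assumption rather than something confirmed above, is that the optimizer tries a value of b larger than the smallest x, so x - b goes negative and np.sqrt returns nan. If that is the cause, constraining b with the bounds argument of curve_fit should keep the model defined at every step. A minimal sketch, where the starting guess p0 and the choice of x.min() as the upper limit for b are assumptions:

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a * np.sqrt(x - b) + c

x = np.array([5, 11, 15, 44, 60, 70, 75, 100, 120, 200])
y_true = np.array([2.492, 8.330, 11.000, 19.394, 24.466,
                   27.777, 29.878, 26.952, 35.607, 46.966])

# Keep b <= min(x) so that x - b is never negative (assumed cause of the nan)
lower = [-np.inf, -np.inf, -np.inf]
upper = [np.inf, x.min(), np.inf]

# p0 is a hypothetical starting guess that lies inside the bounds
popt, pcov = curve_fit(func, x, y_true, p0=[1.0, 0.0, 0.0],
                       bounds=(lower, upper))

# Root-mean-square residual as a rough measure of fit quality
rms = np.sqrt(np.mean((func(x, *popt) - y_true) ** 2))
print("a, b, c =", popt)
print("rms residual =", rms)
```

When bounds are given, curve_fit switches from the default Levenberg-Marquardt solver to a bounded trust-region method, so the fitted values can differ from the ones printed by the unbounded run.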