First of all, sorry for my poor English, and thanks for clicking on this question.
I already have x and y data sets, and I want to fit a curve to them.
How can I estimate the constants of this model with polyfit?
I know
np.polyfit(x,y,1)
fits a linear equation (the 1 is the polynomial degree, so 1 means linear).
But how can I fit a different equation, such as a square-root model with three or more constants, to my data sets?
Answer
You can use scipy.optimize.curve_fit. Here is an example of how to do this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x,a,b,c):
    return a * np.sqrt(x - b) + c

x = np.linspace(2,20,100)
y = func(x,2,-2,3)
y_true = y + 0.1*np.random.normal(size=len(x))

popt, pcov = curve_fit(func,x,y_true)
y_pred = func(x,*popt)

fig,ax = plt.subplots(figsize=(8,6))
ax.scatter(x,y_true,c='r',label='true',s=6)
ax.plot(x,y_pred,c='g',label='pred')
ax.legend(loc='best')
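This will give you a plot of the noisy data (red points, labeled 'true') and the fitted curve (green line, labeled 'pred').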
The array popt holds the fitted (a,b,c) values.
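As a small follow-up sketch (not part of the original answer), this is one way to read the fitted constants out of popt and get rough one-standard-deviation uncertainties from the diagonal of pcov. The names rng, y_noisy, a_fit, b_fit, c_fit and perr are introduced here only for illustration, and the fixed random seed is an assumption made so the run is repeatable.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    # Square-root model with three constants, same form as in the answer
    return a * np.sqrt(x - b) + c

# Synthetic data: the model with (a, b, c) = (2, -2, 3) plus Gaussian noise.
# The seed is an arbitrary choice for reproducibility.
rng = np.random.default_rng(0)
x = np.linspace(2, 20, 100)
y_noisy = func(x, 2, -2, 3) + 0.1 * rng.normal(size=len(x))

# popt holds the fitted constants, pcov their estimated covariance matrix
popt, pcov = curve_fit(func, x, y_noisy)
a_fit, b_fit, c_fit = popt

# Square roots of the diagonal of pcov give approximate standard errors
perr = np.sqrt(np.diag(pcov))

print('a = %.3f +/- %.3f' % (a_fit, perr[0]))
print('b = %.3f +/- %.3f' % (b_fit, perr[1]))
print('c = %.3f +/- %.3f' % (c_fit, perr[2]))
```

The three constants of this model are strongly correlated, so the individual estimates can sit noticeably away from (2, -2, 3) even when the fitted curve follows the data closely.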
UPDATE
After testing curve_fit using the real dataset provided by reaver lover, I was surprised to find that curve_fit can fail on this relatively simple regression task.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x,a,b,c):
    # Print every (a,b,c) that curve_fit tries, to trace the optimization
    print('%.3f, %.3f, %.3f' % (a,b,c))
    return a * np.sqrt(x - b) + c

x = np.array([5, 11, 15, 44, 60, 70, 75, 100, 120, 200])
y_true = np.array([2.492, 8.330, 11.000, 19.394, 24.466, 27.777, 29.878, 26.952, 35.607, 46.966])

popt, pcov = curve_fit(func,x,y_true)
# Override the result with the last (a,b,c) printed before the values became nan
popt = [2.252, 5.000, 6.908]
y_pred = func(x,*popt)

fig,ax = plt.subplots(figsize=(8,6))
ax.scatter(x,y_true,c='r',label='true',s=6)
ax.plot(x,y_pred,c='g',label='pred')
ax.legend(loc='best')
Running this script, you will find that the coefficients (a,b,c) somehow become (nan,nan,nan) near the end of the optimization. However, the last (a,b,c) found by curve_fit that is not (nan,nan,nan) is already good enough, as you can see in the plot.
I really don't know why curve_fit fails here.
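One plausible explanation, offered as an assumption rather than a confirmed diagnosis: np.sqrt(x - b) returns nan as soon as the optimizer tries a b larger than the smallest x (5 in this dataset), and the printed trace ends at b = 5.000. If that is the cause, a possible workaround is to pass bounds that keep b at or below min(x), together with an explicit starting point. The bound, the p0 values and the names lower and upper below are illustrative choices, not something from the original answer, and the result has not been checked against the plot above.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a * np.sqrt(x - b) + c

x = np.array([5, 11, 15, 44, 60, 70, 75, 100, 120, 200], dtype=float)
y_true = np.array([2.492, 8.330, 11.000, 19.394, 24.466,
                   27.777, 29.878, 26.952, 35.607, 46.966])

# Assumption: restrict b to b <= min(x) so that x - b never goes negative.
# a and c are left unbounded.
lower = [-np.inf, -np.inf, -np.inf]
upper = [np.inf, x.min(), np.inf]

# Hypothetical starting point; it only has to lie inside the bounds.
p0 = [1.0, 0.0, 0.0]

# With bounds given, curve_fit uses the 'trf' method instead of 'lm'
popt, pcov = curve_fit(func, x, y_true, p0=p0, bounds=(lower, upper))
print('a = %.3f, b = %.3f, c = %.3f' % tuple(popt))
```

Because the bounded solver keeps its iterates inside the allowed region, the model function should no longer be evaluated where the square root is undefined.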