I have this data:
puf = pd.DataFrame({'id':[1,2,3,4,5,6,7,8], 'val':[850,1889,3289,6083,10349,17860,28180,41236]})
The data seems to follow an exponential curve. Let’s see the plot:
puf.plot('id','val')
I want to fit an exponential curve, $$ y = Ae^{Bx} $$ (A times e to the power B*x), and add the fitted values as a column in Pandas. First I took the log of the values:
puf['log_val'] = np.log(puf['val'])
Then I tried to use NumPy to fit a line to the logged values:
puf['fit'] = np.polyfit(puf['id'],puf['log_val'],1)
But I get an error:
ValueError: Length of values (2) does not match length of index (8)
My expected result is the fitted values as a new column in Pandas. I attach an image with the column fitted values I want (in orange):
I’m stuck on this code and not sure what I’m doing wrong. How can I create a new column with my fitted values?
Answer
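You’re getting that error because np.polyfit(puf['id'], puf['log_val'], 1) returns an array of two coefficients, array([0.55110679, 6.39614819]) (the slope and the intercept), not one value per row. A length-2 array cannot be assigned as a column of an 8-row dataframe.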
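To see the mismatch concretely, here is a minimal sketch that rebuilds the `puf` frame from the question and compares the length of the `np.polyfit` result with the length of the dataframe. The variable name `coeffs` is only illustrative.

```python
import numpy as np
import pandas as pd

puf = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                    'val': [850, 1889, 3289, 6083, 10349, 17860, 28180, 41236]})

# A degree-1 fit returns two coefficients, ordered [slope, intercept]
coeffs = np.polyfit(puf['id'], np.log(puf['val']), 1)

print(coeffs.shape)  # (2,)
print(len(puf))      # 8
```

The two coefficients describe the fitted line; they still have to be evaluated at each `id` to get one fitted value per row.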
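This is what you want. Taking the log of both sides turns the exponential model into a straight line:
y = a * exp(b*x) -> ln(y) = ln(a) + b*x
so fit a degree-1 polynomial to the logged values:
f = np.polyfit(puf['id'], np.log(puf['val']), 1)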
where
a = np.exp(f[1])  # -> 599.5313046712091
b = f[0]          # -> 0.5511067934637022
Giving
puf['fit'] = a * np.exp(b * puf['id'])
   id    val           fit
0   1    850   1040.290193
1   2   1889   1805.082864
2   3   3289   3132.130026
3   4   6083   5434.785677
4   5  10349   9430.290286
5   6  17860  16363.179739
6   7  28180  28392.938399
7   8  41236  49266.644002
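Putting the steps of this answer together, a self-contained sketch might look like the following. It uses `np.polyval` to evaluate the fitted line before undoing the log, which is equivalent to `a * np.exp(b * puf['id'])`; the names `b` and `ln_a` are illustrative choices, not part of the original code.

```python
import numpy as np
import pandas as pd

puf = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8],
                    'val': [850, 1889, 3289, 6083, 10349, 17860, 28180, 41236]})

# Fit ln(val) = ln(a) + b*id; polyfit returns [b, ln(a)] for degree 1
b, ln_a = np.polyfit(puf['id'], np.log(puf['val']), 1)

# Evaluate the line at every id, then exponentiate to get back to val's scale
puf['fit'] = np.exp(np.polyval([b, ln_a], puf['id'].to_numpy()))

print(puf)
```

Because the line is fitted to log(val), the least-squares error is minimized on the log scale rather than on the original scale, which is one reason the largest values can sit further from the curve in absolute terms.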