I’m trying to replicate R’s fitdist() results in Python using scipy.stats. The R results are the reference, and I cannot modify the R code. The two sets of results are close, but the difference is still larger than I can accept. Does anybody know why the results differ, and how I can reduce the difference?
The definition of scipy.stats.weibull_min (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.weibull_min.html) seems to be the same as R’s Weibull (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/Weibull.html).
Data example:
data = [2457.145, 878.081, 855.118, 1157.135, 1099.82]
R:
library(fitdistrplus)  # fitdist() comes from the fitdistrplus package
parameters <- fitdist(data, "weibull", method = "mle")$estimate
R Results:
     shape      scale
   2.30804 1463.88528
Python:
import scipy.stats as st
st.weibull_min.fit(data, floc=0)
Python results:
(2.307899817944195, 0, 1463.7712925885176)
Answer
The difference appears to be the result of the default relative tolerances used by the optimizers (and normal floating point imprecision). If you tighten the tolerance in the R calculation, the result is closer to the SciPy result:
> parameters <- fitdist(data, "weibull", method="mle", control=list(reltol=1e-14))$estimate
> parameters
    shape     scale
   2.3079 1463.7715
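One way to check which result is closer to the true maximum is to skip the general-purpose optimizers and solve the Weibull likelihood equations directly. The sketch below is not what fitdist() or weibull_min.fit() does internally. It assumes the location is fixed at 0 (as with floc=0) and uses the standard two-parameter result that the shape k solves sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x) = 0, with scale = mean(x^k)^(1/k). The helper name shape_score and the bracket [0.1, 20] are illustrative choices, not part of either library.

```python
import numpy as np
from scipy.optimize import brentq

data = np.array([2457.145, 878.081, 855.118, 1157.135, 1099.82])
log_x = np.log(data)

def shape_score(k):
    """Profile likelihood equation for the Weibull shape (location fixed at 0).

    The maximum-likelihood shape is the root of this function.
    """
    xk = data ** k
    return np.sum(xk * log_x) / np.sum(xk) - 1.0 / k - log_x.mean()

# The function is increasing in k, so a sign change on the bracket
# pins down a single root. The bracket itself is an assumption that
# happens to hold for this data set.
shape = brentq(shape_score, 0.1, 20.0, xtol=1e-14, rtol=1e-14)

# Given the shape, the scale follows in closed form.
scale = np.mean(data ** shape) ** (1.0 / shape)

print(shape, scale)
```

If the explanation above is right, the printed values should agree with the SciPy result and with the tightened-tolerance R result more closely than with R's default output.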