After I instantiate a scikit-learn model (e.g. LinearRegression), what happens if I call its fit() method multiple times with different X and y data? Does it fit the model from scratch, as if I had just re-instantiated it, or does it take into account the data already fitted in previous calls to fit()?
Experimenting with LinearRegression (and looking at its source code), it seems to me that every call to fit() fits from scratch and ignores the result of any previous call to the same method. I wonder whether this is true in general, and whether I can rely on this behavior for all scikit-learn models and pipelines.
Answer
If you call model.fit(X_train, y_train) a second time, it will overwrite all previously fitted coefficients, weights, the intercept (bias), and so on.
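A quick way to check this is to refit one estimator on new data and compare it with a freshly created estimator fitted on the same data. The sketch below assumes LinearRegression with default settings; the toy arrays and variable names are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Two illustrative toy datasets with clearly different slopes.
X1 = np.array([[0.0], [1.0], [2.0], [3.0]])
y1 = 2.0 * X1.ravel() + 1.0
X2 = np.array([[0.0], [1.0], [2.0], [3.0]])
y2 = -3.0 * X2.ravel() + 5.0

# Fit the same estimator twice, on different data each time.
model = LinearRegression()
model.fit(X1, y1)
coef_first = model.coef_.copy()
model.fit(X2, y2)
coef_refit = model.coef_.copy()

# Fit a brand-new estimator on the second dataset only.
fresh = LinearRegression().fit(X2, y2)

# If the second fit() started from scratch, both estimators should agree.
print(np.allclose(coef_refit, fresh.coef_))
print(np.allclose(model.intercept_, fresh.intercept_))
```

If both comparisons print True, the second call to fit() left no trace of the first dataset in the fitted parameters.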
If you want to fit only a portion of your data set first and then improve the model with new data, use an estimator that supports "incremental learning", i.e. one that implements the partial_fit() method.
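As a minimal sketch of incremental learning, the example below uses SGDRegressor, one of the estimators that implements partial_fit(). The synthetic data, the batch split, and the variable names are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Synthetic data for illustration: y is a noiseless linear function of X.
rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 * X.ravel() + 0.5

model = SGDRegressor(random_state=0)

# First batch: the model is initialised and updated once over these samples.
model.partial_fit(X[:100], y[:100])
coef_after_first = model.coef_.copy()

# Second batch: the update continues from the current coefficients
# instead of resetting them, unlike a second call to fit().
model.partial_fit(X[100:], y[100:])
coef_after_second = model.coef_.copy()

print(coef_after_first, coef_after_second)
```

Each call to partial_fit() makes a single pass over the batch it receives, so in practice you would typically feed many batches (or loop over the data several times) before the coefficients settle.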