I have a set of fairly complicated models that I am training, and I am looking for a way to save and load the model optimizer states. The "trainer models" consist of different combinations of several other "weight models", some of which have shared weights, some of which have frozen weights depending on the trainer, and so on. The setup is too complicated to share as an example, but in short, I am not able to use `model.save('model_file.h5')` and `keras.models.load_model('model_file.h5')` when stopping and restarting my training.

Using `model.load_weights('weight_file.h5')` works fine for testing my model once training has finished, but if I try to continue training the model this way, the loss does not return anywhere near its previous value. I have read that this is because the optimizer state is not saved by this method, which makes sense. However, I need a way to save and load the optimizer states of my trainer models. It seems that Keras once had `model.optimizer.get_state()` and `model.optimizer.set_state()`, which would accomplish what I am after, but that no longer seems to be the case (at least for the Adam optimizer). Are there any other solutions with the current Keras?
Answer
You can extract the important lines from the `load_model` and `save_model` functions.
For saving optimizer states, in `save_model`:

```python
# Save optimizer weights.
symbolic_weights = getattr(model.optimizer, 'weights')
if symbolic_weights:
    optimizer_weights_group = f.create_group('optimizer_weights')
    weight_values = K.batch_get_value(symbolic_weights)
```
For loading optimizer states, in `load_model`:

```python
# Set optimizer weights.
if 'optimizer_weights' in f:
    # Build train function (to get weight updates).
    if isinstance(model, Sequential):
        model.model._make_train_function()
    else:
        model._make_train_function()

    # ...

    try:
        model.optimizer.set_weights(optimizer_weight_values)
    # ... (the rest of the try block is omitted from this excerpt)
```
Combining the lines above, here’s an example:
- First fit the model for 5 epochs.
```python
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

# Random toy data: 100 samples, 50 features, binary labels.
X, y = np.random.rand(100, 50), np.random.randint(2, size=100)

# A single sigmoid unit trained with Adam.
x = Input((50,))
out = Dense(1, activation='sigmoid')(x)
model = Model(x, out)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(X, y, epochs=5)
```

```
Epoch 1/5
100/100 [==============================] - 0s 4ms/step - loss: 0.7716
Epoch 2/5
100/100 [==============================] - 0s 64us/step - loss: 0.7678
Epoch 3/5
100/100 [==============================] - 0s 82us/step - loss: 0.7665
Epoch 4/5
100/100 [==============================] - 0s 56us/step - loss: 0.7647
Epoch 5/5
100/100 [==============================] - 0s 76us/step - loss: 0.7638
```
- Now save the weights and optimizer states.
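```python
import pickle
from keras import backend as K

# Save the model weights as usual.
model.save_weights('weights.h5')

# Fetch the current values of the optimizer's variables and pickle them.
symbolic_weights = getattr(model.optimizer, 'weights')
weight_values = K.batch_get_value(symbolic_weights)
with open('optimizer.pkl', 'wb') as f:
    pickle.dump(weight_values, f)
```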
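- Rebuild the model in another Python session, and load the weights and the optimizer state.

```python
import pickle
from keras.layers import Input, Dense
from keras.models import Model

# Rebuild and compile exactly the same architecture.
x = Input((50,))
out = Dense(1, activation='sigmoid')(x)
model = Model(x, out)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.load_weights('weights.h5')

# Build the train function first so that the optimizer's variables exist.
model._make_train_function()
with open('optimizer.pkl', 'rb') as f:
    weight_values = pickle.load(f)
model.optimizer.set_weights(weight_values)
```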
- Continue model training.
```python
# X and y are the same training data as in the first session.
model.fit(X, y, epochs=5)
```

```
Epoch 1/5
100/100 [==============================] - 0s 674us/step - loss: 0.7629
Epoch 2/5
100/100 [==============================] - 0s 49us/step - loss: 0.7617
Epoch 3/5
100/100 [==============================] - 0s 49us/step - loss: 0.7611
Epoch 4/5
100/100 [==============================] - 0s 55us/step - loss: 0.7601
Epoch 5/5
100/100 [==============================] - 0s 49us/step - loss: 0.7594
```
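To see why restoring the optimizer state matters, and not only the model weights, here is a minimal pure-Python sketch. It does not use Keras: `ToyAdam`, `train`, and the temporary `optimizer.pkl` path are hypothetical names made up for this illustration, and the "model" is a single scalar parameter minimizing (w - 3)^2. The sketch compares an uninterrupted run, a run resumed with the pickled optimizer state, and a run resumed with a fresh optimizer.

```python
import math
import os
import pickle
import tempfile


class ToyAdam:
    """Minimal scalar Adam; a stand-in for illustration, not the Keras class."""

    def __init__(self, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.t, self.m, self.v = 0, 0.0, 0.0  # step count and moment estimates

    def get_weights(self):
        # The optimizer "state": everything needed to continue exactly.
        return [self.t, self.m, self.v]

    def set_weights(self, weights):
        self.t, self.m, self.v = weights

    def step(self, w, grad):
        # One Adam update with bias-corrected moment estimates.
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad * grad
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return w - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


def train(w, opt, steps):
    """Run `steps` updates on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)."""
    for _ in range(steps):
        w = opt.step(w, 2.0 * (w - 3.0))
    return w


# Reference: 10 uninterrupted steps.
w_full = train(0.0, ToyAdam(), 10)

# Interrupted run: 5 steps, then pickle the optimizer state.
opt = ToyAdam()
w_half = train(0.0, opt, 5)
path = os.path.join(tempfile.mkdtemp(), "optimizer.pkl")
with open(path, "wb") as f:
    pickle.dump(opt.get_weights(), f)

# Resume with the restored optimizer state: same trajectory as the reference.
restored = ToyAdam()
with open(path, "rb") as f:
    restored.set_weights(pickle.load(f))
w_resumed = train(w_half, restored, 5)

# Resume with the parameter only (fresh optimizer): the trajectory drifts.
w_reset = train(w_half, ToyAdam(), 5)

print("uninterrupted:", w_full)
print("resumed with state:", w_resumed)
print("resumed without state:", w_reset)
```

In this toy setting the run resumed with the pickled state lands on the same value as the uninterrupted run, while the run with a fresh optimizer does not. This is the same reason the Keras example above restores `weight_values` with `model.optimizer.set_weights` in addition to calling `model.load_weights`.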