
Keras LSTM – why different results with “same” model & same weights?

(NOTE: Properly fixing the RNG state before each model creation, as described in a comment below, practically fixed my problem: results are now consistent to within 3 decimal places, but they aren’t exactly identical, so there’s still some hidden source of randomness that isn’t fixed by seeding the RNG… probably some lib uses the time in milliseconds or something. If anyone has an idea on that, it would be cool to know, so I will wait and not close the question yet :) )

I create a Keras LSTM model (used to predict some time series data, not important what), and every time I try to re-create an identical model (same model config loaded from JSON, same weights loaded from file, same args to the compile function), I get wildly different results on the same train and test data. WHY?

Code is roughly like this:

# Keras imports (Sequential API)
from keras.models import Sequential, model_from_json
from keras.layers import LSTM, Dense, Activation

# fix random
import random
random.seed(42)

# make model & compile
model = Sequential([
    LSTM(50, input_shape=(None, 1), return_sequences=True),
    LSTM(100, return_sequences=False),
    Dense(1),
    Activation("linear")
])
model.compile(loss="mse", optimizer="rmsprop")

# save it and its initial random weights
model_json = model.to_json()
model.save_weights("model.h5")

# fit and predict
model.fit(x_train, y_train, epochs=3)
r = model.predict(x_test)

# create new "identical" model
model2 = model_from_json(model_json)
model2.load_weights("model.h5")
model2.compile(loss="mse", optimizer="rmsprop")

# fit and predict "identical" model
model2.fit(x_train, y_train, epochs=3)
r2 = model2.predict(x_test)

# ...different results :(

I know that the model starts from random initial weights, so I’m saving them and reloading them. I’m also paranoid enough to assume there are some “hidden” params that I may not know of, so I serialize the model to JSON and reload it instead of recreating an identical one by hand (tried that, same thing btw). And I also fixed the random number generator.

It’s my first time with Keras, and I’m also a beginner to neural networks in general. But this drives me crazy… wtf can vary?!


On fixing random number generators: I run Keras with the TensorFlow backend, and I have these lines of code at the start to try and fix the RNGs for experimental purposes:

import random
random.seed(42)
import numpy
numpy.random.seed(42)
# TF 1.x API; in TF 2 the equivalent is tf.random.set_seed()
from tensorflow import set_random_seed
set_random_seed(42)

…but they still don’t fix the randomness.

And I understand that NNs are inherently stochastic and aren’t really meant to behave non-randomly. But I need to temporarily fix the randomness for experimental purposes (I’m even OK with it being reproducible on one machine only!).
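(For reference, the “fix the RNG state before each model creation” approach from the note at the top looks roughly like this. It is only a sketch: the helper name reseed and the seed value are my own, and it assumes the TF 1.x-style backend used above.)

import random
import numpy as np
from tensorflow import set_random_seed  # TF 1.x; use tf.random.set_seed() in TF 2

def reseed(seed=42):
    # reset Python's, NumPy's and TensorFlow's RNGs so weight init starts from the same state
    random.seed(seed)
    np.random.seed(seed)
    set_random_seed(seed)

# call reseed() immediately before building *each* model
reseed()
model = Sequential([
    LSTM(50, input_shape=(None, 1), return_sequences=True),
    LSTM(100, return_sequences=False),
    Dense(1),
    Activation("linear")
])

reseed()
model2 = model_from_json(model.to_json())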


Answer

Machine learning algorithms are, in general, non-deterministic: every time you run them the outcome can vary. This is largely due to the random initialization of the weights. If you want the results to be reproducible, you have to take the randomness out of the equation. A simple way to do this is to set a random seed.

import numpy as np
import tensorflow as tf

np.random.seed(1234)
tf.random.set_seed(1234)

# rest of your code

If you want to keep the randomness but not such high variance in your output, I would suggest either lowering your learning rate or changing your optimizer (an SGD optimizer with a relatively low learning rate, for example). A cool overview of gradient descent optimization is available here!
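With the tf.keras API that could look like this (a sketch; the learning rate value is just an illustration):

import tensorflow as tf

# SGD with a relatively low learning rate tends to give less run-to-run variance
model.compile(loss="mse", optimizer=tf.keras.optimizers.SGD(learning_rate=0.001))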


A note on TensorFlow’s random generators: besides the global seed (i.e. tf.random.set_seed()), they also keep an internal counter, so if you run

tf.random.set_seed(1234)
print(tf.random.uniform([1]).numpy())
print(tf.random.uniform([1]).numpy())

You’ll get 0.5380393 and 0.3253647, respectively. However, if you re-run that same snippet, you’ll get the same two numbers again.
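To see the counter in action, re-seeding in the middle of a program resets the sequence (a small sketch):

import tensorflow as tf

tf.random.set_seed(1234)
a = tf.random.uniform([1]).numpy()  # first draw after seeding

tf.random.set_seed(1234)            # re-seeding resets the internal counter
b = tf.random.uniform([1]).numpy()  # identical to the first draw

print(a == b)  # [ True]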

A detailed explanation of how random seeds work in TensorFlow can be found here.
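Note also that individual random ops accept their own seed argument (an operation-level seed), which is combined with the global seed to determine the sequence that op produces; a quick sketch:

import tensorflow as tf

tf.random.set_seed(1234)
# the op-level seed and the global seed together determine this op's output
print(tf.random.uniform([1], seed=1).numpy())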


For newer TF versions, take care of this too: TensorFlow 2.2 ships with an OS environment variable, TF_DETERMINISTIC_OPS, which, if set to '1', ensures that only deterministic GPU ops are used.
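Setting it before TensorFlow executes any ops could look like this (a sketch):

import os
os.environ["TF_DETERMINISTIC_OPS"] = "1"  # request deterministic GPU ops (TF 2.2+)

import tensorflow as tf
# ... build, train and evaluate the model as usual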
