(NOTE: Properly fixing the RNG state before each model creation, as described in the comments, practically fixed my problem: results are now consistent to within 3 decimals, but they aren’t exactly identical, so there’s a hidden source of randomness somewhere that isn’t fixed by seeding the RNG… probably some lib uses the time in milliseconds or something. If anyone has an idea on that, it would be cool to know, so I will wait and not close the question yet :) )
I create a Keras LSTM model (used to predict some time series data, not important what), and every time I try to re-create an identical model (same model config loaded from JSON, same weights loaded from file, same args to the compile function), I get wildly different results on the same train and test data. WHY?
Code is roughly like this:
# fix random
import random
random.seed(42)

from keras.models import Sequential, model_from_json
from keras.layers import LSTM, Dense, Activation

# make model & compile
model = Sequential([
    LSTM(50, input_shape=(None, 1), return_sequences=True),
    LSTM(100, return_sequences=False),
    Dense(1),
    Activation("linear")
])
model.compile(loss="mse", optimizer="rmsprop")

# save it and its initial random weights
model_json = model.to_json()
model.save_weights("model.h5")

# fit and predict
model.fit(x_train, y_train, epochs=3)
r = model.predict(x_test)

# create new "identical" model
model2 = model_from_json(model_json)
model2.load_weights("model.h5")
model2.compile(loss="mse", optimizer="rmsprop")

# fit and predict "identical" model
model2.fit(x_train, y_train, epochs=3)
r2 = model2.predict(x_test)

# ...different results :(
I know that the model starts with random initial weights, so I’m saving them and reloading them. I’m also paranoid enough to assume there are some “hidden” params that I may not know of, so I serialize the model to JSON and reload it instead of recreating an identical one by hand (tried that, same thing btw). And I also fixed the random number generator.
It’s my first time with Keras, and I’m also a beginner with neural networks in general. But this drives me crazy… wtf can vary?!
On fixing random number generators: I run Keras with the TensorFlow backend, and I have these lines of code at the start to try and fix the RNGs for experimental purposes:
import random
random.seed(42)

import numpy
numpy.random.seed(42)

from tensorflow import set_random_seed
set_random_seed(42)
…but they still don’t fix the randomness.
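(For reference, the “fixing the RNG state before each model creation” mentioned in the note at the top looks roughly like this for me; a minimal sketch, where reset_seeds is just my own helper name and the placement before each model build is what seems to work, not a documented requirement:)

import random
import numpy
from tensorflow import set_random_seed

def reset_seeds(seed=42):
    # reset all three RNGs so every model is created from the same RNG state
    random.seed(seed)
    numpy.random.seed(seed)
    set_random_seed(seed)

reset_seeds()
# ...build, compile, fit and predict with the first model...

reset_seeds()
# ...build, compile, fit and predict with the "identical" second model...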
And I understand that wanting my model to behave non-randomly goes against the inherently stochastic nature of NNs, but I need to fix this temporarily for experimental purposes (I’m even OK with it being reproducible on one machine only!).
Answer
Machine learning algorithms are in general non-deterministic: every time you run them, the outcome can vary. This has to do with the random initialization of the weights. If you want the results to be reproducible, you have to take the randomness off the table. A simple way to do this is to use a random seed.
import numpy as np
import tensorflow as tf

np.random.seed(1234)
tf.random.set_seed(1234)

# rest of your code
If you want to keep the randomness but with less variance in your output, I would suggest either lowering your learning rate or changing your optimizer (I would suggest an SGD optimizer with a relatively low learning rate). A cool overview of gradient descent optimization is available here!
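For example, a minimal sketch of that suggestion using tf.keras (the toy model and the learning rate value are just illustrative assumptions; in older standalone Keras the argument is lr rather than learning_rate):

import tensorflow as tf

# toy model just to show the compile step
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])

# a relatively low learning rate keeps the updates small, so runs drift apart less
opt = tf.keras.optimizers.SGD(learning_rate=0.001)
model.compile(loss="mse", optimizer=opt)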
A note on TensorFlow’s random generators: besides the global seed (i.e. tf.random.set_seed()), they also use an internal counter, so if you run
tf.random.set_seed(1234)
print(tf.random.uniform([1]).numpy())
print(tf.random.uniform([1]).numpy())
you’ll get 0.5380393 and 0.3253647, respectively. However, if you re-run that same snippet, you’ll get the same two numbers again.
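The counter is also reset whenever tf.random.set_seed() is called again, so you can restart the sequence inside a single program (a small sketch, assuming TF 2.x eager execution as above):

import tensorflow as tf

tf.random.set_seed(1234)
a = tf.random.uniform([1]).numpy()  # first draw of the seeded sequence

tf.random.set_seed(1234)            # re-seeding resets the internal counter...
b = tf.random.uniform([1]).numpy()  # ...so this repeats the first draw

print(a == b)  # expected: [ True]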
A detailed explanation of how random seeds work in TensorFlow can be found here.
For newer TF versions, take care of this too: TensorFlow 2.2 ships with an environment variable, TF_DETERMINISTIC_OPS, which, if set to '1', ensures that only deterministic GPU ops are used.
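A minimal sketch of using it (assuming TF 2.2+ and a GPU; setting the variable before TensorFlow is imported is just my own precaution, not a documented requirement):

import os
os.environ["TF_DETERMINISTIC_OPS"] = "1"  # request only deterministic GPU ops

import tensorflow as tf
tf.random.set_seed(1234)
# ...build, compile and train the model as usual...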