I wanted to build my own neural network for the MNIST data set using TensorFlow. I imported the libraries and the dataset, did one-hot encoding of the labels, set up the weights and biases, and implemented forward propagation with the random initial values. For backpropagation and cost minimization I defined a loss function, but I am unable to use the optimizer and I don't know why.
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.datasets.mnist import load_data

(x_train, y_train), (x_test, y_test) = load_data()

y_train_onehot = tf.one_hot(y_train, 10)
y_test_onehot = tf.one_hot(y_test, 10)

n_input = 784
n_hidden_1 = 256
n_hidden_2 = 256
n_classes = 10

weights = {
    "h1" : tf.Variable(tf.random.normal([n_input, n_hidden_1]), trainable=True),     # here we require 784x256 random values
    "h2" : tf.Variable(tf.random.normal([n_hidden_1, n_hidden_2]), trainable=True),  # here we require 256x256 random values
    'out': tf.Variable(tf.random.normal([n_hidden_2, n_classes]), trainable=True)    # here we require 256x10 random values
}

# similarly for biases
biases = {
    "h1" : tf.Variable(tf.random.normal([n_hidden_1]), trainable=True),
    "h2" : tf.Variable(tf.random.normal([n_hidden_2]), trainable=True),
    'out': tf.Variable(tf.random.normal([n_classes]), trainable=True)
}

def ForwardPropagation(x, weights, biases):
    # giving net input to hidden layer1
    in_layer1 = tf.add(tf.matmul(x, weights['h1']), biases['h1'])
    # giving net output of hidden layer1
    out_layer1 = tf.nn.relu(in_layer1)
    in_layer2 = tf.add(tf.matmul(out_layer1, weights['h2']), biases['h2'])
    out_layer2 = tf.nn.relu(in_layer2)
    output = tf.add(tf.matmul(out_layer2, weights['out']), biases['out'])
    return output

x_train_modified = tf.reshape(x_train, shape=[60000, 784])
x_train_modified = tf.cast(x_train_modified, dtype=tf.float32)
x_train_modified

pred = ForwardPropagation(x_train_modified, weights, biases)
predictions = tf.argmax(pred, 1)

y = y_train_onehot
actualvalues = tf.argmax(y, 1)
actualvalues

loss = lambda: tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
optimize = optimizer.minimize(loss, var_list=[weights['h1'], weights['h2'], weights['out'],
                                              biases['h1'], biases['h2'], biases['out']])
After running this, it gives an error like:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [30], in <cell line: 2>()
      1 optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
----> 2 optimize = optimizer.minimize(loss,var_list=[weights['h1'],weights['h2'],weights['out'],biases['h1'],biases['h2'],biases['out']])

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\optimizers\optimizer_v2\optimizer_v2.py:539, in OptimizerV2.minimize(self, loss, var_list, grad_loss, name, tape)
    507 """Minimize `loss` by updating `var_list`.
    508
    509 This method simply computes gradient using `tf.GradientTape` and calls
   (...)
    535
    536 """
    537 grads_and_vars = self._compute_gradients(
    538     loss, var_list=var_list, grad_loss=grad_loss, tape=tape)
--> 539 return self.apply_gradients(grads_and_vars, name=name)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\optimizers\optimizer_v2\optimizer_v2.py:640, in OptimizerV2.apply_gradients(self, grads_and_vars, name, experimental_aggregate_gradients)
    599 def apply_gradients(self,
    600                     grads_and_vars,
    601                     name=None,
    602                     experimental_aggregate_gradients=True):
    603   """Apply gradients to variables.
    604
    605   This is the second part of `minimize()`. It returns an `Operation` that
   (...)
    638   RuntimeError: If called in a cross-replica context.
    639   """
--> 640 grads_and_vars = optimizer_utils.filter_empty_gradients(grads_and_vars)
    641 var_list = [v for (_, v) in grads_and_vars]
    643 with tf.name_scope(self._name):
    644   # Create iteration if necessary.

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\optimizers\optimizer_v2\utils.py:73, in filter_empty_gradients(grads_and_vars)
     71 if not filtered:
     72   variable = ([v.name for _, v in grads_and_vars],)
---> 73   raise ValueError(f"No gradients provided for any variable: {variable}. "
     74                    f"Provided `grads_and_vars` is {grads_and_vars}.")
     75 if vars_with_empty_grads:
     76   logging.warning(
     77       ("Gradients do not exist for variables %s when minimizing the loss. "
     78        "If you're using `model.compile()`, did you forget to provide a `loss`"
     79        "argument?"),
     80       ([v.name for v in vars_with_empty_grads]))

ValueError: No gradients provided for any variable: (['Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0'],). Provided `grads_and_vars` is ((None, <tf.Variable 'Variable:0' shape=(784, 256) dtype=float32, numpy=
array([[-0.2042245 ,  1.0065362 ,  0.79206324, ..., -0.889144  , -0.6301114 , -0.9690762 ],
       [ 0.19671327,  0.89678144, -0.0324002 , ...,  2.0764246 , -0.02647609, -0.8672084 ],
       [ 0.07031541,  0.4843207 ,  0.08144156, ..., -0.3354637 ,  1.2510766 , -0.6774577 ],
       ...,
       [-0.63975966,  0.19986708, -0.9221592 , ...,  1.3540815 , -0.9916273 ,  0.7312357 ],
       [ 0.41351947,  0.43665868,  0.1957417 , ..., -1.7249284 ,  0.14709948,  1.1250186 ],
       [ 0.48159713,  0.04631708, -0.1402344 , ..., -0.370907  ,  0.19523837,  0.8853921 ]], dtype=float32)>), (None, <tf.Variable 'Variable:0' shape=(256, 256) dtype=float32, numpy=
array([[-0.32486504,  0.8220085 , -0.23714782, ..., -0.62537277,  1.0147599 ,  0.5973364 ],
       [ 1.308589  , -0.27249494,  0.65963596, ..., -0.4579711 , ...
So running this just reports 'Variable:0' with no gradient for every variable; it does not minimize the cost, and the weights and biases keep their randomly assigned initial values.
How can I fix it?
Answer
You are not recording gradients, because you compute the prediction "beforehand". Instead, you want to let the optimizer "record" the operations; in other words, compute the prediction inside the "loss" lambda that you pass to the optimizer:
loss = lambda: tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
        logits=ForwardPropagation(x_train_modified, weights, biases),
        labels=y_train_onehot))
Also, consider that your code refers to a y variable that you never defined; you probably meant y_train_onehot (which I've used in the snippet above).
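To make the fix concrete, here is a minimal sketch of how the full optimization step could then look, reusing the names from your question (the number of iterations and the print interval are arbitrary choices for illustration):

var_list = [weights['h1'], weights['h2'], weights['out'],
            biases['h1'], biases['h2'], biases['out']]

# the forward pass now happens inside the closure, so it is re-traced
# (and its gradients recorded) every time the optimizer calls it
loss = lambda: tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
        logits=ForwardPropagation(x_train_modified, weights, biases),
        labels=y_train_onehot))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)

for step in range(100):                      # step count chosen arbitrarily
    optimizer.minimize(loss, var_list=var_list)
    if step % 10 == 0:
        print(step, float(loss()))           # the loss should now decrease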
The reason why this happens is explained in the minimize doc: https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Optimizer#minimize
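As the traceback's docstring hints, minimize simply computes the gradients with a tf.GradientTape and then applies them. An equivalent explicit version (again just a sketch using the same names as above) makes it clear why the forward pass has to run "inside" the recorded computation:

optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
var_list = [weights['h1'], weights['h2'], weights['out'],
            biases['h1'], biases['h2'], biases['out']]

with tf.GradientTape() as tape:
    # the forward pass runs inside the tape's context, so every op
    # touching the tf.Variables is recorded
    logits = ForwardPropagation(x_train_modified, weights, biases)
    loss_value = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                labels=y_train_onehot))

grads = tape.gradient(loss_value, var_list)   # no more None gradients
optimizer.apply_gradients(zip(grads, var_list))

If you compute pred outside the tape (or outside the loss lambda), the tape has nothing recorded to differentiate, which is exactly why every gradient came back as None in your error.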