I have a model with several dense layers that behaves normally in all aspects.
Then, I add weights to the training events (their values are between 0 and 1):
w = mydata.Weight
#...
kfold = GroupKFold(n_splits=num_folds)
for train, test in kfold.split(X, y, groups=groups):
    X_train, X_test = X.iloc[train], X.iloc[test]
    y_train, y_test = y.iloc[train], y.iloc[test]
    w_train = w.iloc[train]
    #...
    le_fit = model.fit(X_train, y_train, batch_size=200, epochs=10, sample_weight=w_train, verbose=0)
    #...
    predictions = np.rint(model.predict(X_test))
and the prediction becomes useless:
InvalidArgumentError: `predictions` contains negative values Condition x >= 0 did not hold element-wise: x (confusion_matrix_1/Cast:0) = [-9223372036854775808 .......
Just to be safe, I added constraints to the layers, e.g.:
layers.Dense(units=800, activation='relu', kernel_constraint=constraints.MinMaxNorm(min_value=0.0, max_value=1.0))
but nothing changed.
Can you suggest what is going wrong?
Edit: I have now realized that the training loss is also NaN.
Edit: I made all weights equal to one. The results don’t change.
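The NaN loss and the huge negative number in the error message are probably the same symptom: -9223372036854775808 is the smallest 64-bit integer, which is what typically shows up when a NaN is cast to an integer (the exact result is platform dependent). A minimal sketch of a check that could run before rounding; `raw_predictions` is a hypothetical stand-in for the output of `model.predict(X_test)`, and `count_nan` is a helper introduced only for this illustration:

```python
import numpy as np

# Hypothetical stand-in for model.predict(X_test) after training produced a NaN loss.
raw_predictions = np.array([[0.2], [np.nan], [0.9]])

# The value quoted in the error message is the smallest 64-bit integer.
INT64_MIN = np.iinfo(np.int64).min


def count_nan(values):
    """Return how many entries of `values` are NaN."""
    return int(np.isnan(np.asarray(values, dtype=float)).sum())


nan_count = count_nan(raw_predictions)
if nan_count:
    # Rounding NaN still gives NaN, so a later integer cast cannot be meaningful.
    print(f"{nan_count} prediction(s) are NaN; fix the training before rounding")
```

If this check fires, the confusion matrix error is only a downstream effect, and the thing to investigate is what `model.fit` received.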
Edit: I don’t know why this question was closed as asking for debugging. The answer makes it obvious that it wasn’t about debugging. It is about the correct usage of two very commonly used items (Keras with GroupKFold), which turns out to include a counter-intuitive element, and it is not problem-specific.
Answer
The problem was that sample_weight takes np.array as input, but w_train was an ndarray.
It was solved by explicitly creating an array:
w_train_tmp = w.iloc[train]
w_train = np.array(w_train_tmp)
Note: I know that np.array and ndarray are technically the same thing. If someone can clarify why they weren’t in this case, you are most welcome.
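A possible explanation, assuming `mydata` is a pandas DataFrame (the use of `.iloc` suggests so): `w.iloc[train]` is not a NumPy array at all but a pandas Series, which still carries the original row labels. Wrapping it in `np.array(...)` drops those labels and leaves a plain positional array. Whether a Series is handled correctly as `sample_weight` may depend on the Keras version; that part is not verified here. The sketch below only shows the type difference, using made-up data and hypothetical fold indices:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for mydata; only the Weight column matters here.
mydata = pd.DataFrame({"Weight": [0.1, 0.5, 0.9, 0.3, 0.7]})
w = mydata.Weight

# Hypothetical positional indices, shaped like what GroupKFold.split yields.
train = np.array([0, 2, 4])

# Selecting with .iloc returns a pandas Series that keeps the original row labels.
w_train_series = w.iloc[train]
print(type(w_train_series).__name__)    # Series
print(w_train_series.index.tolist())    # original labels, not 0..n-1

# Converting gives a plain NumPy array with no index attached.
w_train = w_train_series.to_numpy()     # same effect as np.array(w_train_series)
print(type(w_train).__name__)           # ndarray
```

Passing `w_train` from the last step to `model.fit(..., sample_weight=w_train)` corresponds to the fix described above.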