
Incompatible weight shape between caffe and tensorflow / keras

I am trying to convert a Caffe model to Keras. Both MMdnn and caffe-tensorflow ran successfully, and the output is a set of .npy and .pb files. I have not had much luck with the .pb files, so I stuck with the .npy files, which contain the weights and biases. I have reconstructed an mAlexNet network as follows:

JavaScript

Then I try to load the weights using this code snippet:

JavaScript

During this process I get an error:

ValueError: Layer conv1 weight shape (16,) is not compatible with provided weight shape (1, 1, 1, 16).

As I understand it, this is because of the different backends and how they initialize weights, but I have not found a way to solve the problem. My question is: how do I adjust the weights loaded from the file so that they fit my Keras model? Link to the weights.npy file: https://drive.google.com/file/d/1QKzY-WxiUnf9VnlhWQS38DE3uF5I_qTl/view?usp=sharing


Answer

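The problem is the bias vector. In the converted file it is stored as a 4D tensor of shape (1, 1, 1, 16), but Keras expects a 1D tensor of shape (16,), which is exactly what the error message reports. Just flatten the bias vector: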

JavaScript
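A minimal sketch of the flattening step, using NumPy only. The layout of the loaded file (a dict mapping each layer name to a dict with "weights" and "bias" entries), the helper name flatten_biases, and the dummy kernel shape are all assumptions made for illustration; adapt them to whatever your weights.npy actually contains.

```python
import numpy as np

def flatten_biases(layers):
    """Return a copy of `layers` in which every bias is reshaped to 1D.

    `layers` is assumed (hypothetically) to map layer names to dicts
    with a "weights" array and an optional "bias" array.
    """
    fixed = {}
    for name, params in layers.items():
        fixed[name] = dict(params)  # shallow copy, the input dict is not modified
        if "bias" in params:
            # (1, 1, 1, 16) -> (16,), the 1D shape reported in the error message
            fixed[name]["bias"] = np.asarray(params["bias"]).reshape(-1)
    return fixed

# Dummy stand-in for the converted weights; the kernel shape is a placeholder.
converted = {
    "conv1": {
        "weights": np.zeros((11, 11, 3, 16), dtype=np.float32),
        "bias": np.arange(16, dtype=np.float32).reshape(1, 1, 1, 16),
    }
}

fixed = flatten_biases(converted)
conv1_kernel = fixed["conv1"]["weights"]
conv1_bias = fixed["conv1"]["bias"]

# With a real Keras model, the pair could then be passed to the layer, e.g.
# model.get_layer("conv1").set_weights([conv1_kernel, conv1_bias])
```

The reshape changes only the shape, not the values, so the flattened bias should behave the same as the 4D one once it is loaded.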

As a sanity check, once I have created your model, I access the conv1 weights of the model and the corresponding weights you cached, then compare the two:

JavaScript

The same for the biases:

JavaScript

Notice that I didn’t have to flatten the biases from your cached results. np.allclose broadcasts its two arguments against each other, so an array of shape (1, 1, 1, 16) is compared element by element with an array of shape (16,).
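The comparison behaviour can be reproduced with NumPy alone. In this sketch the arrays are dummy data and the variable names are hypothetical; it only shows that np.allclose accepts a 4D and a 1D bias because the two shapes broadcast together.

```python
import numpy as np

# Dummy data standing in for the cached (4D) bias and the bias held by the layer (1D).
cached_bias = np.linspace(-1.0, 1.0, 16).reshape(1, 1, 1, 16)
layer_bias = cached_bias.reshape(-1)

# The shapes differ, yet the element-wise comparison still succeeds.
shapes_differ = cached_bias.shape != layer_bias.shape
values_match = np.allclose(layer_bias, cached_bias)

# Shape that both arrays are broadcast to for the comparison.
broadcast_shape = np.broadcast(layer_bias, cached_bias).shape

print(shapes_differ, values_match, broadcast_shape)
```

Because of broadcasting, np.allclose on its own will not catch a shape mismatch. If the shape matters, as it does when calling set_weights, compare the shapes explicitly as well.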

User contributions licensed under: CC BY-SA