I have a model that uses transfer learning with MobileNetV2, and I'd like to quantize it and compare its accuracy against a non-quantized transfer-learning model. However, the TensorFlow Model Optimization toolkit does not fully support quantizing nested models, though according to this comment, this method should quantize my model: https://github.com/tensorflow/model-optimization/issues/377#issuecomment-820948555
What I tried doing was:
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Freeze everything except the last layer of the pretrained backbone
pretrained_model = tf.keras.applications.MobileNetV2(include_top=False)
pretrained_model.trainable = True
for layer in pretrained_model.layers[:-1]:
    layer.trainable = False

quantize_model_pretrained = tfmot.quantization.keras.quantize_model
q_pretrained_model = quantize_model_pretrained(pretrained_model)

original_inputs = tf.keras.layers.Input(shape=(224, 224, 3))
y = tf.keras.layers.experimental.preprocessing.Rescaling(1./255)(original_inputs)
y = pretrained_model(y)
y = tf.keras.layers.GlobalAveragePooling2D()(y)
original_outputs = tf.keras.layers.Dense(5, activation="softmax")(y)
model_1 = tf.keras.Model(original_inputs, original_outputs)

quantize_model = tfmot.quantization.keras.quantize_model
q_aware_model = quantize_model(model_1)
It is still giving me the following error:
ValueError: Quantizing a tf.keras Model inside another tf.keras Model is not supported.
What is the correct way to perform quantization-aware training in this case?
Answer
According to the issue you mentioned, you should quantize each model separately and then combine them afterwards. Something like this:
import tensorflow as tf
import tensorflow_model_optimization as tfmot

pretrained_model = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), include_top=False)
pretrained_model.trainable = True
for layer in pretrained_model.layers[:-1]:
    layer.trainable = False

# Quantize the pretrained backbone and the classification head separately
q_pretrained_model = tfmot.quantization.keras.quantize_model(pretrained_model)
q_base_model = tfmot.quantization.keras.quantize_model(
    tf.keras.Sequential([
        tf.keras.layers.GlobalAveragePooling2D(input_shape=(7, 7, 1280)),
        tf.keras.layers.Dense(5, activation="softmax"),
    ])
)

# Then connect the two quantized models in a single functional graph
original_inputs = tf.keras.layers.Input(shape=(224, 224, 3))
y = tf.keras.layers.experimental.preprocessing.Rescaling(1./255)(original_inputs)
y = q_pretrained_model(y)
original_outputs = q_base_model(y)
model = tf.keras.Model(original_inputs, original_outputs)
Quantizing a nested model does not appear to be supported out of the box yet, even though the linked issue suggests otherwise.