I have defined the model as in the code below and used batch-normalization merging to fold the three layers into a single linear layer.
- The first layer of the model is a linear layer with no bias.
- The second layer of the model is a batch normalization layer with no weight or bias (affine is False).
- The third layer of the model is a linear layer.
The variables named new_weight and new_bias are the weight and bias of the newly created linear layer, respectively.
My question is: why do the two print statements at the end produce different outputs, and what is wrong in the code below the # batch merge comment?
import torch
import torch.nn as nn
import torch.optim as optim

learning_rate = 0.01
in_nodes = 20
internal_nodes = 8
out_nodes = 9
batch_size = 100

# model define
class M(nn.Module):
    def __init__(self):
        super(M, self).__init__()
        self.layer1 = nn.Linear(in_nodes, internal_nodes, bias=False)
        self.layer2 = nn.BatchNorm1d(internal_nodes, affine=False)
        self.layer3 = nn.Linear(internal_nodes, out_nodes)

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        return x

# optimizer and criterion
model = M()
optimizer = optim.SGD(model.parameters(), lr=learning_rate)
criterion = nn.MSELoss()

# training
for batch_num in range(1000):
    model.train()
    optimizer.zero_grad()

    input = torch.randn(batch_size, in_nodes)
    target = torch.ones(batch_size, out_nodes)

    output = model(input)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()

# batch merge
divider = torch.sqrt(model.layer2.eps + model.layer2.running_var)

w_bn = torch.diag(torch.ones(internal_nodes) / divider)
new_weight = torch.mm(w_bn, model.layer1.weight)
new_weight = torch.mm(model.layer3.weight, new_weight)

b_bn = - model.layer2.running_mean / divider
new_bias = model.layer3.bias + torch.squeeze(torch.mm(model.layer3.weight, b_bn.reshape(-1, 1)))

input = torch.randn(batch_size, in_nodes)
print(model(input))
print(torch.t(torch.mm(new_weight, torch.t(input))) + new_bias)
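For reference, here is the algebra the # batch merge block is implementing (assuming the batch norm runs in evaluation mode, i.e. uses its running statistics), with W1 = layer1.weight, W3 = layer3.weight, b3 = layer3.bias and divider = sqrt(running_var + eps):

model(x) = W3 · ((W1 · x − running_mean) / divider) + b3
         = (W3 · diag(1 / divider) · W1) · x + (b3 − W3 · (running_mean / divider))

so new_weight = W3 · diag(1 / divider) · W1 and new_bias = b3 − W3 · (running_mean / divider), which is what the code above computes.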
Answer
Short Answer: As far as I can tell you need a model.eval() before the line input = torch.randn(batch_size, in_nodes), so that the end of the script looks like this:
...
model.eval()

input = torch.randn(batch_size, in_nodes)
test_input = torch.ones(batch_size, internal_nodes) / 100  # leftover test tensor, not used below
print(model(input))
print(torch.t(torch.mm(new_weight, torch.t(input))) + new_bias)
With that (I tested it) the two print statements output the same values: calling model.eval() freezes the batch norm's running statistics, so they match the ones baked into new_weight and new_bias.
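As a quick sanity check (a sketch reusing the variables defined above; the tolerance is an arbitrary choice), the two outputs should then agree up to floating-point error:

model.eval()

input = torch.randn(batch_size, in_nodes)
merged = torch.t(torch.mm(new_weight, torch.t(input))) + new_bias
print(torch.allclose(model(input), merged, atol=1e-5))  # True once eval() is set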
Long Answer:
When using batch normalization, the PyTorch documentation states that a default momentum of 0.1 is used to compute the running_mean and running_var. The momentum defines how much weight the existing estimate and how much the newly observed batch statistics get in each update.
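Concretely, the update for each running statistic follows PyTorch's convention, in which momentum weights the new observation:

running_mean ← (1 − momentum) · running_mean + momentum · batch_mean
running_var  ← (1 − momentum) · running_var  + momentum · batch_var

With momentum = 0.1, every forward pass in training mode pulls the running estimates 10% of the way toward the current batch's statistics.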
Now, when you don't call model.eval(), the batch normalization layer computes an updated running_mean and running_var (due to the momentum) during the forward pass in the line print(model(input)), so the statistics the model uses there are not the ones that were folded into new_weight and new_bias.
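A minimal way to see this (a sketch using the model from the question; the exact values will vary from run to run):

# In training mode a forward pass updates the running statistics ...
model.train()
before = model.layer2.running_mean.clone()
_ = model(torch.randn(batch_size, in_nodes))
print(torch.allclose(before, model.layer2.running_mean))  # False: the stats moved

# ... while in eval mode the forward pass only uses them.
model.eval()
before = model.layer2.running_mean.clone()
_ = model(torch.randn(batch_size, in_nodes))
print(torch.allclose(before, model.layer2.running_mean))  # True: the stats are unchanged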
For further details and/or confirmation: Related Question, PyTorch documentation.