I made a toy CNN model:
```python
import torch
import torch.nn as nn

class Test(nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 300, 3),
            nn.Conv2d(300, 500, 3),
            nn.Conv2d(500, 1000, 3),
        )
        self.fc = nn.Linear(3364000, 1)

    def forward(self, x):
        out = self.conv(x)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out
```
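For reference on where the 3364000 comes from: each 3×3 convolution without padding shrinks the spatial size by 2, so a 64×64 input becomes 58×58 after the three convs, and 1000 × 58 × 58 = 3,364,000. A quick sanity check (a minimal sketch, assuming a 64×64 input as in the summary call below):

```python
import torch
import torch.nn as nn

# each 3x3 conv without padding shrinks the spatial size by 2: 64 -> 62 -> 60 -> 58
conv = nn.Sequential(nn.Conv2d(3, 300, 3), nn.Conv2d(300, 500, 3), nn.Conv2d(500, 1000, 3))
out = conv(torch.randn(1, 3, 64, 64))
print(out.shape)       # torch.Size([1, 1000, 58, 58])
print(out[0].numel())  # 3364000 -> the in_features of the Linear layer
```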
Then I checked the model summary via this code:
```python
from torchsummary import summary as summary_  # assuming summary_ is torchsummary's summary

model = Test()
model.to('cuda')

for param in model.parameters():
    print(param.dtype)
    break

summary_(model, (3, 64, 64))
```
And I got the following results:
```
torch.float32
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [-1, 300, 62, 62]           8,400
            Conv2d-2          [-1, 500, 60, 60]       1,350,500
            Conv2d-3         [-1, 1000, 58, 58]       4,501,000
            Linear-4                    [-1, 1]       3,364,001
================================================================
Total params: 9,223,901
Trainable params: 9,223,901
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.05
Forward/backward pass size (MB): 48.20
Params size (MB): 35.19
Estimated Total Size (MB): 83.43
----------------------------------------------------------------
```
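Note that torchsummary computes Params size as the parameter count times 4 bytes, i.e. it assumes float32 regardless of the actual dtype. A quick check of the arithmetic (a minimal sketch using the Test model above):

```python
model = Test()
n_params = sum(p.numel() for p in model.parameters())
print(n_params)                   # 9223901
print(n_params * 4 / 1024 ** 2)   # ~35.19 -> matches "Params size (MB)"
```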
I want to reduce the model size because I want to increase the batch size.
So I changed torch.float32 to torch.float16 via NVIDIA/apex:
```python
import torch.optim as optim
from apex import amp

model = Test()
model.to('cuda')

opt_level = 'O3'
optimizer = optim.Adam(model.parameters(), lr=0.001)
model, optimizer = amp.initialize(model, optimizer, opt_level=opt_level)

for param in model.parameters():
    print(param.dtype)
    break

summary_(model, (3, 64, 64))
```
```
Selected optimization level O3:  Pure FP16 training.
Defaults for this optimization level are:
enabled                : True
opt_level              : O3
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : False
master_weights         : False
loss_scale             : 1.0
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O3
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : False
master_weights         : False
loss_scale             : 1.0
torch.float16
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [-1, 300, 62, 62]           8,400
            Conv2d-2          [-1, 500, 60, 60]       1,350,500
            Conv2d-3         [-1, 1000, 58, 58]       4,501,000
            Linear-4                    [-1, 1]       3,364,001
================================================================
Total params: 9,223,901
Trainable params: 9,223,901
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.05
Forward/backward pass size (MB): 48.20
Params size (MB): 35.19
Estimated Total Size (MB): 83.43
----------------------------------------------------------------
```
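Measuring the parameter memory directly from the tensors shows that float16 weights really do take half the space; torchsummary just keeps multiplying by 4 bytes. A minimal sketch, assuming the Test model from above:

```python
def param_mb(m):
    # element_size() is 4 for float32 and 2 for float16
    return sum(p.numel() * p.element_size() for p in m.parameters()) / 1024 ** 2

model = Test()
print(param_mb(model))          # ~35.19 MB in float32
print(param_mb(model.half()))   # ~17.59 MB after casting to float16
```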
As a result, the parameter dtype was changed from torch.float32 to torch.float16, but Params size (MB): 35.19 was not changed.
Why does this happen? Please explain. Thanks.
Answer
Mixed precision does not mean that your model becomes half its original size. By default, the parameters remain in float32 and are cast to float16 automatically during certain operations of neural-network training. This applies to the input data as well.
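You can see this with a minimal sketch of torch.cuda.amp's autocast (the output dtype depends on the op, but convolutions run in float16 under autocast while the weights stay float32):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, 3).to('cuda')
x = torch.randn(1, 3, 64, 64, device='cuda')

with torch.cuda.amp.autocast():
    out = conv(x)  # the convolution is executed in half precision

print(next(conv.parameters()).dtype)  # torch.float32 -> the weights are untouched
print(out.dtype)                      # torch.float16 -> only the op ran in float16
```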
torch.cuda.amp provides the functionality to perform this automatic conversion from float32 to float16 during certain operations of training, such as convolutions. Your model size will remain the same. Reducing model size is called quantization, and it is different from mixed-precision training.
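To illustrate the difference, here is a minimal sketch of PyTorch's dynamic quantization on a single Linear layer (the saved sizes below are approximate; quantize_dynamic stores the weights as int8):

```python
import io
import torch
import torch.nn as nn

model = nn.Linear(3364000, 1)  # the fc layer holds most of the parameters anyway

# dynamic quantization: int8 weights, activations quantized on the fly
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def saved_mb(m):
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1024 ** 2

print(saved_mb(model))       # ~12.8 MB with float32 weights
print(saved_mb(quantized))   # ~3.2 MB with int8 weights
```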
You can read more about mixed-precision training at NVIDIA's blog and PyTorch's blog.