I am trying to construct a Convolutional Neural Network using PyTorch and cannot understand how to determine the number of input neurons for the first densely connected layer. Say, for example, I have the following architecture:
self.conv_layer = nn.Sequential(
    nn.Conv2d(3, 32, 5),
    nn.Conv2d(32, 64, 5),
    nn.MaxPool2d(2, 2),
    nn.Conv2d(64, 128, 5),
    nn.Conv2d(128, 128, 5),
    nn.MaxPool2d(2, 2))
self.fc_layer = nn.Sequential(
    nn.Linear(X, 512),
    nn.Linear(512, 128),
    nn.Linear(128, 10))
Here X would be the number of input neurons to the first linear layer. So, do I need to keep track of the shape of the output tensor at each layer so that I can figure out X?
Now, I can put the values into the formula (W - F + 2P) / S + 1 and calculate the shape after each layer, which would be somewhat convenient.
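To make the manual calculation concrete, here is a minimal sketch that applies the formula layer by layer, with W the input width, F the kernel size, P the padding and S the stride. The 32x32 input size is an assumption (the question does not state one), and the helper name `out_size` is hypothetical. The layers above use the default padding of 0, and the convolutions use the default stride of 1.

```python
def out_size(w, f, p=0, s=1):
    """Spatial size after a conv or pooling layer: floor((W - F + 2P) / S) + 1."""
    return (w - f + 2 * p) // s + 1

# Assumed input: 3-channel 32x32 images (not stated in the question).
size = 32
size = out_size(size, 5)        # Conv2d(3, 32, 5)
size = out_size(size, 5)        # Conv2d(32, 64, 5)
size = out_size(size, 2, s=2)   # MaxPool2d(2, 2)
size = out_size(size, 5)        # Conv2d(64, 128, 5)
size = out_size(size, 5)        # Conv2d(128, 128, 5)
size = out_size(size, 2, s=2)   # MaxPool2d(2, 2)

out_channels = 128              # channels produced by the last conv layer
X = out_channels * size * size  # flattened features fed to the first Linear
print(size, X)                  # prints: 2 512
```

With a different input resolution, X changes, which is why the value has to be recomputed whenever the input size or the conv stack is modified.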
Isn’t there something even more convenient which might do this automatically?
Answer
An easy solution would be to use a LazyLinear layer: https://pytorch.org/docs/stable/generated/torch.nn.LazyLinear.html.
According to the documentation:
A torch.nn.Linear module where in_features is inferred … They will be initialized after the first call to forward is done and the module will become a regular torch.nn.Linear module. The in_features argument of the Linear is inferred from the input.shape[-1].
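As a rough sketch of how this could look for the architecture in the question: because in_features is taken from input.shape[-1], the conv output has to be flattened before it reaches the lazy layer. The class name `Net`, the added `nn.Flatten()`, and the 32x32 dummy input are assumptions made for illustration, not part of the original code.

```python
import torch
import torch.nn as nn

class Net(nn.Module):  # hypothetical wrapper around the layers from the question
    def __init__(self):
        super().__init__()
        self.conv_layer = nn.Sequential(
            nn.Conv2d(3, 32, 5),
            nn.Conv2d(32, 64, 5),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 5),
            nn.Conv2d(128, 128, 5),
            nn.MaxPool2d(2, 2),
        )
        self.fc_layer = nn.Sequential(
            nn.Flatten(),        # (N, C, H, W) -> (N, C*H*W), so shape[-1] is X
            nn.LazyLinear(512),  # in_features is inferred on the first forward
            nn.Linear(512, 128),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.fc_layer(self.conv_layer(x))

model = Net()
dummy = torch.zeros(1, 3, 32, 32)  # assumed input size, one dummy batch
out = model(dummy)                 # the first call materializes the lazy layer
in_features = model.fc_layer[1].in_features
print(in_features, tuple(out.shape))
```

Running one dummy batch through the model before creating the optimizer or loading weights is a sensible precaution, since the lazy layer's parameters do not exist until that first forward call.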