I am trying to build a convolutional neural network in PyTorch and cannot work out how to determine the number of input neurons for the first densely connected layer. Say, for example, I have the following architecture:
    self.conv_layer = nn.Sequential(
        nn.Conv2d(3, 32, 5),
        nn.Conv2d(32, 64, 5),
        nn.MaxPool2d(2, 2),
        nn.Conv2d(64, 128, 5),
        nn.Conv2d(128, 128, 5),
        nn.MaxPool2d(2, 2))
    self.fc_layer = nn.Sequential(
        nn.Linear(X, 512),
        nn.Linear(512, 128),
        nn.Linear(128, 10))
Here X would be the number of input neurons of the first linear layer. So, do I need to keep track of the shape of the output tensor at each layer so that I can figure out X?
Now, I can plug the values into the formula (W - F + 2P) / S + 1 and calculate the shape after each layer, which would be somewhat convenient.
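For reference, the manual calculation with that formula might look like the sketch below. The 32x32 input size is an assumption (the question does not state one), and conv_output_size is a hypothetical helper name. The division is floored, which is what PyTorch does by default for convolution and pooling layers.

```python
def conv_output_size(w, f, p=0, s=1):
    """Spatial size after a conv or pool layer: floor((W - F + 2P) / S) + 1."""
    return (w - f + 2 * p) // s + 1

# (kernel_size, stride) for each layer in conv_layer; padding is 0 everywhere
layers = [(5, 1), (5, 1), (2, 2), (5, 1), (5, 1), (2, 2)]

size = 32  # ASSUMED input height/width; not given in the question
for kernel, stride in layers:
    size = conv_output_size(size, kernel, p=0, s=stride)

out_channels = 128  # channels produced by the last Conv2d
X = out_channels * size * size  # features per sample after flattening
print(size, X)  # prints: 2 512
```

Under that assumption the spatial size goes 32, 28, 24, 12, 8, 4, 2, so X would be 128 * 2 * 2 = 512. A different input size gives a different X.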
Isn’t there something even more convenient which might do this automatically?
Answer
An easy solution would be to use a LazyLinear layer: https://pytorch.org/docs/stable/generated/torch.nn.LazyLinear.html.
According to the documentation:
A torch.nn.Linear module where in_features is inferred … They will be initialized after the first call to forward is done and the module will become a regular torch.nn.Linear module. The in_features argument of the Linear is inferred from the input.shape[-1].
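Applied to the architecture from the question, this could look like the minimal sketch below. Two things here are assumptions not stated in the question: the 32x32 input size, and the nn.Flatten() placed before the lazy layer. The flatten step is needed because in_features is taken from input.shape[-1], so the conv output has to be reshaped to (batch, features) first. The class name Net is only illustrative.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv_layer = nn.Sequential(
            nn.Conv2d(3, 32, 5),
            nn.Conv2d(32, 64, 5),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 5),
            nn.Conv2d(128, 128, 5),
            nn.MaxPool2d(2, 2),
        )
        self.fc_layer = nn.Sequential(
            nn.Flatten(),        # (N, C, H, W) -> (N, C*H*W); added for this sketch
            nn.LazyLinear(512),  # in_features is inferred on the first forward call
            nn.Linear(512, 128),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.fc_layer(self.conv_layer(x))

model = Net()
dummy = torch.zeros(1, 3, 32, 32)  # ASSUMED input size; one dummy batch
out = model(dummy)                 # this first call materializes the lazy layer

in_features = model.fc_layer[1].in_features
print(in_features, tuple(out.shape))  # prints: 512 (1, 10)
```

Running one dummy batch through the model before creating the optimizer or loading weights is a reasonable habit, since the lazy layer's parameters do not exist until that first forward call.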