How do I get the output dimensions of each layer in a PyTorch neural network?

36
import torch
import torch.nn as nn

class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    self.net = nn.Sequential(
      # kernel_size is required by nn.Conv2d; 3 is assumed here
      nn.Conv2d(in_channels = 3, out_channels = 16, kernel_size = 3),
      nn.ReLU(),
      nn.MaxPool2d(2),
      nn.Conv2d(in_channels = 16, out_channels = 16, kernel_size = 3),
      nn.ReLU(),
      nn.Flatten(),
      nn.Linear(4096, 64),
      nn.ReLU(),
      nn.Linear(64, 10))

  def forward(self, x):
    return self.net(x)

I created this model without solid knowledge of neural networks; I just fixed the parameters and trained until it worked. I am not sure how to get the output dimension of each layer (e.g. the output dimension after the first layer).

Is there any easy way to do this in PyTorch?


Does this answer your question? Model summary in PyTorch - iacob
9 Answers

29

You can use torchsummary, for example, for ImageNet dimensions (3x224x224):

from torchvision import models
from torchsummary import summary

vgg = models.vgg16()
summary(vgg, (3, 224, 224))


----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256, 56, 56]               0
           Conv2d-15          [-1, 256, 56, 56]         590,080
             ReLU-16          [-1, 256, 56, 56]               0
        MaxPool2d-17          [-1, 256, 28, 28]               0
           Conv2d-18          [-1, 512, 28, 28]       1,180,160
             ReLU-19          [-1, 512, 28, 28]               0
           Conv2d-20          [-1, 512, 28, 28]       2,359,808
             ReLU-21          [-1, 512, 28, 28]               0
           Conv2d-22          [-1, 512, 28, 28]       2,359,808
             ReLU-23          [-1, 512, 28, 28]               0
        MaxPool2d-24          [-1, 512, 14, 14]               0
           Conv2d-25          [-1, 512, 14, 14]       2,359,808
             ReLU-26          [-1, 512, 14, 14]               0
           Conv2d-27          [-1, 512, 14, 14]       2,359,808
             ReLU-28          [-1, 512, 14, 14]               0
           Conv2d-29          [-1, 512, 14, 14]       2,359,808
             ReLU-30          [-1, 512, 14, 14]               0
        MaxPool2d-31            [-1, 512, 7, 7]               0
           Linear-32                 [-1, 4096]     102,764,544
             ReLU-33                 [-1, 4096]               0
          Dropout-34                 [-1, 4096]               0
           Linear-35                 [-1, 4096]      16,781,312
             ReLU-36                 [-1, 4096]               0
          Dropout-37                 [-1, 4096]               0
           Linear-38                 [-1, 1000]       4,097,000
================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 218.59
Params size (MB): 527.79
Estimated Total Size (MB): 746.96
----------------------------------------------------------------

Source: Model summary in PyTorch


What is (3, 244, 244)? - Dawn17
@Dawn17 It's the dimensions of a single image (for the MNIST dataset it is 1x28x28). - Idan Azuri
I get this error: RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 3, 3], but got 5-dimensional input of size [2, 1, 3, 224, 224] instead. What size do I need to input? - Dawn17
@Dawn17 I would need to see your code to help you, but I guess that you are running MNIST (1x28x28) on your network, while VGG's input is 3x224x224. So, first, in the forward method try reshaping like 'out.view(out.shape[0], -1)', and second, change the model to your own instead of the VGG from my example. - Idan Azuri
For new visitors to this question, note that torchsummary is now published on PIP under the name torchinfo: https://pypi.org/project/torchinfo/. - QAH
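
For reference, a minimal sketch of the same summary call through the renamed torchinfo package; note that torchinfo, unlike torchsummary, includes the batch dimension in input_size:

from torchvision import models
from torchinfo import summary  # pip install torchinfo

vgg = models.vgg16()
summary(vgg, input_size=(1, 3, 224, 224))  # batch dimension included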

21

A simple way is:

  1. Pass the input to the model.
  2. Print the size of the output after passing through every layer.
class Model(nn.Module):
  def __init__(self):
    super(Model, self).__init__()
    self.net = nn.Sequential(
      # kernel_size = 3 is assumed, as in the question's model above
      nn.Conv2d(in_channels = 3, out_channels = 16, kernel_size = 3),
      nn.ReLU(),
      nn.MaxPool2d(2),
      nn.Conv2d(in_channels = 16, out_channels = 16, kernel_size = 3),
      nn.ReLU(),
      nn.Flatten(),
      nn.Linear(4096, 64),
      nn.ReLU(),
      nn.Linear(64, 10))

  def forward(self, x):
    for layer in self.net:
        x = layer(x)
        print(x.size())
    return x

model = Model()
x = torch.randn(1, 3, 224, 224)

# Let's print it
model(x)

But be careful with the input size, because you are using nn.Linear in your net: if your input size is not 4096, it will cause an incompatible input size for nn.Linear.
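
If the flattened size is awkward to compute by hand, one hedged workaround (assuming PyTorch 1.8 or later) is nn.LazyLinear, which infers in_features from the first batch it sees:

import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size = 3),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.LazyLinear(64),  # in_features is inferred on the first forward pass
    nn.ReLU(),
    nn.Linear(64, 10))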


What is (1, 3, 244, 244)? - Dawn17
Just create a dummy example input: x = torch.randn(1, 3, 244, 244) - David Ng
What if there are several networks in my model with several inputs, and some of the inputs have a shape like torch.Size([20, 16000])? - monart

6
# Report out_features for every immediate child layer that defines it
for layer in model.children():
    if hasattr(layer, 'out_features'):
        print(layer.out_features)
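
Note that Module.children() only yields immediate children, and only layers such as nn.Linear expose out_features. For the question's model the layers live inside an nn.Sequential, so a hedged variant would iterate that container instead:

model = Model()
for layer in model.net:            # model.children() yields only the Sequential
    if hasattr(layer, 'out_features'):
        print(layer.out_features)  # prints 64, then 10; conv/pool layers are skipped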

6

Similar to David Ng's answer but a bit shorter:

def get_output_shape(model, image_dim):
    return model(torch.rand(*(image_dim))).data.shape

In this example I needed to figure out the input of the last Linear layer:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.expected_input_shape = (1, 1, 192, 168)
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.maxpool1 = nn.MaxPool2d(2)
        self.maxpool2 = nn.MaxPool2d(3)

        # Calculate the input of the Linear layer
        conv1_out = get_output_shape(self.maxpool1, get_output_shape(self.conv1, self.expected_input_shape))
        conv2_out = get_output_shape(self.maxpool2, get_output_shape(self.conv2, conv1_out))
        fc1_in = np.prod(list(conv2_out)) # Flatten

        self.fc1 = nn.Linear(fc1_in, 38)

    def forward(self, x):
        x = self.conv1(x) 
        x = F.relu(x)
        x = self.maxpool1(x) 
        x = self.conv2(x)
        x = F.relu(x)
        x = self.maxpool2(x) 
        x = self.dropout1(x) 
        x = torch.flatten(x, 1) # flatten to a single dimension
        x = self.fc1(x) 
        output = F.log_softmax(x, dim=1) 
        return output

The upside of doing it this way is that if I make changes to earlier layers, I won't have to calculate it all over again!

My answer is based on this answer.
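
As a quick sanity check of the helper on a single layer (shapes taken from the example above):

conv = nn.Conv2d(1, 32, 3, 1)
print(get_output_shape(conv, (1, 1, 192, 168)))
# torch.Size([1, 32, 190, 166]): a 3x3 conv with stride 1 and no padding
# trims 2 pixels from each spatial dimension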


3
Another way to get the size after a certain layer in an nn.Sequential container is to add a custom Module that just prints out the size of the input.
class PrintSize(nn.Module):
  def __init__(self):
    super(PrintSize, self).__init__()
    
  def forward(self, x):
    print(x.shape)
    return x

Now you can do:

model = nn.Sequential(
    nn.Conv2d(3, 10, 5, 1),
    # ... lots of convolutions, pooling, etc.
    nn.Flatten(),
    PrintSize(),
    nn.Linear(1, 12), # the input dim of 1 is just a placeholder
)

Now, if you do model(x), it will print out the shape of the output after the Conv2d layer ran. This is useful if you have a bunch of convolutions and want to figure out what the final dimensions are for the first fully connected layer. You don't need to reformat your nn.Sequential as a Module; you can just drop this helper class in with a single line.
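
A hedged usage sketch (the input shape is assumed; the forward pass still fails at the placeholder Linear, but only after the shape has been printed):

import torch

x = torch.rand(2, 3, 28, 28)  # assumed input size
try:
    model(x)                  # PrintSize prints the flattened shape first
except RuntimeError:
    pass                      # Linear(1, 12) is only a placeholder, so the
                              # forward pass fails right after the print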

If we don't want to change the existing model, it is better to use layer.register_forward_hook. - x4444

1

Here is a solution in the form of a helper function:

def get_tensor_dimensions_impl(model, layer, image_size, for_input=False):
    t_dims = None
    def _local_hook(_, _input, _output):
        nonlocal t_dims
        t_dims = _input[0].size() if for_input else _output.size()
        return _output
    # keep the handle so the hook can be removed after the dummy pass
    handle = layer.register_forward_hook(_local_hook)
    dummy_var = torch.zeros(1, 3, image_size, image_size)
    model(dummy_var)
    handle.remove()
    return t_dims

Example:

from torchvision import models, transforms

a_model = models.squeezenet1_0(pretrained=True) 
get_tensor_dimensions_impl(a_model, a_model._modules['classifier'], 224)

The output is:

torch.Size([1, 1000, 1, 1])


1
Inspired by @minggli's answer, I used the model's children.

This method needs a tweak if not all of the steps appear among the model's children (e.g. in the example below, the torch.flatten call is made in the ResNet18 model's forward method but is not in the model's list of children).
device = "cuda" # if you want to put on gpu
model = torchvision_models.resnet18(weights="IMAGENET1K_V1")
model.to(device)
batch_size = 4
n_bands = 3
n_rows = 224
n_cols = 224
ex_input = torch.ones((batch_size, n_bands, n_rows, n_cols), device=device)

for i, c in enumerate(list(model.children())):
    # per the source code there is a flatten call which is not in
    # model.children(), so it has to be replicated here for this to work
    if i == 0:
        layer_output = c(ex_input)
    else:
        # the final child is the fully connected layer, which expects a
        # flattened input (an isinstance check is sturdier than comparing
        # against the layer's repr string)
        if isinstance(c, torch.nn.Linear):
            print('layer found')
            print("\t before flatten", layer_output.shape)
            layer_output = torch.flatten(layer_output, 1)
            print("\t after flatten", layer_output.shape)

        layer_output = c(layer_output)

    print(f"children index {i}: {c}")
    print("\t",layer_output.shape)
    print()
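
A hedged alternative sketch that sidesteps the flatten problem entirely: register a forward hook on every leaf module and call the model's real forward method, so shapes are captured wherever the tensors actually flow (module names come from named_modules()):

hooks = []
for name, module in model.named_modules():
    if len(list(module.children())) == 0:  # leaf modules only
        hooks.append(module.register_forward_hook(
            lambda m, inp, out, name=name: print(name, tuple(out.shape))))
model(ex_input)
for h in hooks:
    h.remove()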

0
Here is a function I wrote:
import torch

total_output_elements = 0
def calc_total_activation_size(model, call_the_network_function):
    global total_output_elements
    total_output_elements = 0

    def hook(module, input, output):
        global total_output_elements
        total_output_elements += output.numel()
        
    handle = torch.nn.modules.module.register_module_forward_hook(hook)
    result = call_the_network_function()
    handle.remove()
    return result, total_output_elements

In the hook you can print the output's shape.

Parameters: model - your model; call_the_network_function - a function that calls the model's forward pass and returns the result.
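
A hedged usage sketch with a torchvision model and a dummy batch (register_module_forward_hook is a global hook, assumed available in PyTorch 1.8 or later):

from torchvision import models

vgg = models.vgg16()
result, n_elems = calc_total_activation_size(
    vgg, lambda: vgg(torch.randn(1, 3, 224, 224)))
print(n_elems)  # total number of elements across every module's output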

-1
Maybe you can try print(model.state_dict()['next_layer.weight'].shape). This gives you a hint of the output shape from the last layer.
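
For instance, with the question's model the first Linear layer is stored under the key net.6 (its index inside the nn.Sequential), so a hedged check might look like:

model = Model()
print(model.state_dict()['net.6.weight'].shape)
# torch.Size([64, 4096]): Linear weights are (out_features, in_features),
# so whatever feeds this layer must output 4096 features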
