RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead

Question

python machine-learning pytorch computer-vision conv-neural-network

34

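I am trying to use a pre-trained model. Here is where the problem occurs:

Isn't the model supposed to take in a simple colored image? Why is it expecting a 4-dimensional input?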

RuntimeError                              Traceback (most recent call last)
<ipython-input-51-d7abe3ef1355> in <module>()
     33 
     34 # Forward pass the data through the model
---> 35 output = model(data)
     36 init_pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
     37 

5 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in forward(self, input)
    336                             _pair(0), self.dilation, self.groups)
    337         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 338                         self.padding, self.dilation, self.groups)
    339 
    340 

RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead

Where

inception = models.inception_v3()
model = inception.to(device)

- JobHunter69

3

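A Torch model typically expects a batch of images as input. If you want to pass a single image, make sure it is still a batch containing one image. Also, Inception-v3 expects images of size 3x299x299, whereas other Torch models expect 3x224x224. - asymptote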

3 Answers

35

According to the PyTorch documentation on convolutional layers, Conv2d layers expect input with the following shape:

(n_samples, channels, height, width) # e.g., (1000, 1, 224, 224)

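Passing grayscale images in their usual format (224, 224) won't work, because the channel dimension is missing.

To get the right shape, you need to add a channel dimension. You can do it as follows: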

x = np.expand_dims(x, 1)      # if numpy array
tensor = tensor.unsqueeze(1)  # if torch tensor

The unsqueeze() method adds a dimension of size 1 at the specified index. The result will have the shape:

(1000, 1, 224, 224)
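
As a rough illustration of which axis goes where, here is a minimal sketch. The array contents and the batch size of 8 (standing in for n_samples) are made up for the example; only the shapes matter.

```python
import numpy as np
import torch

# Hypothetical batch of 8 grayscale images with no channel axis.
x = np.zeros((8, 224, 224), dtype=np.float32)
x4 = np.expand_dims(x, 1)        # insert channel axis at index 1
print(x4.shape)                  # (8, 1, 224, 224)

tensor = torch.zeros(8, 224, 224)
tensor4 = tensor.unsqueeze(1)    # same operation on a torch tensor
print(tensor4.shape)             # torch.Size([8, 1, 224, 224])

# The question's case is different: a single RGB image already has a
# channel axis and lacks the batch axis, so the new axis goes at index 0.
image = torch.zeros(3, 224, 224)
batch = image.unsqueeze(0)
print(batch.shape)               # torch.Size([1, 3, 224, 224])
```

In short, unsqueeze(1) suits a stack of channel-less grayscale images, while unsqueeze(0) suits a single image that already carries its channels.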

- Nicolas Gervais

3

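You are right for grayscale images. However, for an RGB image that needs to be treated as a batch of one image, use .unsqueeze(0). - Wok

Can you explain n_samples here? - Hamza usman ghani

It is the number of training examples, i.e. the number of images. - Nicolas Gervais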

0

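Since the model expects a batch of images, we need to pass a 4-dimensional tensor, which can be done as follows:

Method 1: output = model(data[0:1])
Method 2: output = model(data[0].unsqueeze(0))

This sends only the first image of the whole batch.

Similarly, for the i-th image we can do:

Method 1: output = model(data[i:i+1])
Method 2: output = model(data[i].unsqueeze(0))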
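
To see why both methods work, a small sketch may help. It assumes data is a batch tensor of shape (N, 3, 224, 224); the batch of 4 random images and the index i = 2 are invented for the example.

```python
import torch

# Hypothetical batch of 4 RGB images.
data = torch.randn(4, 3, 224, 224)
i = 2

single = data[i]                      # integer indexing drops the batch axis
print(single.shape)                   # torch.Size([3, 224, 224])

by_slice = data[i:i + 1]              # slicing keeps the batch axis
by_unsqueeze = data[i].unsqueeze(0)   # indexing, then re-adding the axis
print(by_slice.shape)                 # torch.Size([1, 3, 224, 224])
print(by_unsqueeze.shape)             # torch.Size([1, 3, 224, 224])

# Both routes yield the same 4-dimensional tensor.
same = torch.equal(by_slice, by_unsqueeze)
print(same)                           # True
```

Integer indexing is what removes the batch dimension in the first place, which is why data[i] alone triggers the error.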

- user41855

- Shai · Accepted Answer

As Usman Ali wrote in his comment, pytorch (and most other deep learning toolboxes) expects a batch of images as input. Thus you need to call

output = model(data[None, ...])

to insert a singleton "batch" dimension into your input data.

Please also note that the model you are using might expect a different input size (3x299x299) rather than 3x224x224.
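
The two points above can be sketched together as follows. This is only an illustration under stated assumptions: data is a random stand-in for the real image, the resize uses F.interpolate (a transform applied when loading the image would work as well), the 299x299 target is the input size torchvision documents for Inception v3, and the lone Conv2d layer is a hypothetical stand-in for the model's first convolution, not the model itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the single image from the question.
data = torch.randn(3, 224, 224)

# Insert the singleton batch dimension.
batch = data[None, ...]
print(batch.shape)        # torch.Size([1, 3, 224, 224])

# Resize to the spatial size Inception v3 is documented to expect.
resized = F.interpolate(batch, size=(299, 299), mode="bilinear",
                        align_corners=False)
print(resized.shape)      # torch.Size([1, 3, 299, 299])

# Hypothetical stand-in for a first conv layer with a [32, 3, 3, 3] weight.
conv = nn.Conv2d(3, 32, kernel_size=3, stride=2, bias=False)
with torch.no_grad():
    out = conv(resized)
print(out.shape)          # torch.Size([1, 32, 149, 149])
```

With the real model, the equivalent call would be output = model(resized) after moving the tensor to the same device as the model.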