I am currently using this repository for NLP and learning how to train a CNN on my own dataset, but I keep running into a shape mismatch error:
ValueError: Target size (torch.Size([64])) must be the same as input size (torch.Size([15]))
10 }
11 for epoch in tqdm(range(params['epochs'])):
---> 12 train_loss, train_acc = train(model, train_iterator, optimizer, criterion)
13 valid_loss, valid_acc = evaluate(model, valid_iterator, criterion)
14 epoch_mins, epoch_secs = epoch_time(start_time, end_time)
57 print("PredictionShapeAfter:")
58 print(predictions.shape)
---> 59 loss = criterion(predictions, batch.l)
60
61 acc = binary_accuracy(predictions, batch.l)
After some investigation, I found that my CNN's predictions are not the same size as the ground-truth labels they are compared against.
Input Shape:
torch.Size([15, 64])
Truth Shape:
torch.Size([64])
embedded unsqueezed: torch.Size([15, 1, 64, 100])
cat shape: torch.Size([15, 300])
Prediction Shape Before Squeeze:
torch.Size([15, 1])
PredictionShapeAfter:
torch.Size([15])
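The size of the model's predictions (the last value in this list) is being taken from the first dimension of the input. Is this a common problem? Is there a way to correct it?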
My model:
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, n_filters, filter_sizes, output_dim,
                 dropout, pad_idx):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx = pad_idx)
        # One convolution per filter size, each spanning the full embedding width
        self.convs = nn.ModuleList([
            nn.Conv2d(in_channels = 1,
                      out_channels = n_filters,
                      kernel_size = (fs, embedding_dim))
            for fs in filter_sizes
        ])
        self.fc = nn.Linear(len(filter_sizes) * n_filters, output_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, text):
        # Dimension 0 of text is treated as the batch dimension from here on
        embedded = self.embedding(text)
        # Add a channel dimension for Conv2d
        embedded = embedded.unsqueeze(1)
        print(f"embedded unsqueezed: {embedded.shape}")
        conved = [F.relu(conv(embedded)).squeeze(3) for conv in self.convs]
        # Max-pool over the remaining length dimension
        pooled = [F.max_pool1d(conv, conv.shape[2]).squeeze(2) for conv in conved]
        cat = self.dropout(torch.cat(pooled, dim = 1))
        print(f"cat shape: {cat.shape}")
        return self.fc(cat)
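The shape trace above can be reproduced with a small sketch. It assumes the iterator yields text as [sequence length, batch size] (the default layout of torchtext's legacy `Field` when `batch_first` is not set), which would explain the `[15, 64]` input next to 64 labels. The hyperparameters below are hypothetical and chosen only to match the printed shapes; they are not taken from the linked code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    # Same layers as the model in the question, with the debug prints removed.
    def __init__(self, vocab_size, embedding_dim, n_filters, filter_sizes,
                 output_dim, dropout, pad_idx):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        self.convs = nn.ModuleList([
            nn.Conv2d(1, n_filters, kernel_size=(fs, embedding_dim))
            for fs in filter_sizes
        ])
        self.fc = nn.Linear(len(filter_sizes) * n_filters, output_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, text):
        embedded = self.embedding(text).unsqueeze(1)
        conved = [F.relu(conv(embedded)).squeeze(3) for conv in self.convs]
        pooled = [F.max_pool1d(c, c.shape[2]).squeeze(2) for c in conved]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))


# Hypothetical hyperparameters (3 filter sizes x 100 filters gives the 300 seen in "cat shape").
model = CNN(vocab_size=50, embedding_dim=100, n_filters=100,
            filter_sizes=[3, 4, 5], output_dim=1, dropout=0.5, pad_idx=1)

# Assumed layout: 15 tokens per example, 64 examples per batch.
text = torch.randint(0, 50, (15, 64))

out_as_is = model(text)                      # dim 0 (15) is treated as the batch
out_transposed = model(text.permute(1, 0))   # [64, 15]: batch first

print(out_as_is.shape)       # torch.Size([15, 1])
print(out_transposed.shape)  # torch.Size([64, 1])
```

If that assumption holds, calling `text.permute(1, 0)` at the top of `forward` would give 64 predictions for 64 labels. When the data is built with torchtext's legacy `Field`, passing `batch_first=True` there is the other common option.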
My training function:
def train(model, iterator, optimizer, criterion):
    epoch_loss = 0
    epoch_acc = 0
    model.train()
    for batch in iterator:
        optimizer.zero_grad()
        print("InputShape:")
        print(batch.t.shape)
        print("Truth Shape:")
        print(batch.l.shape)
        predictions = model(batch.t)
        print("Prediction Shape Before Squeeze:")
        print(predictions.shape)
        predictions = predictions.squeeze(1)
        print("PredictionShapeAfter:")
        print(predictions.shape)
        loss = criterion(predictions, batch.l)
        acc = binary_accuracy(predictions, batch.l)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        epoch_acc += acc.item()
    return epoch_loss / len(iterator), epoch_acc / len(iterator)
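The error itself can be reproduced in isolation. This is a sketch that assumes `criterion` is `nn.BCEWithLogitsLoss`; the post does not say which loss is used, but the wording of the error message matches that loss. The tensors are random stand-ins for `predictions` and `batch.l`.

```python
import torch
import torch.nn as nn

# Assumption: the criterion is BCEWithLogitsLoss (not stated in the post).
criterion = nn.BCEWithLogitsLoss()

labels = torch.randint(0, 2, (64,)).float()   # stand-in for batch.l
bad_predictions = torch.randn(15)             # one value per token position
good_predictions = torch.randn(64)            # one value per example

try:
    criterion(bad_predictions, labels)
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)

# With matching shapes the loss is a single scalar.
loss = criterion(good_predictions, labels)
print(loss.dim())  # 0
```

The loss compares predictions and labels element by element, so both must have one entry per example in the batch.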
My full code can be found at this link.
If your input is torch.Size([15, 64]), then given that each sample has one label, shouldn't your labels be torch.Size([15])? - Sean