I followed this object detection tutorial:
https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html,
together with the GitHub repository containing its train_one_epoch and evaluate functions:
https://github.com/pytorch/vision/blob/main/references/detection/engine.py.
However, I would also like to compute the loss during validation. I implemented it as follows; in this case the model has to be put into model.train() to get the losses:

@torch.no_grad()
def evaluate_loss(model, data_loader, device):
    val_loss = 0
    model.train()  # the detection models only return the loss dict in train mode
    for images, targets in data_loader:
        images = list(image.to(device) for image in images)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)
        # reduce losses over all GPUs for logging purposes
        loss_dict_reduced = utils.reduce_dict(loss_dict)
        losses_reduced = sum(loss for loss in loss_dict_reduced.values())
        val_loss += losses_reduced
    validation_loss = val_loss / len(data_loader)
    return validation_loss
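One caveat with calling model.train() inside a @torch.no_grad() function: no_grad only disables gradient tracking, it does not stop BatchNorm layers in train mode from updating their running statistics during the forward pass (torchvision's detection backbones typically use FrozenBatchNorm2d, which mitigates this, but any trainable BatchNorm would drift). A minimal demonstration with a plain BatchNorm2d:

```python
import torch
import torch.nn as nn

# torch.no_grad() disables gradient tracking, but a BatchNorm layer in
# train mode still updates its running statistics on every forward pass.
bn = nn.BatchNorm2d(3)
before = bn.running_mean.clone()

bn.train()
with torch.no_grad():
    bn(torch.randn(4, 3, 8, 8) + 5.0)  # input with per-channel mean ~5

# the running mean has moved toward the batch mean despite no_grad
stats_updated = not torch.allclose(before, bn.running_mean)
```

Also note that Dropout layers are active in train mode, which can add noise to the validation loss.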
I would then call it after the learning-rate scheduler step in my loop:
for epoch in range(args.num_epochs):
    # train for one epoch, printing every 10 iterations
    train_one_epoch(model, optimizer, train_data_loader, device, epoch, print_freq=10)
    # update the learning rate
    lr_scheduler.step()
    # compute the validation loss
    validation_loss = evaluate_loss(model, valid_data_loader, device=device)
    # evaluate on the validation dataset
    evaluate(model, valid_data_loader, device=device)
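Since evaluate_loss is decorated with @torch.no_grad(), no autograd graph is built during the validation pass, so it should not leak gradients into the next training epoch. A small self-contained check (nn.Linear stands in for the detection model, which is an assumption for illustration only):

```python
import torch
import torch.nn as nn

# under torch.no_grad() the forward pass builds no autograd graph,
# so a validation pass cannot contribute gradients to training
model = nn.Linear(4, 2)
model.train()

with torch.no_grad():
    loss = model(torch.randn(3, 4)).sum()

no_graph = not loss.requires_grad                         # nothing to backpropagate
grads_untouched = all(p.grad is None for p in model.parameters())
```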
Does this look correct? Could it interfere with training or produce an inaccurate validation loss?
If it is fine, is there a simple way to apply early stopping based on this validation loss? I was thinking of adding something like the following after the evaluate_loss call shown above:
torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'validation_loss': validation_loss,
}, PATH)
I also want to save the model every epoch for checkpointing; for that I need the validation loss so I can keep the "best" model.
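For early stopping on the validation loss, a small helper is usually enough. The following is a sketch, not part of the tutorial; should_stop, val_losses, and patience are hypothetical names. It signals a stop once the best loss has not improved for `patience` consecutive epochs:

```python
def should_stop(val_losses, patience=3):
    """Return True once the most recent `patience` validation losses
    have all failed to improve on the best loss seen before them."""
    if len(val_losses) <= patience:
        return False
    best_earlier = min(val_losses[:-patience])
    return min(val_losses[-patience:]) >= best_earlier

# example loss histories
improving = [1.0, 0.9, 0.8, 0.7, 0.6]   # still improving -> keep training
plateaued = [1.0, 0.9, 0.95, 0.96, 0.97]  # 3 epochs without improvement -> stop
```

In the training loop, you would append each validation_loss to a list, write the per-epoch checkpoint with torch.save as above, additionally save a copy whenever the loss beats the best seen so far, and break out of the loop when should_stop(...) returns True.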
…model.train(), otherwise these layers will be updated / turned on (respectively) during evaluation. - jhso