Tensorflow Object Detection API: training from an exported model checkpoint

Question

tensorflow machine-learning computer-vision tensorflow2.0 object-detection

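I previously exported a RetinaNet model (originally from the object detection model zoo) that had been fine-tuned on a custom dataset with the Tensorflow Object Detection API (Tensorflow version 2.4.1). Here is what the exported model folder looks like.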

Running evaluation on the exported model with the command below gives an mAP@0.5IOU of 0.5.

  python model_main_tf2.py --model_dir=exported-models/retinanet --pipeline_config_path=exported-models/retinanet/pipeline.config --checkpoint_dir=exported-models/retinanet/checkpoint

Problem:

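Due to unfortunate circumstances, I no longer have the training folder from when the model was trained. I recently obtained more data and want to use the exported model as the starting point for further training, so for the new run I set fine_tune_checkpoint: "exported-models/retinanet/checkpoint/ckpt-0" in pipeline.config: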

  fine_tune_checkpoint: "exported-models/retinanet/checkpoint/ckpt-0"
  num_steps: 25000
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection"
  use_bfloat16: false
  fine_tune_checkpoint_version: V2

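However, when I start training with the model_main_tf2.py script, the first checkpoint (at step 0) scores very poorly, even on the same dataset that the exported model was evaluated on.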

I expected the first checkpoint to score the same as the exported model on the same test set. If that assumption is wrong, why?

- sune

This Q&A is great! I spent several days stuck on the same problem. Thanks a lot! - zelda26

1 Answer

- sune · Accepted Answer

I finally found the following in the API's train config proto definition:

// Whether to load all checkpoint vars that match model variable names and
// sizes. This option is only available if `from_detection_checkpoint` is
// True.  This option is *not* supported for TF2 --- setting it to true
// will raise an error. **Instead, set fine_tune_checkpoint_type: 'full'.**
  optional bool load_all_detection_checkpoint_vars = 19 [default = false];

After setting fine_tune_checkpoint_type to "full", I got the correct mAP for the first checkpoint (step 0).
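
For anyone who wants to script the change, here is a minimal sketch that rewrites the field in the text of a pipeline.config. It uses plain string processing rather than the Object Detection API's own config utilities, and the helper name set_fine_tune_checkpoint_type is hypothetical, made up for this example. As far as I understand it, for SSD-style models such as RetinaNet the "detection" type restores only part of the model (the feature extractor), so the prediction heads start from fresh weights, which would explain the poor score at step 0; treat that as my reading rather than a documented guarantee.

```python
import re


def set_fine_tune_checkpoint_type(config_text: str, new_type: str = "full") -> str:
    """Return a copy of config_text with fine_tune_checkpoint_type set to new_type.

    Hypothetical helper for illustration only. It assumes the value is written
    with double quotes, as in the config shown in the question.
    """
    pattern = re.compile(
        r'^([ \t]*fine_tune_checkpoint_type:[ \t]*)"[^"]*"', re.MULTILINE
    )
    updated, count = pattern.subn(
        lambda m: f'{m.group(1)}"{new_type}"', config_text
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one fine_tune_checkpoint_type field, found {count}"
        )
    return updated


# Fragment mirroring the train_config fields from the question.
example = '''train_config {
  fine_tune_checkpoint: "exported-models/retinanet/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"
  fine_tune_checkpoint_version: V2
}
'''

patched = set_fine_tune_checkpoint_type(example)
print(patched)
```

To apply it to a real file, read pipeline.config into a string, pass it through the function, and write the result back; the fine_tune_checkpoint path itself is left untouched.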