Tensorflow目标检测API训练问题

Question

Tensorflow目标检测API训练问题

4

我正在使用Paperspace进行训练，但遇到了一些之前没有遇到过的问题。我之前使用同样的机器没有任何问题，但现在训练似乎根本没有开始。我已将批次大小减小到10（默认为24）。

有其他人遇到过这个问题吗？

当我在models/research/object_detection中运行train.py时，这是我得到的输出，已经运行了大约一个小时。

WARNING:tensorflow:From /home/paperspace/Documents/models/research/object_detection/trainer.py:210: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-11-27 12:08:46.994554: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2
2017-11-27 12:08:47.109823: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-27 12:08:47.110204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: Quadro P4000 major: 6 minor: 1 memoryClockRate(GHz): 1.48
pciBusID: 0000:00:05.0
totalMemory: 7.92GiB freeMemory: 7.60GiB
2017-11-27 12:08:47.110230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro P4000, pci bus id: 0000:00:05.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from ssd_mobilenet_v1_coco_11_06_2017/model.ckpt
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Saving checkpoint to path training/model.ckpt

- jToTheG

1个回答

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- VIKASH SINGH · Answer 1

我认为你没有生成tf record文件，请在research文件夹中检查是否存在用于训练和测试的generatetf.record文件。如果它不存在，请首先生成它，然后从训练文件夹中删除除了你的模型（faster_rcnn）和label.pbtxt文件以外的所有文件，然后开始训练！