TensorFlow Object Detection API training problem

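I am training on Paperspace and have run into a problem I have not seen before. I used the same machine previously without any issues, but now training does not seem to start at all. I have reduced the batch size to 10 (the default is 24).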

Has anyone else run into this?

This is the output I get when I run train.py in models/research/object_detection. It has been running for about an hour.

WARNING:tensorflow:From /home/paperspace/Documents/models/research/object_detection/trainer.py:210: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-11-27 12:08:46.994554: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2
2017-11-27 12:08:47.109823: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-27 12:08:47.110204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: Quadro P4000 major: 6 minor: 1 memoryClockRate(GHz): 1.48
pciBusID: 0000:00:05.0
totalMemory: 7.92GiB freeMemory: 7.60GiB
2017-11-27 12:08:47.110230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro P4000, pci bus id: 0000:00:05.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from ssd_mobilenet_v1_coco_11_06_2017/model.ckpt
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
1 Answer

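I think you have not generated the TFRecord files. Check whether the generatetf.record files for training and testing exist in the research folder. If they do not exist, generate them first, then delete everything in the training folder except your model (faster_rcnn) and the label.pbtxt file, and start training again!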
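To make that check concrete, here is a rough sketch that verifies the record files exist and actually contain records. It uses only the Python standard library and walks the TFRecord framing (a length prefix, a checksum, the payload, another checksum) without verifying the checksums, so it only catches missing, empty, or truncated files. The names `count_tfrecords`, `check_records`, `train.record` and `test.record` are my own placeholders, not something from the question; substitute whatever paths your pipeline config points at.

```python
import os
import struct


def count_tfrecords(path):
    """Count records in a TFRecord file by walking its length-prefixed framing.

    Each record is laid out as: uint64 length, uint32 CRC of the length,
    the payload bytes, uint32 CRC of the payload (all little-endian).
    The CRCs are skipped, not verified.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte CRC of the length
            if not header:
                return count  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header in %s" % path)
            (length,) = struct.unpack("<Q", header[:8])
            body = f.read(length + 4)  # payload + 4-byte CRC of the payload
            if len(body) < length + 4:
                raise ValueError("truncated record body in %s" % path)
            count += 1


def check_records(paths):
    """Map each path to its record count, or None if the file is missing."""
    return {p: count_tfrecords(p) if os.path.isfile(p) else None for p in paths}


if __name__ == "__main__":
    # Hypothetical file names: replace with the input paths from your config.
    for path, n in check_records(["train.record", "test.record"]).items():
        print(path, "MISSING" if n is None else "%d records" % n)
```

If a file is reported as missing or as having 0 records, the input queue has nothing to feed the trainer, which would be consistent with global_step/sec staying at 0.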

