TensorFlow Object Detection API with 1-channel images

Question

tensorflow object-detection depth

Is there a way to use a model pre-trained on RGB images in the TensorFlow Object Detection API to detect objects in single-channel grayscale (depth) images?

- Daulet Baimukashev

1 Answer

- Surya Tej · Accepted Answer

I tried running object detection on grayscale (1-channel) images in TensorFlow with a pre-trained model (faster_rcnn_resnet101_coco_11_06_2017). Here is what I did:

The model was trained on RGB images, so I only had to modify some code in object_detection_tutorial.ipynb from the TensorFlow repo.

First change: the existing code in the ipynb is written for 3-channel images, so change the load_image_into_numpy_array function as follows.

From:

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)

To:

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  channel_dict = {'L':1, 'RGB':3} # 'L' for Grayscale, 'RGB' : for 3 channel images
  return np.array(image.getdata()).reshape(
      (im_height, im_width, channel_dict[image.mode])).astype(np.uint8)
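A quick way to check the revised function is to run it on small synthetic images. This is only a sketch: it assumes Pillow and NumPy are installed, and the 4x3 test images and their fill values are made up for illustration.

```python
import numpy as np
from PIL import Image


def load_image_into_numpy_array(image):
    # PIL reports size as (width, height); NumPy arrays are (height, width, channels)
    (im_width, im_height) = image.size
    channel_dict = {'L': 1, 'RGB': 3}  # 'L' = grayscale, 'RGB' = 3-channel
    return np.array(image.getdata()).reshape(
        (im_height, im_width, channel_dict[image.mode])).astype(np.uint8)


# Synthetic images for illustration only: width=4, height=3
gray = Image.new('L', (4, 3), color=128)
rgb = Image.new('RGB', (4, 3), color=(10, 20, 30))

gray_np = load_image_into_numpy_array(gray)
rgb_np = load_image_into_numpy_array(rgb)

print(gray_np.shape)  # height, width, 1 channel
print(rgb_np.shape)   # height, width, 3 channels
```

Note that any mode missing from channel_dict (for example RGBA, or a 16-bit depth image that Pillow does not open as 'L') raises a KeyError in this version, so such images would need to be converted first.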

Second change: a grayscale image has only one channel of data. To run object detection we need 3 channels (the inference code is written for 3 channels).

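This can be done in two ways: a) copy the single-channel data into the other two channels, or b) fill the other two channels with zeros. Both work; I used the first.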
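To make the difference between the two options concrete, here is a minimal NumPy-only sketch on a tiny made-up array. The 2x3 input and the variable names are assumptions for illustration, and np.repeat is used here as one possible way to replicate the channel.

```python
import numpy as np

# Tiny stand-in for a grayscale image: height=2, width=3, 1 channel
gray = np.arange(6, dtype=np.uint8).reshape(2, 3, 1)

# (a) Replicate the single channel into all three channels
replicated = np.repeat(gray, 3, axis=-1)

# (b) Keep the data in the first channel and fill the other two with zeros
zeros = np.zeros(gray.shape[:-1] + (2,), dtype=gray.dtype)
zero_padded = np.concatenate((gray, zeros), axis=-1)

print(replicated.shape, zero_padded.shape)
print(replicated[0, 1])   # the same value in every channel
print(zero_padded[0, 1])  # the value only in the first channel
```

With option (b) only the first (red) channel carries data, so the image renders with a red tint when displayed as RGB, whereas option (a) still looks gray.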

In the ipynb, go to the part where the image is read and converted to a numpy array (the for loop at the end of the ipynb).

Change the code from this:

for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)

To this:

for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  if image_np.shape[2] != 3:
    # Duplicate the single channel into all three channels
    image_np = np.broadcast_to(image_np, (image_np.shape[0], image_np.shape[1], 3)).copy()
    # Alternative: add zeros to the other channels.
    # This adds red color stuff in the background -- not recommended
    # z = np.zeros(image_np.shape[:-1] + (2,), dtype=image_np.dtype)
    # image_np = np.concatenate((image_np, z), axis=-1)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)

That's it. Run the notebook and you should see the detection results.
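As a possible shortcut (not what I used above), Pillow can do the channel replication itself: converting an 'L' image with convert('RGB') copies the gray value into all three channels. The sketch below assumes Pillow and NumPy are installed; load_image_as_rgb_array is a hypothetical helper name and the test image is synthetic.

```python
import numpy as np
from PIL import Image


def load_image_as_rgb_array(image):
    # Hypothetical helper: let Pillow expand 'L' to 'RGB' (R = G = B = gray value)
    return np.asarray(image.convert('RGB'), dtype=np.uint8)


# Synthetic 8-bit grayscale image for illustration: width=4, height=3
gray = Image.new('L', (4, 3), color=200)

image_np = load_image_as_rgb_array(gray)
# Add the batch dimension the model expects: [1, height, width, 3]
image_np_expanded = np.expand_dims(image_np, axis=0)

print(image_np.shape, image_np_expanded.shape)
```

This covers 8-bit grayscale input only. Depth maps stored with a higher bit depth would likely need to be rescaled to the 0-255 range first, which is not shown here.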