Reversing VGG's image preprocessing in Keras to recover the original image

Question

I am using the VGG19 model from keras.applications. I expected the image to be scaled to [-1, 1], but the preprocess_input function seems to do something else.

To preprocess the input, I use the following 2 lines to load and scale the image:

import numpy as np
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input

img = image.load_img("./img.jpg", target_size=(256, 256))
img = preprocess_input(np.array(img))

print(img)
>>> array([[[151.061  , 138.22101, 131.32   ],
    ... ]]]

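The output appears to be in the [0, 255] range, but an original value of 255 has been mapped to roughly 151 (probably mean-centering). What input does VGG actually require? Judging from the source code (for mode='tf'), I thought it should be in [-1, 1]. Is it flexible enough that I can use any scaling I want? (I am using VGG to extract mid-level features from the Conv4 block.)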

Looking at the source code of preprocess_input, I see:

...
    if mode == 'tf':
        x /= 127.5
        x -= 1.
        return x
...

I took this to mean that for 'tf' (Keras with the TensorFlow backend), the input should be scaled to the [-1, 1] range.
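For reference, here is a minimal NumPy sketch of what that mode == 'tf' branch computes, together with its inverse. The helper names scale_tf_mode and unscale_tf_mode are hypothetical and not part of Keras; the sketch only illustrates the arithmetic quoted above and says nothing about which branch VGG19 actually takes.

```python
import numpy as np

def scale_tf_mode(x):
    """Map pixel values from [0, 255] to [-1, 1], as in the quoted 'tf' branch."""
    x = np.asarray(x, dtype=np.float32)
    return x / 127.5 - 1.0

def unscale_tf_mode(x):
    """Inverse of scale_tf_mode: map [-1, 1] back to [0, 255]."""
    x = np.asarray(x, dtype=np.float32)
    return (x + 1.0) * 127.5

pixels = np.array([0, 127.5, 255], dtype=np.float32)
scaled = scale_tf_mode(pixels)
print(scaled)                   # values at the ends and middle of [-1, 1]
print(unscale_tf_mode(scaled))  # back to the original pixel values
```

If preprocess_input had taken this branch, a white pixel (255) would come out as 1.0 rather than the roughly 151 observed above.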

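I need to write a function restore_original_image_from_array() that takes img and reconstructs the original image I fed in. The problem is that I am not sure how the scaling works for VGG19.

In short, what I want is: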

img = image.load_img("./img.jpg", target_size=(256, 256))
scaled_img = preprocess_input(np.array(img))
np.allclose(restore_original_image_from_array(scaled_img), np.array(img))
>>> True

- GRS

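If you are using preprocess_input(), the scaling range is clearly [-1, 1], isn't it? If the output you see is in [0, 255], you must be using a different function from the one you posted. - Gabriel Ibagon

@GabrielIbagon That is exactly the problem. It is what I expected too, but it actually scales differently: array([[[255, 255, 255], ... is mapped to array([[[151.061 , 138.22101, 131.32 ], ... . - GRS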

2 Answers

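The VGG networks were trained on images in BGR channel order, with each channel zero-centered by subtracting mean = [103.939, 116.779, 123.68]. Also, since an optimized image can take any value between -∞ and ∞, we must clip it to keep the values in the 0-255 range.
The following code "deprocesses" an image, i.e. reverses the preprocessing: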

import numpy as np

def deprocess_img(processed_img):
  x = processed_img.copy()
  # Drop the leading batch dimension if present
  if len(x.shape) == 4:
    x = np.squeeze(x, 0)
  if len(x.shape) != 3:
    raise ValueError("Input to deprocess image must be an image of "
                     "dimension [1, height, width, channel] or [height, width, channel]")

  # Perform the inverse of the preprocessing step: add back the BGR channel means
  x[:, :, 0] += 103.939
  x[:, :, 1] += 116.779
  x[:, :, 2] += 123.68
  # BGR -> RGB
  x = x[:, :, ::-1]

  # Clip to the valid pixel range before casting to 8-bit
  x = np.clip(x, 0, 255).astype('uint8')
  return x

- Sanchit Vijay
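To illustrate the clipping step, here is a small sketch with made-up values (not taken from any real image). It shows np.clip pulling out-of-range values back into 0-255 before the cast to an 8-bit type.

```python
import numpy as np

# Hypothetical float values, two of them outside the valid pixel range
values = np.array([-20.5, 12.0, 300.0], dtype=np.float32)

# Clip first, then cast, so out-of-range values saturate at 0 and 255
clipped = np.clip(values, 0, 255).astype(np.uint8)
print(clipped)  # [  0  12 255]
```

For an image that merely went through preprocessing and back, the values should already lie in range, so the clip mostly matters when the array has been modified in between, for example by an optimization loop.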

- Gabriel Ibagon · Accepted Answer

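The "mode" of the preprocess_input function depends on the framework the pretrained network weights came from. The VGG19 network in Keras uses the weights of the original VGG19 model from caffe, so the argument to preprocess_input should be the default (mode='caffe'). See this question: Keras VGG16 preprocess_input modes. For your purposes, use the preprocess_input function found in keras.applications.vgg19 and reverse-engineer from there.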

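The original preprocessing is found here: https://github.com/keras-team/keras-applications/blob/master/keras_applications/imagenet_utils.py#L21 It involves the following steps: 1) convert the image from RGB to BGR; 2) subtract the dataset mean from the image, channel by channel. No rescaling is applied, which is why the values stay on a 0-255 scale.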
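The two steps above can be sketched in plain NumPy as follows. This is an illustration for a channels_last RGB array, not the Keras implementation itself; the names BGR_MEAN and caffe_style_preprocess are hypothetical.

```python
import numpy as np

# Per-channel means in BGR order, as quoted in this thread
BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb):
    """Sketch of 'caffe'-style preprocessing for an (H, W, 3) RGB array."""
    x = np.asarray(rgb, dtype=np.float32)
    x = x[..., ::-1]      # step 1: RGB -> BGR
    return x - BGR_MEAN   # step 2: subtract the mean of each channel

# A single white pixel, like the one in the question
white = np.full((1, 1, 3), 255, dtype=np.uint8)
print(caffe_style_preprocess(white))
```

A white pixel comes out at roughly [151.061, 138.221, 131.32], which matches the values reported in the question.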

Here is code to restore the original image:

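def restore_original_image_from_array(x, data_format='channels_first'):
    # Note: arrays from image.load_img are (height, width, channels);
    # pass data_format='channels_last' for those. x is modified in place.
    mean = [103.939, 116.779, 123.68]

    # Undo the zero-centering by adding back the mean pixel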
    if data_format == 'channels_first':
        if x.ndim == 3:
            x[0, :, :] += mean[0]
            x[1, :, :] += mean[1]
            x[2, :, :] += mean[2]
        else:
            x[:, 0, :, :] += mean[0]
            x[:, 1, :, :] += mean[1]
            x[:, 2, :, :] += mean[2]
    else:
        x[..., 0] += mean[0]
        x[..., 1] += mean[1]
        x[..., 2] += mean[2]

    if data_format == 'channels_first':
        # 'BGR'->'RGB'
        if x.ndim == 3:
            x = x[::-1, ...]
        else:
            x = x[:, ::-1, ...]
    else:
        # 'BGR'->'RGB'
        x = x[..., ::-1]

    return x
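As a quick sanity check of the round trip, here is a self-contained NumPy sketch for channels_last arrays. It uses a stand-in for the forward step rather than the real preprocess_input, and the names preprocess_sketch and restore_sketch are hypothetical. Unlike the function above, it leaves its input untouched and rounds back to uint8 so the result can be compared exactly.

```python
import numpy as np

# Per-channel means in BGR order, as quoted in this thread
BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def preprocess_sketch(rgb):
    """Stand-in for 'caffe'-style preprocessing: RGB -> BGR, subtract means."""
    x = np.asarray(rgb, dtype=np.float32)[..., ::-1]
    return x - BGR_MEAN

def restore_sketch(processed):
    """Inverse of preprocess_sketch: add the means back, then BGR -> RGB."""
    x = np.asarray(processed, dtype=np.float32) + BGR_MEAN  # new array, input untouched
    x = x[..., ::-1]
    # Round away float32 error before casting back to 8-bit pixels
    return np.clip(np.rint(x), 0, 255).astype(np.uint8)

# Random test image instead of ./img.jpg, so the sketch needs no file
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
restored = restore_sketch(preprocess_sketch(original))
print(np.array_equal(restored, original))
```

Comparing with np.array_equal (or np.allclose on floats) gives a single boolean, whereas == on two arrays compares element by element. Without the rounding step, small float32 errors could make an exact comparison fail.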