Keras data augmentation parameters

Question

5

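I have read some material on data augmentation in Keras, but it is still a bit unclear to me. Is there a parameter in the data augmentation step that controls how many images are created from each input image? In this example, I cannot see any parameter that controls the number of images created from each image.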

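For example, in the code below I can use a variable (num_imgs) to control how many images are created from each input image and stored in a folder named preview; but in real-time data augmentation there is no parameter for this purpose.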

from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
num_imgs = 20
datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

img = load_img('data/train/cats/cat.0.jpg')  # this is a PIL image
x = img_to_array(img)  # NumPy array of shape (height, width, 3) with the default channels-last format; (3, height, width) if channels-first
x = x.reshape((1,) + x.shape)  # add a leading batch axis, e.g. shape (1, height, width, 3)

# the .flow() command below generates batches of randomly transformed images
# and saves the results to the `preview/` directory
i = 0
for batch in datagen.flow(x, batch_size=1,
                          save_to_dir='preview', save_prefix='cat', save_format='jpeg'):
    i += 1
    if i >= num_imgs:  # stop after exactly num_imgs images have been saved
        break  # otherwise the generator would loop indefinitely

- SaraG

2 Answers

3

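Basically, it works like this: it generates only one image per input image, and once every input image has been used once it starts over. In your example there is only a single input image, so it keeps generating different versions of that image until there are 20 of them.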
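The cycling behaviour described above can be illustrated without Keras. This is only a simplified sketch, not the library's implementation: `cycling_flow` and `random_variant` are hypothetical names, and the "augmentation" is reduced to pairing each input with a random rotation angle. It also ignores the shuffling that the real generator may apply within each pass.

```python
import itertools
import random


def random_variant(item, rng):
    """Stand-in for a random transform: pair the input with a random angle."""
    return (item, rng.uniform(-40.0, 40.0))


def cycling_flow(items, rng):
    """Yield one random variant per input, restarting after each full pass."""
    while True:  # like the real generator, this never stops on its own
        for item in items:
            yield random_variant(item, rng)


rng = random.Random(0)

# One input image: every draw is a new variant of that same image.
single = list(itertools.islice(cycling_flow(["cat.0"], rng), 20))

# Three input images: 6 draws make two full passes, two variants per image.
several = list(itertools.islice(cycling_flow(["a", "b", "c"], rng), 6))
print([name for name, _ in several])
```

So the number of variants per image is decided by how long you keep pulling from the generator, not by a constructor argument.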

You can look at the source code here: https://github.com/fchollet/keras/blob/master/keras/preprocessing/image.py

- dontloo

- Sergii Gryshkevych · Accepted Answer

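Data augmentation works as follows: in each training epoch, transformations with parameters chosen at random within the specified ranges are applied to all original images in the training set. After an epoch completes (that is, after the learning algorithm has been exposed to the whole training set), the next epoch begins and the training data is augmented again by applying the specified transformations to the original training data.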
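To make "parameters chosen at random within the specified ranges" concrete, here is a rough sketch in plain Python. It assumes each parameter is drawn uniformly from a symmetric range, which is my simplified reading of how the generator behaves; `sample_transform` is a hypothetical helper, not a Keras function, and it only draws the parameters without touching any pixels.

```python
import random


def sample_transform(rng, rotation_range=40.0, shift_range=0.2,
                     zoom_range=0.2, horizontal_flip=True):
    """Draw one random set of transform parameters within the given ranges."""
    return {
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        "width_shift": rng.uniform(-shift_range, shift_range),
        "height_shift": rng.uniform(-shift_range, shift_range),
        "zoom": rng.uniform(1.0 - zoom_range, 1.0 + zoom_range),
        "flip": horizontal_flip and rng.random() < 0.5,
    }


rng = random.Random(42)
# The same image gets a fresh parameter draw in every epoch.
draws = [sample_transform(rng) for _epoch in range(3)]
for epoch, params in enumerate(draws, start=1):
    print(epoch, params)
```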

This way, each image is augmented as many times as there are epochs. Recall the example from the link you provided:

# Fit the model on the batches generated by datagen.flow().
model.fit_generator(datagen.flow(X_train, Y_train,
                    batch_size=batch_size),
                    samples_per_epoch=X_train.shape[0],
                    nb_epoch=nb_epoch,
                    validation_data=(X_test, Y_test))

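Here the datagen object feeds the training set to the model nb_epoch times, so each image is augmented nb_epoch times. As a result, the learning algorithm almost never sees two identical training examples, because every example is randomly transformed again in each epoch.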
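The counting argument can be checked with a small simulation that does not need Keras. This is a sketch under assumptions: `index_batches` and `variants_per_image` are hypothetical helpers, and `passes_per_epoch` is a made-up knob standing in for asking the generator for more samples per epoch (in the old API shown above, that would roughly mean setting samples_per_epoch to a multiple of X_train.shape[0]; please verify this against your Keras version).

```python
import math
from collections import Counter


def index_batches(num_images, batch_size):
    """Yield batches of image indices forever, one full pass after another."""
    while True:
        for start in range(0, num_images, batch_size):
            yield range(start, min(start + batch_size, num_images))


def variants_per_image(num_images, batch_size, epochs, passes_per_epoch=1):
    """Count how many augmented variants of each image the model is fed."""
    steps_per_epoch = passes_per_epoch * math.ceil(num_images / batch_size)
    batches = index_batches(num_images, batch_size)
    seen = Counter()
    for _ in range(epochs):
        for _ in range(steps_per_epoch):
            seen.update(next(batches))  # each index here is one new variant
    return seen


default = variants_per_image(num_images=10, batch_size=4, epochs=5)
tripled = variants_per_image(num_images=10, batch_size=4, epochs=5,
                             passes_per_epoch=3)
print(set(default.values()), set(tripled.values()))  # {5} {15}
```

With one pass per epoch every image is augmented once per epoch; the only ways to get more variants per image are to train for more epochs or to draw more batches per epoch.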