Issue shaping Tensorflow/TFLearn inputs/outputs for images

Question

python machine-learning computer-vision neural-network tensorflow

4

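To learn more about deep learning and computer vision, I'm working on a lane-detection project. I'm using TFLearn as a wrapper around Tensorflow.

Background: The training inputs are images of roads (each image is represented as a 50x50-pixel 2D array, where each element is a brightness value from 0.0 to 1.0).

The training outputs have the same shape (a 50x50 array), but represent the marked lane area. Essentially, non-road pixels are 0 and road pixels are 1.

This is not a fixed-size image classification problem, but a problem of detecting road vs. non-road pixels in a picture.

Problem: I have not been able to shape my inputs/outputs in a way that TFLearn/Tensorflow accepts, and I'm not sure why. Here is my sample code: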

# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).

# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
network = input_data(shape=[None, 50, 50, 1])

network = conv_2d(network, 50, 50, activation='relu')

# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')

network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)

model = tflearn.DNN(network, tensorboard_verbose=1)

model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)

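The error I get is on the model.fit call: ValueError: Cannot feed value of shape (1, 50, 50) for Tensor u'InputData/X:0', which has shape '(?, 50, 50, 1)'. I have tried reducing the sample input/output arrays to 1D vectors of length 2500, but that leads to other errors. I'm a bit lost on how to shape all of this, so any help would be greatly appreciated!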

- Janum Trivedi

Sounds like a personnel issue. - boztalay

2 Answers

1

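The error says that you have conflicting tensor shapes: the input placeholder has rank 4, while the value being fed has rank 3. This happens because the input data (X) is not of shape [-1, 50, 50, 1]. X needs to be reshaped to the correct shape before it is fed into the network.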

# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.

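import numpy as np

# Add the trailing channel axis; -1 lets NumPy infer the number of samples.
# NumPy is used here rather than tensorflow.reshape so that X stays an
# array that model.fit can feed to the input placeholder.
X = np.reshape(X, [-1, 50, 50, 1])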
network = input_data(shape=[None, 50, 50, 1])

network = conv_2d(network, 50, 50, activation='relu')

# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')

network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)

model = tflearn.DNN(network, tensorboard_verbose=1)

model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
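
The shape bookkeeping can be checked on its own before any network is built. The following is a minimal sketch using NumPy only; `num_samples` and the randomly generated arrays are hypothetical stand-ins for the real training set. Flattening Y to 2500 values per sample assumes the final fully connected layer is given 2500 units, which is an assumption here and not something the answer above states.

```python
import numpy as np

# Hypothetical stand-in data: 8 grayscale road images and 8 binary lane masks.
num_samples = 8
rng = np.random.default_rng(0)
X = rng.random((num_samples, 50, 50)).astype(np.float32)           # brightness in [0.0, 1.0)
Y = (rng.random((num_samples, 50, 50)) > 0.5).astype(np.float32)   # 1 = road, 0 = not road

# input_data(shape=[None, 50, 50, 1]) expects a trailing channel axis.
X = X.reshape(-1, 50, 50, 1)

# A fully connected layer with n units outputs (batch, n), so per-pixel
# targets are flattened to 2500 values per sample to match a 2500-unit layer
# (assumed here; the code above uses 50 units).
Y = Y.reshape(-1, 50 * 50)

print(X.shape)  # (8, 50, 50, 1)
print(Y.shape)  # (8, 2500)
```

Printing the shapes like this before calling model.fit makes a rank mismatch visible without waiting for the feed error.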

- Tom.Smith


- R. S. Nikhil Krishna · Accepted Answer

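Take a look at the imageflow wrapper for tensorflow, which converts a numpy array containing multiple images into a .tfrecords file, the format suggested for use with tensorflow: https://github.com/HamedMP/ImageFlow.

You have to install it using the following command.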

$ pip install imageflow

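Say your numpy array containing some 'k' images is called k_images and the corresponding k labels (one-hot encoded) are stored in k_labels. Creating a .tfrecords file named 'tfr_file.tfrecords' is then as simple as writing the following line.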

imageflow.convert_images(k_images, k_labels, 'tfr_file')
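
For readers unsure what the inputs to that call look like, here is a minimal sketch of building a batch of images and one-hot labels with NumPy. The values of `k`, `num_classes`, `class_ids` and the image shape are hypothetical; the exact array layout ImageFlow expects is not stated in this answer, so the call itself is left as a comment, quoted from above.

```python
import numpy as np

# Hypothetical stand-in data: k grayscale images with integer class ids.
k = 4
num_classes = 3
k_images = np.zeros((k, 50, 50, 1), dtype=np.float32)
class_ids = np.array([0, 2, 1, 2])

# One-hot encode: row i of the identity matrix is the encoding of class i.
k_labels = np.eye(num_classes, dtype=np.float32)[class_ids]

print(k_labels.shape)  # (4, 3)
print(k_labels[1])     # [0. 0. 1.]

# With ImageFlow installed, the call quoted in the answer would then be:
# imageflow.convert_images(k_images, k_labels, 'tfr_file')
```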

Also, Google's Inception model contains code that reads images from folders, assuming each folder represents one label: https://github.com/tensorflow/models/blob/master/inception/inception/data/build_image_data.py
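
The folder-per-label convention can be illustrated with the standard library alone. This is only a sketch of the directory layout, not the Inception script itself; `list_labeled_images`, the folder names and the `.png` filter are hypothetical choices made for the example.

```python
import tempfile
from pathlib import Path


def list_labeled_images(root):
    """Return sorted (file path, label) pairs, using each subfolder name as the label."""
    pairs = []
    for label_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for image_path in sorted(label_dir.glob("*.png")):
            pairs.append((str(image_path), label_dir.name))
    return pairs


# Build a tiny throwaway directory tree to demonstrate the layout.
with tempfile.TemporaryDirectory() as root:
    for label, names in {"road": ["a.png", "b.png"], "not_road": ["c.png"]}.items():
        folder = Path(root) / label
        folder.mkdir()
        for name in names:
            (folder / name).touch()  # empty placeholder files, not real images

    pairs = list_labeled_images(root)
    labels = [label for _, label in pairs]

print(labels)  # ['not_road', 'road', 'road']
```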