The TensorFlow tutorials include an example of using TFRecords with the MNIST dataset. The MNIST dataset is converted to a TFRecords file like this:
def convert_to(data_set, name):
    images = data_set.images
    labels = data_set.labels
    num_examples = data_set.num_examples
    if images.shape[0] != num_examples:
        raise ValueError('Images size %d does not match label size %d.' %
                         (images.shape[0], num_examples))
    rows = images.shape[1]
    cols = images.shape[2]
    depth = images.shape[3]
    filename = os.path.join(FLAGS.directory, name + '.tfrecords')
    print('Writing', filename)
    writer = tf.python_io.TFRecordWriter(filename)
    for index in range(num_examples):
        image_raw = images[index].tostring()
        example = tf.train.Example(features=tf.train.Features(feature={
            'height': _int64_feature(rows),
            'width': _int64_feature(cols),
            'depth': _int64_feature(depth),
            'label': _int64_feature(int(labels[index])),
            'image_raw': _bytes_feature(image_raw)}))
        writer.write(example.SerializeToString())
    writer.close()
It is then read and decoded like this:
def read_and_decode(filename_queue):
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        # Defaults are not specified since both keys are required.
        features={
            'image_raw': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64),
        })
    # Convert from a scalar string tensor (whose single string has
    # length mnist.IMAGE_PIXELS) to a uint8 tensor with shape
    # [mnist.IMAGE_PIXELS].
    image = tf.decode_raw(features['image_raw'], tf.uint8)
    image.set_shape([mnist.IMAGE_PIXELS])
    # OPTIONAL: Could reshape into a 28x28 image and apply distortions
    # here. Since we are not applying any distortions in this
    # example, and the next step expects the image to be flattened
    # into a vector, we don't bother.
    # Convert from [0, 255] -> [-0.5, 0.5] floats.
    image = tf.cast(image, tf.float32) * (1. / 255) - 0.5
    # Convert label from a scalar uint8 tensor to an int32 scalar.
    label = tf.cast(features['label'], tf.int32)
    return image, label
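Question: Is there a way to read images of different sizes from TFRecords? At the moment, because of
image.set_shape([mnist.IMAGE_PIXELS])
the size of every tensor has to be known in advance. This means I can't do something like this: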
width = tf.cast(features['width'], tf.int32)
height = tf.cast(features['height'], tf.int32)
tf.reshape(image, [width, height, 3])
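So how can I use TFRecords in this case? Also, I don't understand why the tutorial authors store the image height and width in the TFRecords file but then ignore them when reading and decoding, relying on a predefined constant instead.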
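To make the intent concrete, here is a minimal, library-free sketch of the behaviour being asked for: each record carries its own height, width and depth, and the reader uses those stored values to recover a per-image shape. It deliberately uses only Python's `struct` module and is not the TFRecords format or API; the record layout and the names `HEADER`, `encode_record` and `decode_record` are hypothetical, introduced only for illustration.

```python
import struct

# Hypothetical record layout (NOT the TFRecords wire format):
# three little-endian uint32 values (height, width, depth),
# followed by the raw uint8 pixel bytes.
HEADER = struct.Struct("<III")


def encode_record(pixels, height, width, depth):
    """Pack one image's dimensions together with its raw pixel bytes."""
    if len(pixels) != height * width * depth:
        raise ValueError("pixel count does not match height * width * depth")
    return HEADER.pack(height, width, depth) + bytes(pixels)


def decode_record(record):
    """Recover the image rows using the dimensions stored in the record."""
    height, width, depth = HEADER.unpack_from(record)
    flat = record[HEADER.size:]
    if len(flat) != height * width * depth:
        raise ValueError("payload length does not match stored dimensions")
    row_len = width * depth
    # Split the flat byte string into `height` rows of `width * depth` values.
    rows = [list(flat[r * row_len:(r + 1) * row_len]) for r in range(height)]
    return rows, (height, width, depth)


# Two images of different sizes go through the same reader.
small = encode_record(range(2 * 3 * 1), 2, 3, 1)
large = encode_record(range(4 * 2 * 3), 4, 2, 3)

for record in (small, large):
    rows, shape = decode_record(record)
    print(shape, len(rows), len(rows[0]))
```

The point of the sketch is that the shape is read from each record at decode time instead of being fixed in advance by a constant such as mnist.IMAGE_PIXELS.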