ValueError: Cannot take the length of Shape with unknown rank.


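I am trying to port our input pipeline to the TensorFlow Dataset API. To do this, we converted the images and labels to the tfrecords format. We then read the tfrecords back through the Dataset API and check that the data read matches the original data. So far so good. Here is the code that reads the tfrecords into a dataset: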

import tensorflow as tf

def _parse_function2(proto):

    # define your tfrecord again. Remember that you saved your image as a string.

    keys_to_features = {"im_path": tf.FixedLenSequenceFeature([], tf.string, allow_missing=True),
                        "im_shape": tf.FixedLenSequenceFeature([], tf.int64, allow_missing=True),
                        "score_shape": tf.FixedLenSequenceFeature([], tf.int64, allow_missing=True),
                        "geo_shape": tf.FixedLenSequenceFeature([], tf.int64, allow_missing=True),
                        "im_patches": tf.FixedLenSequenceFeature([], tf.string, allow_missing=True),
                        "score_patches": tf.FixedLenSequenceFeature([], tf.string, allow_missing=True),
                        "geo_patches": tf.FixedLenSequenceFeature([], tf.string, allow_missing=True),
                        }

    # Load one example
    parsed_features = tf.parse_single_example(serialized=proto, features=keys_to_features)

    parsed_features['im_patches'] = parsed_features['im_patches'][0]
    parsed_features['score_patches'] = parsed_features['score_patches'][0]
    parsed_features['geo_patches'] = parsed_features['geo_patches'][0]

    parsed_features['im_patches'] = tf.decode_raw(parsed_features['im_patches'], tf.uint8)
    parsed_features['im_patches'] = tf.reshape(parsed_features['im_patches'], parsed_features['im_shape'])

    parsed_features['score_patches'] = tf.decode_raw(parsed_features['score_patches'], tf.uint8)
    parsed_features['score_patches'] = tf.reshape(parsed_features['score_patches'], parsed_features['score_shape'])

    parsed_features['geo_patches'] = tf.decode_raw(parsed_features['geo_patches'], tf.int16)
    parsed_features['geo_patches'] = tf.reshape(parsed_features['geo_patches'], parsed_features['geo_shape'])

    return parsed_features['im_patches'], tf.cast(parsed_features["score_patches"],tf.int16), parsed_features["geo_patches"]



def create_dataset2(tfrecord_path):
    # This works with arrays as well
    dataset = tf.data.TFRecordDataset([tfrecord_path], compression_type="ZLIB")

    # Maps the parser on every filepath in the array. You can set the number of parallel loaders here
    dataset = dataset.map(_parse_function2, num_parallel_calls=8)

    # This dataset will go on forever
    dataset = dataset.repeat()

    # Set the batchsize
    dataset = dataset.batch(1)

    return dataset

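The datasets created with the functions above are then passed to the model.fit method as follows. I am following an example from a GitHub Gist on how to pass a dataset to model.fit.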
train_tfrecord = 'data/tfrecords/train/train.tfrecords'
test_tfrecord = 'data/tfrecords/test/test.tfrecords'

train_dataset  = create_dataset2(train_tfrecord)
test_dataset  = create_dataset2(test_tfrecord)


model.fit(
    train_dataset.make_one_shot_iterator(),
    steps_per_epoch=5,
    epochs=10,
    shuffle=True,
    validation_data=test_dataset.make_one_shot_iterator(),
    callbacks=[function1, function2, function3],
    verbose=1)

But at the model.fit call above I get the error "ValueError: Cannot take the length of Shape with unknown rank."
Edit 1: I am using the following code to loop over the dataset and extract the rank, shape, and type of the tensors.
train_tfrecord = 'data/tfrecords/train/train.tfrecords'

with tf.Graph().as_default():

    # Deserialize and report on the fake data
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())

    dataset = tf.data.TFRecordDataset([train_tfrecord], compression_type="ZLIB")
    dataset = dataset.map(_parse_function2)

    iterator = dataset.make_one_shot_iterator()


    while True:
        try:
            next_element = iterator.get_next()
            im_patches, score_patches, geo_patches = next_element

            rank_im_shape = tf.rank(im_patches)
            rank_score_shape = tf.rank(score_patches)
            rank_geo_shape = tf.rank(geo_patches)


            shape_im_shape = tf.shape(im_patches)
            shape_score_shape = tf.shape(score_patches)
            shape_geo_shape = tf.shape(geo_patches)

            [ some_imshape, some_scoreshape, some_geoshape,\
             some_rank_im_shape, some_rank_score_shape, some_rank_geo_shape,
             some_shape_im_shape, some_shape_score_shape, some_shape_geo_shape] = \
                sess.run([ im_patches, score_patches, geo_patches,
                          rank_im_shape, rank_score_shape, rank_geo_shape,
                          shape_im_shape, shape_score_shape, shape_geo_shape])



            print("Rank of the 3 patches ")
            print(some_rank_im_shape)
            print(some_rank_score_shape)
            print(some_rank_geo_shape)

            print("Shapes of the 3 patches ")
            print(some_shape_im_shape)
            print(some_shape_score_shape)
            print(some_shape_geo_shape)

            print("Types of the 3 patches ")
            print(type(im_patches))
            print(type(score_patches))
            print(type(geo_patches))

        except tf.errors.OutOfRangeError:
            break

Here is the output for the two tfrecords.
Rank of the 3 patches 
4
4
4
Shapes of the 3 patches 
[   1 3553 2529    3]
[   1 3553 2529    2]
[   1 3553 2529    5]
Types of the 3 patches 
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'tensorflow.python.framework.ops.Tensor'>
Rank of the 3 patches 
4
4
4
Shapes of the 3 patches 
[   1 3553 5025    3]
[   1 3553 5025    2]
[   1 3553 5025    5]
Types of the 3 patches 
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'tensorflow.python.framework.ops.Tensor'>
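
A note on reading the output above: tf.rank and tf.shape report values computed at run time, whereas the error message refers to the static shape recorded when the graph is built. The following is a minimal sketch of that difference, not a confirmed diagnosis of this pipeline. It assumes that a shape vector of unknown length (as a FixedLenSequenceFeature parsed with allow_missing=True yields) is what tf.reshape receives; the placeholders and the channel count of 3 are stand-ins chosen for illustration.

```python
import tensorflow as tf

with tf.Graph().as_default():
    # Stand-ins for the decoded bytes and the stored shape feature.
    # The shape vector has unknown length, so its number of entries
    # (the output rank) is not known while the graph is being built.
    flat = tf.compat.v1.placeholder(tf.uint8, shape=[None])
    target_shape = tf.compat.v1.placeholder(tf.int64, shape=[None])

    reshaped = tf.reshape(flat, target_shape)
    static_rank_before = reshaped.shape.rank  # static rank right after reshape

    # Declare what is known up front (assumed here: rank 4, 3 channels).
    reshaped.set_shape([None, None, None, 3])
    static_rank_after = reshaped.shape.rank

    # The dynamic rank is itself a tensor and only has a value at run time.
    dynamic_rank = tf.rank(reshaped)

print(static_rank_before, static_rank_after)
```

If this is indeed the cause, calling set_shape on each reshaped tensor inside _parse_function2 would be one way to give Keras a known rank; whether that resolves the error here is untested.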

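One thing I found is that if I try to return the multiple labels as a list and compare the values returned by the iterator script above, I get an error.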
def _parse_function2(proto):

    # ---- everything same as above ----
    # ---- just returning the labels as a list ----


    return parsed_features['im_patches'], [tf.cast(parsed_features["score_patches"],tf.int16), parsed_features["geo_patches"]]

Capturing the above return value as follows:
    while True:
        try:
            next_element = iterator.get_next()
            im_patches, [score_patches, geo_patches] = next_element

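The error is: TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.
Edit 2: Here is the definition of the fit function. It appears to accept TensorFlow tensors as well as steps_per_epoch.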
def fit(self,
      x=None,
      y=None,
      batch_size=None,
      epochs=1,
      verbose=1,
      callbacks=None,
      validation_split=0.,
      validation_data=None,
      shuffle=True,
      class_weight=None,
      sample_weight=None,
      initial_epoch=0,
      steps_per_epoch=None,
      validation_steps=None,
      max_queue_size=10,
      workers=1,
      use_multiprocessing=False,
      **kwargs):
"""Trains the model for a fixed number of epochs (iterations on a dataset).

Arguments:
    x: Input data. It could be:
      - A Numpy array (or array-like), or a list of arrays
        (in case the model has multiple inputs).
      - A TensorFlow tensor, or a list of tensors
        (in case the model has multiple inputs).
      - A dict mapping input names to the corresponding array/tensors,
        if the model has named inputs.
      - A `tf.data` dataset or a dataset iterator. Should return a tuple
        of either `(inputs, targets)` or
        `(inputs, targets, sample_weights)`.
      - A generator or `keras.utils.Sequence` returning `(inputs, targets)`
        or `(inputs, targets, sample weights)`.
    y: Target data. Like the input data `x`,
      it could be either Numpy array(s) or TensorFlow tensor(s).
      It should be consistent with `x` (you cannot have Numpy inputs and
      tensor targets, or inversely). If `x` is a dataset, dataset
      iterator, generator, or `keras.utils.Sequence` instance, `y` should
      not be specified (since targets will be obtained from `x`).
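
The docstring above asks for a tuple of (inputs, targets) or (inputs, targets, sample_weights), so a three-element tuple such as the one returned by _parse_function2 may be read as inputs, targets, and sample weights rather than one input and two targets. The sketch below illustrates how tf.data appears to treat a nested list versus a nested tuple in a map function; the toy range dataset and the lambdas are hypothetical stand-ins, not the original pipeline.

```python
import tensorflow as tf

base = tf.data.Dataset.range(3)

# Hypothetical stand-ins for (image, [score, geo]) and (image, (score, geo)).
as_list = base.map(lambda x: (x, [x, x + 1]))
as_tuple = base.map(lambda x: (x, (x, x + 1)))

# Inspect the structure each dataset will hand to its consumer.
list_shapes = tf.compat.v1.data.get_output_shapes(as_list)
tuple_shapes = tf.compat.v1.data.get_output_shapes(as_tuple)

print(list_shapes)   # second component is a single shape (the list was packed)
print(tuple_shapes)  # second component is a tuple of two shapes
```

If the nested list is packed into one tensor in this way, unpacking it as [score_patches, geo_patches] would mean iterating over a tensor, which would be consistent with the TypeError reported earlier. Returning (inputs, (score, geo)) keeps the two targets as separate components.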

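Does the iterator output numpy arrays with defined shapes? Please print the type and shape of each output of the iterator. - Daniel Möller
Okay... Keras generators cannot receive tensors. They must receive numpy arrays. - Daniel Möller
@DanielMöller: Is this question related to mine? https://dev59.com/k1gQ5IYBdhLWcg3w0HIA - user238607
I am confused by that snippet... fit should not take steps_per_epoch or an iterator. The only method that accepts a generator is fit_generator. That makes me think the example may be flawed. - Daniel Möller
@DanielMöller: I have posted the definition of the fit function. According to the definition, it should be able to accept tensors as input along with steps_per_epoch. - user238607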
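
For reference on the point debated in these comments, here is a minimal sketch of passing a tf.data dataset straight to fit with steps_per_epoch, assuming a tf.keras version whose fit docstring matches the one quoted in the question. The toy model, the random data, and all shapes are made up for illustration and are unrelated to the original network.

```python
import numpy as np
import tensorflow as tf

# Toy data with fully known static shapes (hypothetical).
features = np.random.rand(8, 4).astype("float32")
targets = np.random.rand(8, 1).astype("float32")

# Each element is an (inputs, targets) tuple, as the fit docstring requires.
dataset = tf.data.Dataset.from_tensor_slices((features, targets))
dataset = dataset.repeat().batch(2)

# A tiny single-output model built with the functional API.
inputs = tf.keras.Input(shape=(4,))
outputs = tf.keras.layers.Dense(1)(inputs)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="sgd", loss="mse")

# Because the dataset repeats forever, steps_per_epoch bounds each epoch.
history = model.fit(dataset, steps_per_epoch=5, epochs=2, verbose=0)
print(len(history.history["loss"]))
```

This only shows that fit can consume a dataset when element shapes are statically known; it does not reproduce the unknown-rank situation from the question.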