Keras / Tensorflow: Predict Using the tf.data.Dataset API

Question

python tensorflow keras tensorflow-datasets

9

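I'm building a model in Keras with the Tensorflow backend for this problem: https://www.kaggle.com/cfpb/us-consumer-finance-complaints (just for practice).

I train my Keras model with the tf.data.Dataset API. Now I have a Pandas DataFrame, df_testing, whose columns are complaint (a string) and label (also a string). I want to predict on these new samples. I create a tf.data.Dataset object, run the preprocessing, create an iterator, and call predict on my model: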

data = df_testing["complaint"].values
labels = df_testing["label"].values

dataset = tf.data.Dataset.from_tensor_slices((data))
dataset = dataset.map(lambda x: ({'reviews': x}))
dataset = dataset.batch(self.batch_size).repeat()
dataset = dataset.map(lambda x: self.preprocess_text(x, self.data_table))
dataset = dataset.map(lambda x: x['reviews'])
dataset = dataset.make_initializable_iterator()

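My training used a tf.data.Dataset in which each element looks like ({'reviews': "The movie was great"}, "positive"), so I'm mimicking that approach here for prediction. Also, my preprocessing just converts the strings into Tensors of integers.

When I call: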

preds = model.predict(dataset)

my predict call fails with:

ValueError: When using iterators as input to a model, you should specify the `steps` argument.

So I changed the call to:

preds = model.predict(dataset, steps=3)

But now I get:

ValueError: Please provide data as a list or tuple of 2 elements  - input and target pair. Received Tensor("IteratorGetNext_2:0", shape=(?, 100), dtype=int32)

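What am I doing wrong here? When predicting, I shouldn't need to provide a tuple of 2 elements, since I don't need labels.

Thanks for any help you can provide!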

- anon_swe

I think this may be related to the post-processing you apply to the dataset after .batch(..).repeat(). - Roy Shilkrot

2 Answers

2

The following code works for me (tested on tensorflow 1.10.0). [tl;dr] Just pass an empty dict as a dummy input and specify the number of steps:

model.predict(x={},steps=4)

Full code:

import numpy as np
import tensorflow as tf
from tensorflow.data import Dataset
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model


# dummy data:
x = np.arange(4).reshape(-1, 1).astype('float32')
y = np.arange(5, 9).reshape(-1, 1).astype('float32')

# build the Datasets
ds_x = Dataset.from_tensor_slices(x).repeat().batch(4)
it_x = ds_x.make_one_shot_iterator()

ds_y = Dataset.from_tensor_slices(y).repeat().batch(4)
it_y = ds_y.make_one_shot_iterator()


# build compile and train the model
input_vals = Input(tensor=it_x.get_next())
output = Dense(1, activation='relu')(input_vals)
model = Model(inputs=input_vals, outputs=output)
model.compile('rmsprop', 'mse', target_tensors=[it_y.get_next()])
model.fit(steps_per_epoch=1, epochs=5, verbose=2)

# infer using the dataset
model.predict(x={},steps=4)

- ot226

- lmartens · Accepted Answer

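Which version of Keras are you using? I can't find that specific error message in the codebase, but I think I've found where it used to be.

Here is the error in a version of the code that I think is close to the one you are running (linked commit). Here is the updated version of the error: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/engine/training_eager.py#L464 The conditions for the input validation have changed (the latest version would accept your input), but the relevant point is that the error message is much clearer: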

raise ValueError(
    'Please provide data as a list or tuple of 1, 2, or 3 elements '
    ' - `(input)`, or `(input, target)`, or `(input, target,'
    'sample_weights)`. Received %s. We do not use the `target` or'
    '`sample_weights` value here.' % inputs.output_shapes)

In the predict function the target value is never used, so it can be anything. Looking at the rest of the function, next_element[1] is never used.
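
To make the point concrete, here is a minimal pure-Python sketch of a check shaped like the relaxed validation quoted above. The function name unpack_for_predict is hypothetical and this is not TensorFlow's actual code; it only illustrates that whatever sits in the target slot is dropped before prediction.

```python
def unpack_for_predict(element):
    """Return only the model input from one dataset element.

    Hypothetical stand-in for the relaxed check described in this answer;
    it is NOT TensorFlow's implementation.
    """
    # Assumption: a bare input is treated like a 1-element tuple.
    if not isinstance(element, (list, tuple)):
        element = (element,)
    if len(element) not in (1, 2, 3):
        raise ValueError(
            "Please provide data as a list or tuple of 1, 2, or 3 elements. "
            "Received %d elements." % len(element)
        )
    # element[1] (target) and element[2] (sample weights) are never read.
    return element[0]


inputs_only = unpack_for_predict(([1, 2, 3],))
with_dummy_target = unpack_for_predict(([1, 2, 3], "anything"))
```

Both calls return the same input, which is why a placeholder target is harmless at prediction time.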

[tl;dr] With your current version, add a dummy target value to your data, or update your Keras.
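
A rough sketch of the dummy-target workaround follows. It assumes a TensorFlow 2.x install with eager execution, where predict accepts a tf.data.Dataset directly; on 1.10 you would instead pass the iterator together with steps, as in the question. The toy data and the one-layer model are placeholders, not the asker's pipeline.

```python
import numpy as np
import tensorflow as tf

# Toy inputs standing in for the preprocessed complaints (assumption).
features = np.arange(8, dtype="float32").reshape(-1, 1)

dataset = tf.data.Dataset.from_tensor_slices(features)
# Pair every input with a dummy target; predict ignores the second slot.
dataset = dataset.map(lambda x: (x, tf.zeros_like(x)))
dataset = dataset.batch(4)

# Placeholder model, just enough to call predict.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])

preds = model.predict(dataset, verbose=0)
print(preds.shape)
```

Note that the dataset here is finite (no repeat()), so no steps argument is needed; with a repeated dataset you would still have to pass steps.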