Unable to train my Keras model: (Data cardinality is ambiguous)

Question

machine-learning · nlp · text-classification · tensorflow2.0 · tf.keras

7

I am using the bert-for-tf2 library for a multi-class classification problem. I have built the model, but training fails with the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-d9f382cba5d4> in <module>()
----> 1 model.fit([INPUT_IDS,INPUT_MASKS,INPUT_SEGS], list(train.SECTION))

5 frames
/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/data_adapter.py in __init__(self, x, y, sample_weights, batch_size, epochs, steps, shuffle, **kwargs)
243             label, ", ".join([str(i.shape[0]) for i in nest.flatten(data)]))
244       msg += "Please provide data which shares the same first dimension."
--> 245       raise ValueError(msg)
246     num_samples = num_samples.pop()
247 

ValueError: Data cardinality is ambiguous:
x sizes: 3
y sizes: 6102
Please provide data which shares the same first dimension.

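I am following a Medium article titled "Simple BERT using TensorFlow 2.0". The git repository for the bert-for-tf2 library can be found here. Please find the entire code here. Here is a link to my Colab notebook. Any help is much appreciated!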

- Amal Vijayan

1 Answer

Content provided by Stack Overflow.

- Yoganand · Accepted Answer

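I ran into the same issue, and I do not know why the number of inputs and outputs would have to be the same. The error is raised when x.shape[0] != y.shape[0], and it seems to come from one of the data adapters. Here: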

x = [INPUT_IDS,INPUT_MASKS,INPUT_SEGS]
y = list(train.SECTION)
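The mismatch can be reproduced with NumPy alone. The sketch below assumes, as one plausible reading of "x sizes: 3", that the list of three inputs ends up being treated as a single array-like whose first dimension is 3; the traceback does not confirm this. The array names, `SEQ_LEN`, and the helper `first_dims` are hypothetical stand-ins, not part of the question's code.

```python
import numpy as np

NUM_SAMPLES, SEQ_LEN = 6102, 16  # SEQ_LEN is an arbitrary illustrative value

# Stand-ins for the question's three BERT inputs and its labels.
input_ids = np.zeros((NUM_SAMPLES, SEQ_LEN), dtype=np.int32)
input_masks = np.ones((NUM_SAMPLES, SEQ_LEN), dtype=np.int32)
input_segs = np.zeros((NUM_SAMPLES, SEQ_LEN), dtype=np.int32)
labels = np.zeros(NUM_SAMPLES, dtype=np.int32)


def first_dims(arrays):
    """Return the set of first-dimension sizes across a list of arrays."""
    return {np.asarray(a).shape[0] for a in arrays}


# Stacking the three inputs into one array gives a leading dimension of 3,
# which cannot be matched against 6102 labels.
stacked = np.asarray([input_ids, input_masks, input_segs])
print(stacked.shape)  # (3, 6102, 16)

# Kept as three separate arrays, every input shares the labels' first dimension.
print(first_dims([input_ids, input_masks, input_segs, labels]))  # {6102}
```

If the set returned by `first_dims` holds more than one value, the arrays disagree on the number of samples, which is the condition the error message describes.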

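So, instead of using

model.fit([INPUT_IDS,INPUT_MASKS,INPUT_SEGS], list(train.SECTION))

try passing the inputs and outputs as dictionaries keyed by layer name (check the model summary for the names, or give the layers explicit names yourself). That worked for me: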

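model.fit(
    # Keys must match the model's input layer names (see model.summary()).
    {
        "input_word_ids": INPUT_IDS,
        "input_mask": INPUT_MASKS,
        "segment_ids": INPUT_SEGS,
    },
    # Key must match the model's output layer name.
    {"dense_1": list(train.SECTION)},
)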

Make sure the inputs and outputs are NumPy arrays, for example by converting them with np.asarray(), because Keras looks for the .shape attribute.
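A minimal sketch of that conversion, assuming the tokenized inputs and labels start out as plain Python lists. The variable names and token values are made up for illustration.

```python
import numpy as np

# Hypothetical stand-ins: tokenized inputs and labels held as plain Python lists.
input_ids_list = [[101, 2023, 102, 0], [101, 2003, 102, 0], [101, 1037, 102, 0]]
labels_list = [0, 2, 1]

# A plain list has no .shape attribute for Keras to inspect.
print(hasattr(input_ids_list, "shape"))  # False

# np.asarray() turns each list into an array with an explicit shape.
input_ids = np.asarray(input_ids_list, dtype=np.int32)
labels = np.asarray(labels_list, dtype=np.int32)

print(input_ids.shape)  # (3, 4)
print(labels.shape)     # (3,)

# Both arrays now agree on the number of samples (the first dimension).
assert input_ids.shape[0] == labels.shape[0]
```

Checking the first dimensions like this before calling model.fit makes a sample-count mismatch visible up front.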