python.data.ops.dataset_ops.BatchDataset - how to use it to create training and test datasets

Question

6

I'm using TensorFlow to walk a directory and load the images I need to train a neural network.

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    wk_dir,
    labels="inferred",
    label_mode="int",
    class_names=None,
    color_mode="grayscale",
    batch_size=batches,
    image_size=image_dim,
    shuffle=True,
    seed=1968,
    validation_split=0.2,
    subset="training",
    interpolation="bilinear",
    follow_links=False,
)

Found 127561 files belonging to 3 classes.
Using 102049 files for training.

Result: it worked... Now I'm trying to feed this into a model, but I'm not sure how to handle it...

print(train_ds)
<BatchDataset shapes: ((None, 576, 432, None), (None,)), types: (tf.float32, tf.int32)>

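Does my array have two elements? Does the first have 4 elements, 2 of which are empty, and is the second element the class label?
I tried to split the BatchDataset, but got the error TypeError: 'BatchDataset' object is not subscriptable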

How do I work with a TF object of type python.data.ops.dataset_ops.BatchDataset?

- Cillin O Foghlu

2 Answers

-1

You have one dataset, named train_ds. If you want a validation dataset, you need to write one more statement; the only difference is the subset name:

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    wk_dir,
    labels="inferred",
    label_mode="int",
    class_names=None,
    color_mode="grayscale",
    batch_size=batches,
    image_size=image_dim,
    shuffle=True,
    seed=1968,
    validation_split=0.2,
    subset="validation",
    interpolation="bilinear",
    follow_links=False,
)
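The title also asks about a test set. One common approach is to carve it out of the dataset created with subset="validation" using take and skip. Below is a minimal sketch of that idea; since the image directory isn't available here, `holdout_ds` is a hypothetical stand-in built from zeros, and the half/half split is just an assumption for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the validation subset: 20 samples, batches of 4.
features = np.zeros((20, 8, 6, 1), dtype=np.float32)
targets = np.arange(20, dtype=np.int32) % 3
holdout_ds = tf.data.Dataset.from_tensor_slices((features, targets)).batch(4)

# Count the batches, then hand the first half to the test set.
n_batches = int(tf.data.experimental.cardinality(holdout_ds).numpy())
n_test = n_batches // 2

test_ds = holdout_ds.take(n_test)   # first n_test batches
val_ds = holdout_ds.skip(n_test)    # the remaining batches

n_test_batches = int(tf.data.experimental.cardinality(test_ds).numpy())
n_val_batches = int(tf.data.experimental.cardinality(val_ds).numpy())
print(n_batches, n_test_batches, n_val_batches)
```

One caveat: if the source dataset reshuffles on every pass, take and skip may not return the same disjoint batches each time, so it may be safer to build the held-out subset with shuffle=False before splitting it.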

- VaibhavPandey

- Ena · Accepted Answer

If you want to see what this BatchDataset looks like, you can try:
```
print(list(train_ds.as_numpy_iterator()))
```
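Converting the whole dataset to a list loads every batch into memory, which can be heavy with ~100k images. A lighter way to look at it is to pull a single batch. As far as I can tell, the printed shapes `((None, 576, 432, None), (None,))` mean each element is a pair: a batch of images shaped (batch, height, width, channels) and a batch of integer labels, with None marking dimensions that aren't known statically (the last batch can be smaller, for example). The sketch below uses a small synthetic dataset (`demo_ds`, a hypothetical stand-in for train_ds) to show the idea:

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for the image directory: 10 grayscale 8x6 "images", 3 classes.
images = np.zeros((10, 8, 6, 1), dtype=np.float32)
labels = np.arange(10, dtype=np.int32) % 3
demo_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

# element_spec describes one element: an (images, labels) tuple of tensors.
print(demo_ds.element_spec)

# A dataset is iterable, not subscriptable, so use take(1) instead of demo_ds[0].
for image_batch, label_batch in demo_ds.take(1):
    first_batch_shapes = (image_batch.numpy().shape, label_batch.numpy().shape)
    print(first_batch_shapes)
```

This is also why indexing raised the TypeError in the question: batches come out by iterating, and each one unpacks into images and labels.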
More about TensorFlow Data and BatchDataset: https://www.tensorflow.org/guide/data#batching_dataset_elements
It looks like there isn't enough information to tell you exactly how to build the model, but I can recommend this course to see how to build a model with a BatchDataset as its input: https://www.coursera.org/projects/fine-tune-bert-tensorflow
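For what it's worth, a batched dataset of (images, labels) pairs can usually be passed straight to Keras without unpacking it. Here is a rough sketch under stated assumptions: `toy_ds` is a hypothetical synthetic stand-in for train_ds, the tiny Flatten + Dense model is only a placeholder and not a recommended architecture, and the sparse categorical loss is chosen to match label_mode="int".

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 12 grayscale 8x6 images with integer labels in {0, 1, 2}.
rng = np.random.default_rng(0)
images = rng.random((12, 8, 6, 1), dtype=np.float32)
labels = np.arange(12, dtype=np.int32) % 3
toy_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

# Placeholder model: one logit per class.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8, 6, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(3),
])
model.compile(
    optimizer="adam",
    # Integer labels (label_mode="int") pair with the sparse categorical loss.
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# The dataset already yields (inputs, targets) batches, so no y= or batch_size= is passed.
history = model.fit(toy_ds, epochs=1, verbose=0)
predictions = model.predict(toy_ds, verbose=0)
print(predictions.shape)
```

With the real data you would pass train_ds to fit in the same way, and a validation dataset via the validation_data argument.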