Make TensorFlow use training data generated on the fly by a custom CUDA routine

Question

python tensorflow gpu

11

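Suppose we generate our own training data (for example, by sampling from some diffusion process and computing some quantity of interest on it), and that we have our own CUDA routine, called generate_data, which generates labels in GPU memory for a given set of inputs.

Hence, we are in a special setting where we can generate as many batches of training data as we want, in an "online" fashion (at each batch iteration, we call the generate_data routine to produce a new batch and discard the old one).

Since the data is generated on the GPU, is there a way to make TensorFlow (the Python API) use it directly during training (for example, to fill a placeholder)? That way, the pipeline would be efficient.

My understanding is that, in such a setup, you currently need to copy the data from the GPU to the CPU and then let TensorFlow copy it from the CPU back to the GPU, which is wasteful because the copies are unnecessary.

EDIT: If it helps, we can assume that the CUDA routine is implemented using Numba's CUDA JIT compiler.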

- BS.

Possible duplicate: https://dev59.com/_FgQ5IYBdhLWcg3w6IKg - user4668606

1

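This would be an interesting feature, since interoperating with external GPU data would open up many possibilities, but I don't think anything like that exists at the moment. TensorFlow uses CUDA through many layers and abstractions in C++. I imagine you would almost certainly need to write at least a custom op to do this, and I am not sure it is possible without further modifications to the library. - jdehesa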

1 Answer

- Chan Kha Vu · Accepted Answer

This is definitely not a complete answer, but hopefully it helps.

You can integrate your CUDA routine into TensorFlow by writing a custom op. There is currently no other way in TensorFlow to interact with other CUDA routines.

As for running the training loop entirely on the GPU, we can build it with tf.while_loop, in a way very similar to this SO question:

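import tensorflow as tf  # TensorFlow 1.x graph-mode API

n = 1000  # number of training iterations (example value)
i = tf.Variable(0, name='loop_i')

def cond(i):
    return i < n

def body(i):
    # Build the graph for the custom routine and our model.
    # CustomCUDARoutine, MyModel and loss_func are placeholders for your own code.
    x, ground_truth = CustomCUDARoutine(random_seed, ...)
    predictions = MyModel(x, ...)

    # Define the loss and the optimizer step (the learning rate is an example value)
    loss = loss_func(ground_truth, predictions)
    optim = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)

    # Loop body: increment the counter only after the training step has run
    return tf.tuple([tf.add(i, 1)], control_inputs=[optim])

loop = tf.while_loop(cond, body, [i])

# Run the loop inside an explicit session, after initializing the variables
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(loop)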
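
For readers on TensorFlow 2.x, the same idea (generate a batch, take one training step, discard the batch, all inside a single compiled loop) can be sketched roughly as follows. This is not part of the original answer and rests on several assumptions: generate_batch is a hypothetical stand-in that draws synthetic data with tf.random.normal instead of calling a real custom CUDA op, the "model" is a two-parameter linear fit, and the update is a hand-written gradient descent step. It illustrates only the loop structure; it does not by itself make external GPU buffers visible to TensorFlow, which would still require a custom op.

```python
import tensorflow as tf

tf.random.set_seed(0)

def generate_batch(batch_size):
    # Hypothetical stand-in for the custom CUDA routine: in a real setup this
    # would be replaced by a custom op that produces (inputs, labels) on the GPU.
    x = tf.random.normal([batch_size, 1])
    y = 3.0 * x + 1.0
    return x, y

# Toy model parameters (assumption: a linear model y ~ w * x + b)
w = tf.Variable(0.0)
b = tf.Variable(0.0)
learning_rate = 0.1  # example value

@tf.function
def train(n_steps, batch_size):
    loss = tf.constant(0.0)
    # AutoGraph lowers this loop over tf.range to a graph-level while loop,
    # so Python is not re-entered between iterations.
    for _ in tf.range(n_steps):
        x, y = generate_batch(batch_size)  # fresh batch; the previous one is dropped
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean(tf.square(w * x + b - y))
        grad_w, grad_b = tape.gradient(loss, [w, b])
        # Plain gradient descent step, written by hand to keep the sketch small
        w.assign_sub(learning_rate * grad_w)
        b.assign_sub(learning_rate * grad_b)
    return loss

final_loss = train(200, 64)
```

After the call, w and b should be close to the values used inside generate_batch (3 and 1), and every batch was created and consumed inside the same traced function.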