How can I find the maximum number of threads per block from Python code with numba or tensorflow installed?

Question

4

Is there Python code for this that works with numba or tensorflow installed? For example, if I want the GPU memory information, I can simply use the following code:

from numba import cuda
gpus = cuda.gpus.lst
for gpu in gpus:
    with gpu:
        meminfo = cuda.current_context().get_memory_info()
        print("%s, free: %s bytes, total: %s bytes" % (gpu, meminfo[0], meminfo[1]))

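That works in Numba, but I cannot find any code that reports the maximum number of threads per block. I would like the code to detect the maximum number of threads per block and then use it to calculate the number of blocks needed in each direction.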

- ZHANG Juenjie

Could you please rephrase your question? - Josh Abraham

2 Answers

1

Is there any Python code for this with numba or tensorflow installed?

Not that I am aware of. The numba device class appears to have a facility for retrieving device attributes:

In [9]: ddd=numba.cuda.get_current_device()

In [10]: print(ddd)
<CUDA device 0 'b'GeForce GTX 970''>

In [11]: print(ddd.attributes)
{}

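But at least in the version of numba I am using (0.31.0), the dictionary does not appear to be populated. Also, at this stage numba does not seem to expose any of the traditional driver API functionality for retrieving device or compiled-function attributes.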

- talonmies

- user3691497 · Accepted Answer

from numba import cuda
gpu = cuda.get_current_device()
print("name = %s" % gpu.name)
print("maxThreadsPerBlock = %s" % str(gpu.MAX_THREADS_PER_BLOCK))
print("maxBlockDimX = %s" % str(gpu.MAX_BLOCK_DIM_X))
print("maxBlockDimY = %s" % str(gpu.MAX_BLOCK_DIM_Y))
print("maxBlockDimZ = %s" % str(gpu.MAX_BLOCK_DIM_Z))
print("maxGridDimX = %s" % str(gpu.MAX_GRID_DIM_X))
print("maxGridDimY = %s" % str(gpu.MAX_GRID_DIM_Y))
print("maxGridDimZ = %s" % str(gpu.MAX_GRID_DIM_Z))
print("maxSharedMemoryPerBlock = %s" % str(gpu.MAX_SHARED_MEMORY_PER_BLOCK))
print("asyncEngineCount = %s" % str(gpu.ASYNC_ENGINE_COUNT))
print("canMapHostMemory = %s" % str(gpu.CAN_MAP_HOST_MEMORY))
print("multiProcessorCount = %s" % str(gpu.MULTIPROCESSOR_COUNT))
print("warpSize = %s" % str(gpu.WARP_SIZE))
print("unifiedAddressing = %s" % str(gpu.UNIFIED_ADDRESSING))
print("pciBusID = %s" % str(gpu.PCI_BUS_ID))
print("pciDeviceID = %s" % str(gpu.PCI_DEVICE_ID))

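These appear to be all the attributes currently supported. I found the list here; it matches the enum values in the CUDA documentation, so it is fairly easy to extend. For example, I added CU_DEVICE_ATTRIBUTE_TOTAL_CONSTANT_MEMORY = 9, and it now works as expected.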

If I find the time, I will try to complete the list, update the documentation, and submit a PR.
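
Coming back to the second half of the question (computing the number of blocks in each direction), here is a minimal sketch of how the queried limit might be used. The helper names `query_max_threads`, `square_block_shape`, and `blocks_per_grid` are hypothetical and not part of numba. The fallback of 1024 threads is an assumption for machines without a usable GPU, and the power-of-two block shape is only one possible choice. The sketch does not check the per-dimension limits such as `MAX_BLOCK_DIM_Z`.

```python
def query_max_threads(default=1024):
    """Return MAX_THREADS_PER_BLOCK via numba, or `default` (an assumed value)
    when numba is missing or no CUDA device is usable."""
    try:
        from numba import cuda
        if cuda.is_available():
            return cuda.get_current_device().MAX_THREADS_PER_BLOCK
    except Exception:  # numba not installed, or the CUDA driver failed to load
        pass
    return default


def square_block_shape(max_threads_per_block, ndim):
    """Pick the largest power-of-two edge such that edge**ndim fits the limit."""
    edge = 1
    while (edge * 2) ** ndim <= max_threads_per_block:
        edge *= 2
    return (edge,) * ndim


def blocks_per_grid(shape, block_shape):
    """Ceil-divide each problem dimension by the block size in that dimension."""
    return tuple((n + b - 1) // b for n, b in zip(shape, block_shape))


if __name__ == "__main__":
    data_shape = (1000, 500)  # hypothetical 2D problem size
    threads = square_block_shape(query_max_threads(), len(data_shape))
    blocks = blocks_per_grid(data_shape, threads)
    print("threads per block:", threads)
    print("blocks per grid:  ", blocks)
    # A numba kernel would then be launched as: kernel[blocks, threads](...)
```

Because the grid is rounded up, some threads fall outside the data, so the kernel itself still needs a bounds check against the array shape.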