I have a project that requires C++11, so I separate the files into two categories: those that use C++11, and those that use C++03 and are therefore compatible with the nvcc compiler. When a kernel is not a template function, it is easy to load the module with cuModuleLoadDataEx and look up the function by name with cuModuleGetFunction. However, when the kernel is a template, the function name is mangled after explicit specialization. This makes it difficult to obtain a handle to the function after loading the module using the CUDA Driver API. For example, consider this function.
template <class T, class SizeType>
__global__ void
vector_add(const T* a, const T* b, T* c, const SizeType dim)
{
    const SizeType i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < dim) { c[i] = a[i] + b[i]; }
}
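After compiling it to PTX code, the mangled name is _Z10vector_addIfjEvPKT_S2_PS0_T0_. How can I easily find and load template kernel functions from my host code, without manually looking them up in the file and copying their names?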
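To make the problem concrete, here is a minimal sketch of one way the lookup could be automated on the host: scan the PTX text for `.entry` directives, demangle each symbol, and pick the one whose readable signature matches. This is an outline under stated assumptions, not a verified solution. The helpers `demangle`, `entry_names`, and `find_kernel` are hypothetical names introduced for illustration, `abi::__cxa_demangle` assumes a GCC or Clang host toolchain, and the scan assumes that each kernel symbol directly follows `.entry` in the PTX that nvcc emits.

```cpp
#include <cctype>
#include <cstdlib>
#include <cxxabi.h>  // abi::__cxa_demangle; assumes a GCC/Clang host compiler
#include <string>
#include <vector>

// Hypothetical helper: demangle an Itanium-ABI symbol.
// Returns an empty string if the symbol cannot be demangled
// (for example, an extern "C" kernel name).
std::string demangle(const std::string& mangled)
{
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    std::string result = (status == 0 && raw != nullptr) ? raw : "";
    std::free(raw);
    return result;
}

// Hypothetical helper: collect the identifier that follows each ".entry"
// directive in the PTX text. Assumes the symbol is separated from ".entry"
// only by whitespace.
std::vector<std::string> entry_names(const std::string& ptx)
{
    std::vector<std::string> names;
    const std::string marker = ".entry";
    std::string::size_type pos = 0;
    while ((pos = ptx.find(marker, pos)) != std::string::npos) {
        pos += marker.size();
        while (pos < ptx.size() &&
               std::isspace(static_cast<unsigned char>(ptx[pos]))) {
            ++pos;
        }
        std::string::size_type end = pos;
        while (end < ptx.size() &&
               (std::isalnum(static_cast<unsigned char>(ptx[end])) ||
                ptx[end] == '_' || ptx[end] == '$')) {
            ++end;
        }
        if (end > pos) { names.push_back(ptx.substr(pos, end - pos)); }
        pos = end;
    }
    return names;
}

// Hypothetical helper: return the first mangled entry name whose demangled
// form contains `wanted`, e.g. "vector_add<float, unsigned int>".
// Returns an empty string if nothing matches.
std::string find_kernel(const std::string& ptx, const std::string& wanted)
{
    for (const std::string& name : entry_names(ptx)) {
        if (demangle(name).find(wanted) != std::string::npos) { return name; }
    }
    return "";
}
```

The string returned by `find_kernel` could then be passed as the `name` argument of cuModuleGetFunction. Matching on the demangled text keeps the host code readable, but it depends on the exact spelling the demangler produces for each type, so it is likely to be fragile across toolchains.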
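… after cuModuleLoadDataEx, I still need to know the name of the function in order to retrieve a handle to it. - void-pointer
… specify the same JIT options during the build using the --ptxas-options flag. Nevertheless, I would still like to know whether there is a more elegant solution. - void-pointer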