How do I correctly link a CUDA header file with device functions?

Question

I'm trying to decouple my code a bit, and something is going wrong. The compilation error:

error: calling a __host__ function("DecoupledCallGpu") from a __global__ function("kernel") is not allowed

Code excerpt:

main.c (calls the CUDA host function):

#include "cuda_computations.h"
...
ComputeSomething(&var1,&var2);
...

cuda_computations.cu (contains the kernel and the main host function, and includes the header that declares the device function):

#include "cuda_computations.h"
#include "decoupled_functions.cuh"
...
__global__ void kernel(){
...
DecoupledCallGpu(&var_kernel);
}

void ComputeSomething(int *var1, int *var2){
//allocate memory and etc..
...
kernel<<<20,512>>>();
//cleanup
...
}

decoupled_functions.cuh:

#ifndef _DECOUPLEDFUNCTIONS_H_
#define _DECOUPLEDFUNCTIONS_H_

void DecoupledCallGpu(int *var);

#endif

decoupled_functions.cu:

#include "decoupled_functions.cuh"

__device__ void DecoupledCallGpu(int *var){
  *var=0;
}

Compilation:

nvcc -g --ptxas-options=-v -arch=sm_30 -c cuda_computations.cu -o cuda_computations.o -lcudart

Question: why is DecoupledCallGpu being called from a host function rather than from the kernel?

P.S.: I can share the actual code if needed.

- Denys S.

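None of the code snippets you've shown contain the "ComputeDensityGpu" or "DoColision" functions that are actually named in the error message, which leaves us guessing. My guess, though, is that the prototype for DecoupledCallGpu in decoupled_functions.cuh is missing the __device__ decorator. Also, compiling a device function separately from the compilation unit that calls it probably means you need to use separate compilation and linking. - Robert Crovella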

1 Answer

- Robert Crovella · Accepted Answer

Add the __device__ decorator to the prototype in decoupled_functions.cuh. That should take care of the error message you are seeing.
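A minimal sketch of the corrected header, assuming everything else in the file stays exactly as posted in the question:

```cuda
// decoupled_functions.cuh, with the prototype decorated
#ifndef _DECOUPLEDFUNCTIONS_H_
#define _DECOUPLEDFUNCTIONS_H_

// __device__ tells every compilation unit that includes this header
// that the function runs on the GPU and may be called from a kernel.
__device__ void DecoupledCallGpu(int *var);

#endif
```

With this declaration visible, the call inside kernel() is a __global__ to __device__ call, which is allowed.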

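Then you will need to use separate compilation and linking between your modules. So instead of compiling with -c, compile with -dc, and your link command will need to be modified as well. A basic example is here.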
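As a rough illustration, the listing below stands in for the three files in one place. The kernel parameter `out`, the helper `RunDecoupledDemo`, and the output name `app` are assumptions added for the sketch; they are not in the question. The nvcc lines in the comments show one plausible way to build the split version, assuming main.o was already compiled from main.c and that -arch matches your GPU.

```cuda
// Split across files as marked, a build could look like this:
//   nvcc -arch=sm_30 -dc cuda_computations.cu decoupled_functions.cu
//   nvcc -arch=sm_30 cuda_computations.o decoupled_functions.o main.o -o app
// (-dc emits relocatable device code; nvcc performs the device link in the second step.)
#include <cuda_runtime.h>

// ---- decoupled_functions.cuh ----
__device__ void DecoupledCallGpu(int *var);

// ---- cuda_computations.cu ----
// Hypothetical signature: the kernel in the question takes no arguments.
__global__ void kernel(int *out) {
    int var_kernel = 1;
    DecoupledCallGpu(&var_kernel);  // resolved at device-link time when split
    *out = var_kernel;
}

// Hypothetical host helper: launches the kernel and returns the value it
// wrote, or -1 if a CUDA runtime call fails.
int RunDecoupledDemo() {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;
    kernel<<<1, 1>>>(d_out);
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}

// ---- decoupled_functions.cu ----
__device__ void DecoupledCallGpu(int *var) { *var = 0; }
```

Kept in a single file, the listing also compiles without -dc, because the definition is then in the same compilation unit as the call. Separate compilation only becomes necessary once the definition moves to its own .cu file.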

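Your question is a bit confusing:

Question: why is DecoupledCallGpu called from a host function and not from a kernel?

I can't tell whether you are tripping over the English or whether there is a misunderstanding here. The actual error message is:

error: calling a __host__ function("DecoupledCallGpu") from a __global__ function("kernel") is not allowed

This arises because, within the compilation unit (that is, within the module, the file being compiled, namely cuda_computations.cu), the only description of the function DecoupledCallGpu() is the prototype provided by the header: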

void DecoupledCallGpu(int *var);

This prototype declares an undecorated function, and in CUDA C an undecorated function is equivalent to one decorated with __host__ only:

__host__ void DecoupledCallGpu(int *var);

That compilation unit has no knowledge of what is actually in decoupled_functions.cu.

So when you have kernel code like this:

__global__ void kernel(){       //<- __global__ function
...
DecoupledCallGpu(&var_kernel);  //<- appears as a __host__ function to compiler
}

the compiler thinks you are trying to call a __host__ function from a __global__ function, which is illegal.