Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory

Question

3

I am running a Docker build on a machine that has no NVIDIA GPU. I am using the tensorflow/tensorflow Docker image, which is CPU-only, as the base image.

Dockerfile

FROM tensorflow/tensorflow 
WORKDIR /project
COPY /app .
RUN python3 main.py

But it shows this error:

2020-06-12 20:06:56.822576: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-06-12 20:06:56.825090: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-06-12 20:06:56.827746: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (735abddf4141): /proc/driver/nvidia/version does not exist
2020-06-12 20:06:56.837312: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-12 20:06:57.040593: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2400000000 Hz
2020-06-12 20:06:57.045853: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f377c000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-12 20:06:57.045913: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-06-12 20:07:07.017642: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 614400000 exceeds 10% of free system memory.
Killed

when running this code:

model = tf.keras.models.Sequential([
                        tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
                        tf.keras.layers.MaxPooling2D(2, 2),
                        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
                        tf.keras.layers.MaxPooling2D(2, 2),
                        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
                        tf.keras.layers.Flatten(),
                        tf.keras.layers.Dense(128, activation='relu'),
                        tf.keras.layers.Dense(10, activation='softmax')])

I don't want to use the GPU version; I need this to run on the CPU.

- Bumuthu Dilshan

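I don't think this has anything to do with CUDA. The "Killed" message indicates that the kernel terminated your process because it was using too much memory. - Dr. Snoopy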
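The size in the allocation warning can be sanity-checked with plain arithmetic. A minimal sketch, assuming the allocation holds float32 samples with the model's input_shape of (32, 32, 3); the question does not show its data-loading code, so that is a guess rather than something the log confirms. The helper name samples_in_allocation is made up for illustration.

```python
# Size reported in the log line "Allocation of 614400000 exceeds 10% of free system memory."
ALLOCATION_BYTES = 614_400_000

# Assumption: every sample has the model's input_shape (32, 32, 3) and is stored as float32.
VALUES_PER_SAMPLE = 32 * 32 * 3
BYTES_PER_FLOAT32 = 4


def samples_in_allocation(total_bytes, values_per_sample, bytes_per_value):
    """Return how many samples fill total_bytes exactly, or None if it does not divide evenly."""
    bytes_per_sample = values_per_sample * bytes_per_value
    if total_bytes % bytes_per_sample != 0:
        return None
    return total_bytes // bytes_per_sample


n_float32 = samples_in_allocation(ALLOCATION_BYTES, VALUES_PER_SAMPLE, BYTES_PER_FLOAT32)
print(n_float32)  # 50000
print(ALLOCATION_BYTES / 1024**2)  # size in MiB: 585.9375
```

Under that assumption the single allocation is exactly 50,000 images, roughly 586 MiB. That is consistent with a whole training set being held in memory as one float32 array, which would fit the out-of-memory reading of "Killed", but it remains an inference from the numbers.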

Then can you suggest a fix? - Bumuthu Dilshan

2 Answers

- swiftsnail · Answer 1

I was using TensorFlow Serving and ran into this problem. I then found the TFS dockerfile.gpu, which is based on the nvidia/cuda Docker image, so after installing nvidia-docker the problem was solved. Hope this helps.

- Areza · Answer 2

-1

As @Dr. Snoopy commented, this is a memory error. You should increase the memory available to Docker, for example by following this post.

- Areza
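Before raising Docker's memory, it may help to see how much memory the container actually gets. A rough sketch, assuming a Linux container; the paths are the standard cgroup v1/v2 and /proc/meminfo locations, and the helper names are invented for this example. Limits seen during docker build can differ from those under docker run, so treat the numbers as a hint.

```python
from pathlib import Path

# Standard Linux locations; which one exists depends on the cgroup version in use.
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
)


def parse_limit(text):
    """Parse a cgroup memory limit in bytes; 'max' (cgroup v2) means no limit, returned as None."""
    text = text.strip()
    return None if text == "max" else int(text)


def cgroup_memory_limit():
    """Return the memory limit from the first readable cgroup file, or None if unlimited/unknown."""
    for path in CGROUP_LIMIT_FILES:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue  # file missing or unreadable: try the next location
    return None


def parse_mem_available(meminfo_text):
    """Return the MemAvailable value from /proc/meminfo text in bytes, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024  # the file reports the value in kB
    return None


def mem_available():
    """Read /proc/meminfo if it exists (Linux only); return None elsewhere."""
    try:
        return parse_mem_available(Path("/proc/meminfo").read_text())
    except OSError:
        return None


print("cgroup memory limit (bytes):", cgroup_memory_limit())
print("MemAvailable (bytes):", mem_available())
```

Run inside the container right before main.py, this shows whether the roughly 614 MB allocation from the log is large compared with what is available.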

Mate, a missing file has nothing to do with memory usage - I hit the same problem in the tensorflow/tensorflow:2.3.1 image with no memory limits at all. - hi im Bacon

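@hiimBacon - He is using Docker on a CPU. The file is missing because he has no driver, so TensorFlow automatically falls back to the CPU. The last line clearly shows the process was killed because of a memory problem. My friend, your downvote did not bring me any smile. - Areza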
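If the libcuda warning is only noise on a CPU-only machine, the fallback can be made explicit and the log quieter. A minimal sketch, assuming TensorFlow 2.1 or later; both environment variables are read when TensorFlow starts, so they have to be set before the import. This does not address the "Killed" line, which is a separate memory issue.

```python
import os

# Both variables must be set before TensorFlow is imported to have any effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # hide any CUDA devices from this process
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"   # 0 = all, 1 = hide INFO, 2 = also hide WARNING, 3 = also hide ERROR

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    # On a CPU-only setup this list is expected to be empty.
    print(tf.config.list_physical_devices("GPU"))
```

At log level 2 the W-level "Could not load dynamic library" line is filtered, while E-level lines such as the cuInit failure would still be printed.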