我按照这里的说明安装了nvidia-docker2。在运行下列命令时,我会得到预期的输出结果。
sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05 Driver Version: 495.29.05 CUDA Version: 11.5 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:0B:00.0 On | N/A |
| 24% 31C P8 13W / 250W | 222MiB / 11011MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
然而,对我来说,如果没有使用"sudo"运行上述命令,则会产生以下错误:
$ docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
docker: Error response from daemon: failed to create shim task: OCI runtime create
failed: runc create failed: unable to start container process: error during container
init: error running hook #0: error running hook: exit status 1, stdout: , stderr:
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1:
cannot open shared object file: no such file or directory: unknown.
请问有谁能帮我解决这个问题吗?
$docker run hello-world
这样的命令可以正常运行,而不需要使用 'sudo',这证实了我已经将用户加入了 docker 组。但是,我的调用nvidia-smi
的问题仍未解决。 - Golchoubian