OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized

12
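I'm trying to run a test program to check whether my Anaconda environment is configured correctly. However, when I run it, I get the following error message while the program is setting up the plot (in the on_train_end() callback, to be precise):
OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized. OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
I'm running the test program on my 2015 MacBook Pro 15" with macOS Mojave 10.14.1. The Anaconda distribution I currently have installed is https://repo.anaconda.com/archive/Anaconda2-5.3.0-MacOSX-x86_64.sh
Here is the test program: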
#!/usr/bin/env python

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow import keras

Xs = np.array([
    [0, 0],
    [0, 1],
    [1, 1],
    [1, 0]
])

Ys = np.array([
    [0],
    [1],
    [0],
    [1]
])

class MyCallback(keras.callbacks.Callback):
    def __init__(self):
        super(MyCallback, self).__init__()
        self.stats = []

    def on_epoch_end(self, epoch, logs=None):
        self.stats.append({
            'loss': logs['loss'],
            'acc': logs['acc'],
            'epoch': epoch
        })

    def on_train_end(self, logs=None):
        loss_x = []
        loss_y = []
        acc_x = []
        acc_y = []
        for e in self.stats:
            loss_x.append(e['epoch'])
            loss_y.append(e['loss'])
            acc_x.append(e['epoch'])
            acc_y.append(e['acc'])
        plt.plot(loss_x, loss_y, 'r', label='Loss')
        plt.plot(acc_x, acc_y, 'b', label='Accuracy')
        plt.xlabel('Epochs')
        plt.ylabel('Loss / Accuracy')
        plt.legend(loc='upper left')
        plt.show()

with tf.Session() as session:
    model = keras.models.Sequential()

    model.add(keras.layers.Dense(10, activation=keras.activations.elu, input_dim=2))
    model.add(keras.layers.Dense(1, activation=keras.activations.sigmoid))

    model.compile(optimizer=keras.optimizers.Adam(lr=0.05),
                  loss=keras.losses.mean_squared_error,
                  metrics=['accuracy'])

    model.fit(x=Xs, y=Ys, batch_size=4, epochs=50, callbacks=[MyCallback()])

    print("Training complete")

    loss, acc = model.evaluate(Xs, Ys)

    print(f"loss: {loss} - acc: {acc}")

    predictions = model.predict(Xs)

    print("predictions")
    print(predictions)

I tried to fix the problem by following the answer to this related question, so I added the following lines after the import section:

import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

What I get is a different error message. Here is the full stack trace:

2018-12-06 10:18:34.262 python[19319:371282] -[NSApplication _setup:]: unrecognized selector sent to instance 0x7ff2b07a3d00
2018-12-06 10:18:34.266 python[19319:371282] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[NSApplication _setup:]: unrecognized selector sent to instance 0x7ff2b07a3d00'
*** First throw call stack:
(
        0   CoreFoundation                      0x00007fff2ccf0e65 __exceptionPreprocess + 256
        1   libobjc.A.dylib                     0x00007fff58d47720 objc_exception_throw + 48
        2   CoreFoundation                      0x00007fff2cd6e22d -[NSObject(NSObject) __retain_OA] + 0
        3   CoreFoundation                      0x00007fff2cc92820 ___forwarding___ + 1486
        4   CoreFoundation                      0x00007fff2cc921c8 _CF_forwarding_prep_0 + 120
        5   libtk8.6.dylib                      0x0000000b36aeb31d TkpInit + 413
        6   libtk8.6.dylib                      0x0000000b36a4317e Initialize + 2622
        7   _tkinter.cpython-36m-darwin.so      0x0000000b3686ba16 _tkinter_create + 1174
        8   python                              0x000000010571c088 _PyCFunction_FastCallDict + 200
        9   python                              0x00000001057f2f4f call_function + 143
        10  python                              0x00000001057f0abf _PyEval_EvalFrameDefault + 46847
        11  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        12  python                              0x00000001057f3b1c _PyFunction_FastCallDict + 364
        13  python                              0x000000010569a8b0 _PyObject_FastCallDict + 320
        14  python                              0x00000001056c1fe8 method_call + 136
        15  python                              0x00000001056a1efe PyObject_Call + 62
        16  python                              0x0000000105743385 slot_tp_init + 117
        17  python                              0x00000001057478c1 type_call + 241
        18  python                              0x000000010569a821 _PyObject_FastCallDict + 177
        19  python                              0x00000001056a2a67 _PyObject_FastCallKeywords + 327
        20  python                              0x00000001057f3048 call_function + 392
        21  python                              0x00000001057f0b6f _PyEval_EvalFrameDefault + 47023
        22  python                              0x00000001057f330c fast_function + 188
        23  python                              0x00000001057f2fac call_function + 236
        24  python                              0x00000001057f0abf _PyEval_EvalFrameDefault + 46847
        25  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        26  python                              0x00000001057f3b1c _PyFunction_FastCallDict + 364
        27  python                              0x000000010569a8b0 _PyObject_FastCallDict + 320
        28  python                              0x00000001056c1fe8 method_call + 136
        29  python                              0x00000001056a1efe PyObject_Call + 62
        30  python                              0x00000001057f0cc0 _PyEval_EvalFrameDefault + 47360
        31  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        32  python                              0x00000001057f33ba fast_function + 362
        33  python                              0x00000001057f2fac call_function + 236
        34  python                              0x00000001057f0abf _PyEval_EvalFrameDefault + 46847
        35  python                              0x00000001057f330c fast_function + 188
        36  python                              0x00000001057f2fac call_function + 236
        37  python                              0x00000001057f0abf _PyEval_EvalFrameDefault + 46847
        38  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        39  python                              0x00000001057f33ba fast_function + 362
        40  python                              0x00000001057f2fac call_function + 236
        41  python                              0x00000001057f0abf _PyEval_EvalFrameDefault + 46847
        42  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        43  python                              0x00000001057f33ba fast_function + 362
        44  python                              0x00000001057f2fac call_function + 236
        45  python                              0x00000001057f0b6f _PyEval_EvalFrameDefault + 47023
        46  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        47  python                              0x00000001057f33ba fast_function + 362
        48  python                              0x00000001057f2fac call_function + 236
        49  python                              0x00000001057f0abf _PyEval_EvalFrameDefault + 46847
        50  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        51  python                              0x00000001057f33ba fast_function + 362
        52  python                              0x00000001057f2fac call_function + 236
        53  python                              0x00000001057f0abf _PyEval_EvalFrameDefault + 46847
        54  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        55  python                              0x00000001057f33ba fast_function + 362
        56  python                              0x00000001057f2fac call_function + 236
        57  python                              0x00000001057f0b6f _PyEval_EvalFrameDefault + 47023
        58  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        59  python                              0x00000001057f33ba fast_function + 362
        60  python                              0x00000001057f2fac call_function + 236
        61  python                              0x00000001057f0b6f _PyEval_EvalFrameDefault + 47023
        62  python                              0x00000001057e4209 _PyEval_EvalCodeWithName + 425
        63  python                              0x000000010583cd4c PyRun_FileExFlags + 252
        64  python                              0x000000010583c224 PyRun_SimpleFileExFlags + 372
        65  python                              0x0000000105862d66 Py_Main + 3734
        66  python                              0x0000000105692929 main + 313
        67  libdyld.dylib                       0x00007fff59e1608d start + 1
        68  ???                                 0x0000000000000002 0x0 + 2
)
libc++abi.dylib: terminating with uncaught exception of type NSException

Here is the list of relevant dependencies installed in the environment (irrelevant dependencies are omitted for brevity):

Name                |     Version    |                 Build
--------------------|----------------|----------------------
_tflow_select       |     2.3.0      |                   mkl
blas                |     1.0        |                   mkl
intel-openmp        |     2019.1     |                   144
matplotlib          |     3.0.1      |        py36h54f8f79_0
mkl                 |     2018.0.3   |                     1
mkl_fft             |     1.0.6      |        py36hb8a8100_0
mkl_random          |     1.0.1      |        py36h5d10147_1
numpy               |     1.15.4     |        py36h6a91979_0
numpy-base          |     1.15.4     |        py36h8a80b8c_0
tensorboard         |     1.12.0     |        py36hdc36e2c_0
tensorflow          |     1.12.0     |    mkl_py36h2b2bbaf_0
tensorflow-base     |     1.12.0     |    mkl_py36h70e0e9a_0

4 Answers

19
In most cases, this solves the problem.
conda install nomkl

1
Thanks, this worked for me. Do you know why it solves the problem and what causes it? - Julien Perrenoud
2
@JulienPerrenoud Sorry, I don't know why or what causes it. I just found this link in an obscure GitHub thread after days of searching... - Agile Bean
1
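I thought this meant MKL would not be used at all and performance would suffer. But apparently that is not the case: MKL is used indirectly. See https://dev59.com/HVQJ5IYBdhLWcg3w05f2#58869103. On my MacBook with a Core i9 CPU, a simple multilayer perceptron performs the same with and without nomkl. - Richard Möhn
This did not work for me at first. It looks like my Anaconda installation was broken (possibly because I first used miniconda and then tried to update with the full Anaconda package). I had to remove Anaconda completely and reinstall it before this fix worked. - Visya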
1
I did a lot of research before finally finding this solution, and it works well for me. Thanks to @AgileBean for the solution. - Abhishek Mamdapure

4

I tried several of the solutions I found. Unfortunately, many of them did not work, and the reasons they failed were not clear either:

I am using conda with TensorFlow 2.0 (MKL) and Python 3.6 on macOS Mojave.

  1. Downgrade matplotlib. It is not clear what this has to do with OpenMP, and it did not work for me.

    conda install matplotlib==2.2.3 
    
  2. Allow duplicate copies of the OpenMP library. This works, but the warning in the log says it is an unsafe workaround that may cause crashes or silently produce incorrect results. So it is definitely not the way to go, and a proper fix is still required.

    import os
    os.environ['KMP_DUPLICATE_LIB_OK']='True'
    
  3. Install nomkl. I assume this avoids MKL-based binaries for all libraries (scipy, numpy, tensorflow, etc.), but then I do not see the point of using TensorFlow-MKL, since its whole purpose is to use MKL binaries to take advantage of the Intel architecture for fast processing (AVX2 instructions, etc.). Most people said this worked for them, but it did not work for me:

    conda install nomkl
    
  4. Update MKL. This did not work.

    conda install -c intel mkl
    
  5. Uninstall OpenMP and install it again. This did not work.

    conda uninstall openmp
    conda install openmp
    
  6. Finally, I uninstalled the conda-installed TensorFlow (tf-mkl) and installed it again via pip. This worked! I think this is a proper solution. It might mean that the Intel TF-MKL binaries are broken on macOS. In my experience this is common for Intel libraries on macOS, since others such as OpenVINO and pyrealsense2 also do not work well there.

    conda uninstall tensorflow
    pip install tensorflow==2.0.0 
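
After swapping installers as in step 6, it can be handy to confirm which installer a package actually came from. The following is only a sketch, not part of the original fix: `installer_of` is a hypothetical helper name, it needs Python 3.8+ for `importlib.metadata` (newer than the Python 3.6 used above), and it relies on the `INSTALLER` record in the package metadata. pip writes `pip` there; conda-built packages usually record `conda`, but treat that as an assumption and cross-check with `conda list`.

```python
from importlib import metadata  # standard library since Python 3.8


def installer_of(package_name):
    """Return the installer recorded in the package metadata, or None.

    None means the package is not installed or has no INSTALLER record.
    The helper name is hypothetical; only importlib.metadata is real API.
    """
    try:
        dist = metadata.distribution(package_name)
    except metadata.PackageNotFoundError:
        return None
    text = dist.read_text("INSTALLER")  # returns None if the file is missing
    return text.strip() if text else None


if __name__ == "__main__":
    # Package names taken from the dependency list in the question.
    for name in ("tensorflow", "numpy", "matplotlib"):
        print(name, "->", installer_of(name))
```

If tensorflow reports one installer while numpy and matplotlib report another, the environment mixes conda and pip builds, which is the situation steps 3 to 6 are trying to untangle.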
    

Some useful links:

  1. https://github.com/dmlc/xgboost/issues/1715
  2. https://github.com/openai/spinningup/issues/16
  3. Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized

Uninstalling conda's tensorflow and reinstalling it from pip is what worked for me. Thanks for sharing! - beachwood23

1

I had a similar experience, and the solutions posted elsewhere did not solve my problem. In the end I solved it by downgrading matplotlib, i.e. conda install matplotlib=2.2.3


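Thank you so much! I was going crazy. - Mattia Vandi
If you are on an Intel architecture, I suggest using conda install -n myenv -c intel matplotlib. This installs the LTS version of matplotlib (i.e. version 2.2.3) and prevents an upgrade to 3.x.y when all packages are upgraded with conda update --all -y - Mattia Vandi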

0
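I keep running into this error. It seems to be related to dependency-driven installs that leave conda with missing symlinks afterwards.
For example: in a conda environment I used pip to install a package that depends on torch. It installed successfully, but on import I got the error above. lib/ looked like this: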
~/opt/anaconda3/lib  ll|grep libomp
lrwxr-xr-x    1 user  staff    12B Dec 31 12:17 libgomp.1.dylib -> libomp.dylib
lrwxr-xr-x    1 user  staff    12B Dec 31 12:17 libgomp.dylib -> libomp.dylib
lrwxr-xr-x    1 user  staff    12B Dec 31 12:17 libiomp5.dylib -> libomp.dylib
-rwxrwxr-x    1 iser  staff   642K Dec 31 12:17 libomp.dylib 

I then ran conda install pytorch, which installs additional packages. Afterwards my lib/ directory looked like this:

 ~/opt/anaconda3/lib  ll|grep libomp
lrwxr-xr-x    1 user  staff    12B Dec 31 12:17 libgomp.1.dylib -> libomp.dylib
lrwxr-xr-x    1 user  staff    12B Dec 31 12:17 libgomp.dylib -> libomp.dylib
lrwxr-xr-x    1 user  staff    12B Mar 10 14:59 libiomp5.dylib -> libomp.dylib
-rwxrwxr-x    2 user  staff   646K Jan 15 22:21 libomp.dylib 

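So libomp.dylib and the libiomp5.dylib symlink were updated. After that, the import worked fine.
I have also solved this before by manually creating symlinks between these libraries... so check whether this works for you!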
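
The `ll|grep libomp` check above can be scripted. This is a rough sketch under stated assumptions: the function names are made up for illustration, the file-name pattern only covers the three runtimes named in this thread (libiomp5, libomp, libgomp), and `sys.prefix/lib` is assumed to be the conda environment's library directory on macOS or Linux. It lists each OpenMP library name and the real file it resolves to, so you can see whether the names in lib/ all point at one file.

```python
import os
import re
import sys
from pathlib import Path

# File names of the OpenMP runtimes mentioned in this thread (Intel, LLVM, GNU),
# e.g. libiomp5.dylib, libomp.dylib, libgomp.1.dylib, libgomp.so.1
OPENMP_PATTERN = re.compile(r"^lib(iomp5|omp|gomp)(\.\d+)*\.(dylib|so)")


def find_openmp_runtimes(lib_dir):
    """Map each OpenMP library name in lib_dir to the real file behind it."""
    result = {}
    for entry in sorted(Path(lib_dir).iterdir()):
        if OPENMP_PATTERN.match(entry.name):
            # realpath follows symlinks, like reading the "->" column of ls -l
            result[entry.name] = os.path.realpath(entry)
    return result


def distinct_runtimes(lib_dir):
    """Return the set of distinct real files those names resolve to."""
    return set(find_openmp_runtimes(lib_dir).values())


if __name__ == "__main__":
    lib_dir = Path(sys.prefix) / "lib"  # assumption: conda-style env layout
    if lib_dir.is_dir():
        for name, target in find_openmp_runtimes(lib_dir).items():
            print(f"{name} -> {target}")
        print("distinct files:", len(distinct_runtimes(lib_dir)))
```

One distinct file in lib/ does not prove that only one runtime is loaded, because pip wheels may ship their own copy inside site-packages. More than one distinct file, however, is a strong hint about where the duplicate comes from.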
