Valgrind errors when using boost::thread_specific_ptr on GCC 8.3 + Linux

11
  • Ubuntu 19 running in Docker
  • GCC 8.3
  • Boost 1.69
  • Valgrind 3.14.0

When the application shuts down, Valgrind reports these 3 issues:

==70== Mismatched free() / delete / delete []
==70==    at 0x483997B: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==70==    by 0x4870C89: check_free (dlerror.c:202)
==70==    by 0x4870C89: check_free (dlerror.c:186)
==70==    by 0x4870C89: free_key_mem (dlerror.c:221)
==70==    by 0x4870C89: __dlerror_main_freeres (dlerror.c:239)
==70==    by 0x4B59711: __libc_freeres (in /usr/lib/x86_64-linux-gnu/libc-2.29.so)
==70==    by 0x482E19E: _vgnU_freeres (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_core-amd64-linux.so)
==70==    by 0x4A0A3A9: __run_exit_handlers (exit.c:132)
==70==    by 0x4A0A3D9: exit (exit.c:139)
==70==    by 0x49E9B71: (below main) (libc-start.c:342)
==70==  Address 0x4f6a570 is 0 bytes inside a block of size 312 alloc'd
==70==    at 0x4838DBF: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==70==    by 0x303D6D: boost::detail::make_external_thread_data() (in /build-context/bin/debug/setmatch-tests)
==70==    by 0x305424: boost::detail::add_new_tss_node(void const*, boost::shared_ptr<boost::detail::tss_cleanup_function>, void*) (in /build-context/bin/debug/setmatch-tests)
==70==    by 0x3054ED: boost::detail::set_tss_data(void const*, 

[...]

==70== Invalid free() / delete / delete[] / realloc()
==70==    at 0x483997B: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==70==    by 0x4870BB4: free_key_mem (dlerror.c:223)
==70==    by 0x4870BB4: __dlerror_main_freeres (dlerror.c:239)
==70==    by 0x4B59711: __libc_freeres (in /usr/lib/x86_64-linux-gnu/libc-2.29.so)
==70==    by 0x482E19E: _vgnU_freeres (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_core-amd64-linux.so)
==70==    by 0x4A0A3A9: __run_exit_handlers (exit.c:132)
==70==    by 0x4A0A3D9: exit (exit.c:139)
==70==    by 0x49E9B71: (below main) (libc-start.c:342)
==70==  Address 0x4f6a570 is 0 bytes inside a block of size 312 free'd
==70==    at 0x483997B: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==70==    by 0x4870C89: check_free (dlerror.c:202)
==70==    by 0x4870C89: check_free (dlerror.c:186)
==70==    by 0x4870C89: free_key_mem (dlerror.c:221)
==70==    by 0x4870C89: __dlerror_main_freeres (dlerror.c:239)
==70==    by 0x4B59711: __libc_freeres (in /usr/lib/x86_64-linux-gnu/libc-2.29.so)
==70==    by 0x482E19E: _vgnU_freeres (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_core-amd64-linux.so)
==70==    by 0x4A0A3A9: __run_exit_handlers (exit.c:132)
==70==    by 0x4A0A3D9: exit (exit.c:139)
==70==    by 0x49E9B71: (below main) (libc-start.c:342)
==70==  Block was alloc'd at
==70==    at 0x4838DBF: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==70==    by 0x303D6D: boost::detail::make_external_thread_data() (in /build-context/bin/debug/setmatch-tests)
==70==    by 0x305424: boost::detail::add_new_tss_node(void const*, boost::shared_ptr<boost::detail::tss_cleanup_function>, void*) (in /build-context/bin/debug/setmatch-tests)
==70==    by 0x3054ED: boost::detail::set_tss_data(void const*, boost::shared_ptr<boost::detail::tss_cleanup_function>, void*, bool) (in /build-context/bin/debug/setmatch-tests)
==70==    by 0x188841: boost::thread_specific_ptr<burningmime::setmatch::MatchState>::reset(burningmime::setmatch::MatchState*) (tss.hpp:105)

[...]

==70== 24 bytes in 1 blocks are definitely lost in loss record 1 of 2
==70==    at 0x4838DBF: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==70==    by 0x303F50: boost::detail::make_external_thread_data() (in /build-context/bin/debug/setmatch-tests)
==70==    by 0x305424: boost::detail::add_new_tss_node(void const*, boost::shared_ptr<boost::detail::tss_cleanup_function>, void*) (in /build-context/bin/debug/setmatch-tests)
==70==    by 0x3054ED: boost::detail::set_tss_data(void const*, boost::shared_ptr<boost::detail::tss_cleanup_function>, void*, bool) (in /build-context/bin/debug/setmatch-tests)

[...]

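It looks like Boost allocates its thread data in the same place where dlerror allocates its own thread data. A quick search points to a (slightly different?) version of dlerror here. I took a quick look at the Boost code, and it appears to simply allocate the TSS block on the heap.
This is not a problem with GCC 7.3.0 + Ubuntu 18 (same Boost version).
Does anyone have any insight into this?
EDIT: Maybe this is the double free that was fixed in this commit? But I don't understand why Boost would be using it.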

Is there a reason you are not using the latest Valgrind, 3.15.0? - entpnerd
3 Answers

4
Please check the versions of all the tools you are using. This looks like a version compatibility issue. Try Valgrind 3.15.0.
For Valgrind usage, see here.

2
If I modify the glibc upstream test case around the pthread_setspecific call like this (and compile it with g++):
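    void *ptr = new char;
    printf("Setting thread local to ptr.\n");
    if (pthread_setspecific(key, ptr) != 0) {
      perror("pthread_setspecific");
      exit(1);
    }
    // Deliberately leave the freed pointer registered under key.
    // Cast back to char * first: deleting through void * is undefined.
    delete static_cast<char *>(ptr);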

When I run it against a glibc from before the fix (at commit 5b06f538c5aee0389ed034f60d90a8884d6d54de), using ./testrun.sh --tool=valgrind /path/to/test in the glibc build tree, I get this error:
==14143== Invalid read of size 8
==14143==    at 0x483B550: check_free (dlerror.c:188)
==14143==    by 0x483BA21: free_key_mem (dlerror.c:221)
==14143==    by 0x483BA21: __dlerror_main_freeres (dlerror.c:239)
==14143==    by 0x4D06AD1: __libc_freeres (in /home/fweimer/src/gnu/glibc/build/libc.so)
==14143==    by 0x48031DE: _vgnU_freeres (vg_preloaded.c:77)
==14143==    by 0x4BDD331: __run_exit_handlers (exit.c:132)
==14143==    by 0x4BDD3C9: exit (exit.c:139)
==14143==    by 0x4BC7E21: (below main) (libc-start.c:342)
==14143==  Address 0x4d750d8 is 23 bytes after a block of size 1 free'd
==14143==    at 0x480CEFC: operator delete(void*) (vg_replace_malloc.c:586)
==14143==    by 0x401344: main (t.c:93)
==14143==  Block was alloc'd at
==14143==    at 0x480BE86: operator new(unsigned long) (vg_replace_malloc.c:344)
==14143==    by 0x4012F4: main (t.c:87)
==14143== 
==14143== Invalid free() / delete / delete[] / realloc()
==14143==    at 0x480CA0C: free (vg_replace_malloc.c:540)
==14143==    by 0x483BA29: free_key_mem (dlerror.c:223)
==14143==    by 0x483BA29: __dlerror_main_freeres (dlerror.c:239)
==14143==    by 0x4D06AD1: __libc_freeres (in /home/fweimer/src/gnu/glibc/build/libc.so)
==14143==    by 0x48031DE: _vgnU_freeres (vg_preloaded.c:77)
==14143==    by 0x4BDD331: __run_exit_handlers (exit.c:132)
==14143==    by 0x4BDD3C9: exit (exit.c:139)
==14143==    by 0x4BC7E21: (below main) (libc-start.c:342)
==14143==  Address 0x4d750c0 is 0 bytes inside a block of size 1 free'd
==14143==    at 0x480CEFC: operator delete(void*) (vg_replace_malloc.c:586)
==14143==    by 0x401344: main (t.c:93)
==14143==  Block was alloc'd at
==14143==    at 0x480BE86: operator new(unsigned long) (vg_replace_malloc.c:344)
==14143==    by 0x4012F4: main (t.c:87)

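This is essentially the same error you are seeing, just without the operator new allocation nested inside Boost. So it looks like the two bugs are indeed the same.
That makes sense: because of bug 24476, libdl uses an uninitialized pthread_key_t value (without first calling pthread_key_create). For the data segment (where libdl's internal key is stored), uninitialized means zero, and sure enough, the test's diagnostic output shows that the key allocated by the test (and, in your case, by Boost) is in fact key 0: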
key = 0
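
To make the collision concrete, here is a minimal sketch of the situation described above; it is not the glibc test case. The names `never_created_key`, `KeyAliasResult` and `check_key_alias` are hypothetical, and the code assumes `pthread_key_t` is an integer type, which holds on glibc but is not promised by POSIX. A zero-initialized static key stands in for libdl's internal key, and the function reports whether the first key an application creates ends up with the same value.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdio>

// Stand-in for libdl's internal key: static storage, so it is zero-initialized,
// and pthread_key_create is never called on it.
static pthread_key_t never_created_key;

// Hypothetical result type for this sketch.
struct KeyAliasResult {
    unsigned long key_value;  // numeric value of the key the "application" got
    bool aliases;             // never_created_key returned the application's pointer
};

// Creates a key the way an application (or Boost) would, stores payload under
// it, and checks whether the never-created key leads to the same slot.
// Assumes pthread_key_t is an integer type (true on glibc).
KeyAliasResult check_key_alias(void *payload) {
    pthread_key_t app_key;
    int rc = pthread_key_create(&app_key, nullptr);
    assert(rc == 0);
    std::printf("key = %lu\n", static_cast<unsigned long>(app_key));

    rc = pthread_setspecific(app_key, payload);
    assert(rc == 0);
    (void)rc;  // keeps builds with NDEBUG free of unused-variable warnings

    KeyAliasResult result{static_cast<unsigned long>(app_key), false};
    // Only look through the other key when both hold the same value; the key
    // is then valid, so the lookup itself is well defined.
    if (app_key == never_created_key)
        result.aliases = (pthread_getspecific(never_created_key) == payload);

    pthread_setspecific(app_key, nullptr);
    pthread_key_delete(app_key);
    return result;
}
```

Calling `check_key_alias` once from `main` with the address of any live object prints the key value. Whenever that value is 0, `aliases` comes back true, which is the mix-up the answer describes: code holding the never-created key would treat the application's pointer as its own.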

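The libdl code involved is quite convoluted, so I posted a patch that moves dlerror from libdl into libc and avoids POSIX thread-local storage altogether.
To summarize: whoever maintains the glibc version you use needs to backport the upstream fix into their source tree and release an update. We have had to do the same.
On the bright side, this bug only shows up when you run your application under valgrind and similar tools, because __libc_freeres is not called during regular process shutdown: the process is about to exit anyway, and the kernel cleans up all resources for us. Unless you use valgrind in production, you will never hit this bug there. Of course, it is still annoying when debugging with valgrind. We are sorry about that.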
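
As a rough illustration of why dropping POSIX thread-specific keys removes this class of bug, the sketch below keeps a per-thread error message in a C++ `thread_local` variable. It is not the glibc patch, and all names in it (`last_error`, `set_last_error`, `take_last_error`, `error_seen_by_other_thread`) are hypothetical. The point is only that no `pthread_key_t` exists here, so there is no key that could be left uninitialized.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Hypothetical per-thread error slot, loosely analogous to the state dlerror
// has to track. No pthread_key_t is involved.
static thread_local std::string last_error;

// Records a message for the calling thread only.
void set_last_error(const std::string &message) { last_error = message; }

// Returns the calling thread's pending message and clears it.
std::string take_last_error() {
    std::string message;
    message.swap(last_error);
    return message;
}

// Reads the error slot from a second thread and returns what that thread saw.
std::string error_seen_by_other_thread() {
    std::string seen = "unset";
    std::thread worker([&seen] { seen = take_last_error(); });
    worker.join();
    return seen;
}
```

If the main thread calls `set_last_error`, a worker thread still sees an empty slot, while the main thread gets its own message back once and an empty string afterwards.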

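Thanks! Good to hear that someone has actually been working on this. - Robert Fraser
I'm very wary of functions moving into glibc. Moving clock_gettime() from librt to libc broke a lot of things in terms of backward compatibility. - Robert Fraser
@RobertFraser If you run into problems, please file a bug report or post on libc-help. - Florian Weimer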

1
Maybe you should upgrade Valgrind to 3.15.0; that should help.
I think this should help you.
