Calling `atexit` when dynamically linking to libc on Linux

If I have the following program written in C (compiled with GCC on Debian 8.7), I can call atexit() as expected:
#include <stdlib.h>

void exit_handler(void) {
    return;
}

int main () {
    atexit(exit_handler);
    return 0;
}

When I compile and run it:

$ gcc test.c
$ ./a.out

Nothing is printed, which is what you would expect. And indeed, when I run ldd, I get:

$ ldd a.out
    linux-vdso.so.1 (0x00007fffbe592000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe07d3a8000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe07d753000)

However, libc does not seem to have any symbol for atexit, only __cxa_atexit and __cxa_thread_atexit_impl:

$ nm --dynamic /lib/x86_64-linux-gnu/libc.so.6 | grep 'atexit'
0000000000037d90 T __cxa_atexit
0000000000037fa0 T __cxa_thread_atexit_impl

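As you would then expect, if I try to link to libc dynamically, I cannot actually call atexit(). For example, the following Racket program links to libc and tries to look up atexit: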

#lang racket

(require ffi/unsafe)

(get-ffi-obj 'atexit (ffi-lib "libc" '("6")) (_fun (_fun -> _void) -> _int))

The output is:
$ racket findatexit.rkt
ffi-obj: couldn't get "atexit" from "libc.so.6" (/lib/x86_64-linux-gnu/libc.so.6: undefined symbol: atexit)

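What I would like to know is:
1. If libc on Linux does not have an atexit symbol, why can I still call it from a C program?
2. Is there a way to call atexit, or a similar function, dynamically on Linux?
(I should note that on OS X atexit does appear as a symbol, so it is Linux that looks unusual here.)
Edit:
At @Jonathan's suggestion, I also ran: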
$ gcc -c test.c
$ nm test.o
                 U atexit
0000000000000000 T exit_handler
0000000000000007 T main

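This seems to show that the atexit symbol really is referenced (the U marks it as undefined in test.o, to be resolved at link time), yet it does not appear in any of the libraries that ldd shows.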


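Try running gcc -c test.c; nm test.o to see which symbols it references. - Jonathan Leffler
Good idea: $ nm test.o U atexit 0000000000000000 T exit_handler 0000000000000007 T main - Leif Andersen
OK, so it is calling atexit() somehow. Have you looked at the symbols in ld.so.1 (or, for you, probably /lib64/ld-linux-x86-64.so.2)? Or possibly crt0.o, or some other file that gets linked? You may need to run gcc -v test.c to see exactly which libraries and object files are linked. - Jonathan Leffler
Hmm... judging by the result of the following command, it does not seem to be there: $ nm --dynamic /lib64/ld-linux-x86-64.so.2 | grep 'atexit' - Leif Andersen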
1 Answer


I did some digging on a CentOS 7 virtual machine, and I think I found it, but it is not at all obvious!

Found it!

It is in /usr/lib64/libc_nonshared.a:

$ nm /usr/lib64/libc_nonshared.a | grep -i atexit
atexit.oS:
0000000000000000 T atexit
                 U __cxa_atexit
$

Why look in that library? That is a good question, and a long story. Are you sitting comfortably? Then I'll begin...

Steps needed to get there

  1. Use the test.c code from the question.
  2. Compile it with gcc -v test.c:

    $ gcc -v test.c
    Using built-in specs.
    COLLECT_GCC=gcc
    COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
    Target: x86_64-redhat-linux
    Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
    Thread model: posix
    gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) 
    COLLECT_GCC_OPTIONS='-v' '-mtune=generic' '-march=x86-64'
     /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/cc1 -quiet -v test.c -quiet -dumpbase test.c -mtune=generic -march=x86-64 -auxbase test -version -o /tmp/ccPHTer7.s
    GNU C (GCC) version 4.8.5 20150623 (Red Hat 4.8.5-11) (x86_64-redhat-linux)
        compiled by GNU C version 4.8.5 20150623 (Red Hat 4.8.5-11), GMP version 6.0.0, MPFR version 3.1.1, MPC version 1.0.1
    GGC heuristics: --param ggc-min-expand=96 --param ggc-min-heapsize=124992
    ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/4.8.5/include-fixed"
    ignoring nonexistent directory "/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../x86_64-redhat-linux/include"
    #include "..." search starts here:
    #include <...> search starts here:
     /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include
     /usr/local/include
     /usr/include
    End of search list.
    GNU C (GCC) version 4.8.5 20150623 (Red Hat 4.8.5-11) (x86_64-redhat-linux)
        compiled by GNU C version 4.8.5 20150623 (Red Hat 4.8.5-11), GMP version 6.0.0, MPFR version 3.1.1, MPC version 1.0.1
    GGC heuristics: --param ggc-min-expand=96 --param ggc-min-heapsize=124992
    Compiler executable checksum: 356f86e67978d665416e07d560c8ba0d
    COLLECT_GCC_OPTIONS='-v' '-mtune=generic' '-march=x86-64'
     as -v --64 -o /tmp/cc5WHEA4.o /tmp/ccPHTer7.s
    GNU assembler version 2.25.1 (x86_64-redhat-linux) using BFD version version 2.25.1-22.base.el7 
    COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/:/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/:/usr/lib/gcc/x86_64-redhat-linux/
    LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/4.8.5/:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../:/lib/:/usr/lib/
    COLLECT_GCC_OPTIONS='-v' '-mtune=generic' '-march=x86-64'
     /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/collect2 --build-id --no-add-needed --eh-frame-hdr --hash-style=gnu -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crt1.o /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtbegin.o -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../.. /tmp/cc5WHEA4.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtend.o /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crtn.o
    $
    
  3. The interesting part is the collect2 command line at the end. Written with one argument per line, that is:

    /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/collect2
    --build-id
    --no-add-needed
    --eh-frame-hdr
    --hash-style=gnu
    -m
    elf_x86_64
    -dynamic-linker
    /lib64/ld-linux-x86-64.so.2
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crt1.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crti.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtbegin.o
    -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5
    -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64
    -L/lib/../lib64
    -L/usr/lib/../lib64
    -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../..
    /tmp/cc5WHEA4.o
    -lgcc
    --as-needed
    -lgcc_s
    --no-as-needed
    -lc
    -lgcc
    --as-needed
    -lgcc_s
    --no-as-needed
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtend.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/crtn.o
    
  4. So, there are a bunch of cr*.o files, plus three libraries: -lc, -lgcc and -lgcc_s to look for, and a bunch of directories to look in: -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5, -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64, -L/lib/../lib64, -L/usr/lib/../lib64, -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../... The /tmp/cc5WHEA4.o is the object file created from test.c.

  5. Applying some clean-up code to the path names, and then using ls to help find the libraries yields a list of files to examine further:

    /lib64/ld-linux-x86-64.so.2
    /usr/lib64/crt1.o
    /usr/lib64/crti.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtbegin.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/crtend.o
    /usr/lib64/crtn.o
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgcc.a
    /usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgcc_s.so
    /usr/lib64/libgcc_s.so.1
    /lib64/libgcc_s.so.1
    /usr/lib64/libgcc_s.so.1
    /usr/lib64/libc.so
    /usr/lib64/libc.so.6
    /lib64/libc.so
    /lib64/libc.so.6
    /usr/lib64/libc.so
    /usr/lib64/libc.so.6
    
  6. That list of files was saved in a file yy (unimaginative name), and then used in:

    $ nm -o $(<yy) | tee nm.log | grep -i atexit
    nm: _trampoline.o: no symbols
    nm: __main.o: no symbols
    nm: _ctors.o: no symbols
    nm: /usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgcc_s.so: no symbols
    nm: /usr/lib64/libgcc_s.so.1: no symbols
    nm: /lib64/libgcc_s.so.1: no symbols
    nm: /usr/lib64/libgcc_s.so.1: no symbols
    nm: /usr/lib64/libc.so: File format not recognized
    /usr/lib64/libc.so.6:00000000003bcc00 b added_atexit_handler.9157
    /usr/lib64/libc.so.6:0000000000038c90 T __cxa_atexit
    /usr/lib64/libc.so.6:0000000000038c90 t __cxa_atexit_internal
    /usr/lib64/libc.so.6:00000000003b6838 d __elf_set___libc_atexit_element__IO_cleanup__
    /usr/lib64/libc.so.6:0000000000038c40 t __internal_atexit
    /usr/lib64/libc.so.6:00000000003b6838 d __start___libc_atexit
    /usr/lib64/libc.so.6:00000000003b6840 d __stop___libc_atexit
    nm: /lib64/libc.so: File format not recognized
    /lib64/libc.so.6:00000000003bcc00 b added_atexit_handler.9157
    /lib64/libc.so.6:0000000000038c90 T __cxa_atexit
    /lib64/libc.so.6:0000000000038c90 t __cxa_atexit_internal
    /lib64/libc.so.6:00000000003b6838 d __elf_set___libc_atexit_element__IO_cleanup__
    /lib64/libc.so.6:0000000000038c40 t __internal_atexit
    nm: /usr/lib64/libc.so: File format not recognized
    /lib64/libc.so.6:00000000003b6838 d __start___libc_atexit
    /lib64/libc.so.6:00000000003b6840 d __stop___libc_atexit
    /usr/lib64/libc.so.6:00000000003bcc00 b added_atexit_handler.9157
    /usr/lib64/libc.so.6:0000000000038c90 T __cxa_atexit
    /usr/lib64/libc.so.6:0000000000038c90 t __cxa_atexit_internal
    /usr/lib64/libc.so.6:00000000003b6838 d __elf_set___libc_atexit_element__IO_cleanup__
    /usr/lib64/libc.so.6:0000000000038c40 t __internal_atexit
    /usr/lib64/libc.so.6:00000000003b6838 d __start___libc_atexit
    /usr/lib64/libc.so.6:00000000003b6840 d __stop___libc_atexit
    $
    
  7. There's no evidence of a plain atexit function there. Where's it hiding, and what's with those 'File format not recognized' messages?

    $ file /usr/lib64/libc.so
    /usr/lib64/libc.so: ASCII text
    $
    
  8. ASCII text? What?

    $ cat /usr/lib64/libc.so
    /* GNU ld script
       Use the shared library, but some functions are only in
       the static library, so try that secondarily.  */
    OUTPUT_FORMAT(elf64-x86-64)
    GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a  AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )
    $
    
  9. OK; what's in /usr/lib64/libc_nonshared.a?

    $  nm /usr/lib64/libc_nonshared.a | grep -i atexit
    atexit.oS:
    0000000000000000 T atexit
                     U __cxa_atexit
    $
    

    Bingo! Found it!

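So it looks like collect2, the linker used by GCC, is able to load files that are not listed on the command line (here through the GROUP directive in the /usr/lib64/libc.so linker script that -lc resolves to), and one of those files is /usr/lib64/libc_nonshared.a, which contains atexit(). You should therefore be able to call atexit() because it is statically linked into the executable ... unless there is some other black magic that I have not spotted.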
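
For the second part of the question (calling atexit or something similar dynamically), one possible workaround follows from this finding: look up __cxa_atexit, which libc.so.6 does export, instead of atexit. The following is a minimal sketch, not a definitive recipe. It assumes a glibc system where the library can be opened as "libc.so.6", and it passes NULL as the DSO handle so that the handler is tied to the process as a whole rather than to a particular shared object. The names cxa_atexit_fn, say_goodbye, libc_exports and register_exit_message are hypothetical and exist only for this illustration.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Signature of __cxa_atexit: handler, argument for the handler, DSO handle. */
typedef int (*cxa_atexit_fn)(void (*func)(void *), void *arg, void *dso_handle);

/* Hypothetical handler: prints the string supplied at registration time. */
void say_goodbye(void *arg) {
    fputs((const char *)arg, stderr);
}

/* Returns 1 if `name` is found in libc.so.6, 0 if it is not,
   and -1 if libc.so.6 itself could not be opened. */
int libc_exports(const char *name) {
    void *libc = dlopen("libc.so.6", RTLD_LAZY);
    if (libc == NULL) {
        return -1;
    }
    int found = dlsym(libc, name) != NULL;
    dlclose(libc);
    return found;
}

/* Looks up __cxa_atexit at run time and registers say_goodbye with it.
   Returns 0 on success, nonzero on failure. */
int register_exit_message(const char *message) {
    void *libc = dlopen("libc.so.6", RTLD_LAZY);
    if (libc == NULL) {
        return -1;
    }
    cxa_atexit_fn reg;
    /* POSIX idiom for converting the void * from dlsym to a function pointer. */
    *(void **)(&reg) = dlsym(libc, "__cxa_atexit");
    if (reg == NULL) {
        return -1;
    }
    /* Assumption: a NULL DSO handle is acceptable for a process-wide handler. */
    return reg(say_goodbye, (void *)message, NULL);
}
```

Calling register_exit_message("bye\n") from main should arrange for say_goodbye to run when the process exits. On older glibc versions the program may need to be linked with -ldl. An FFI such as Racket's could presumably bind __cxa_atexit in the same way, with the three-argument signature shown in the typedef.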


Wow, that's awesome. Thank you for digging this up, you're amazing. - Leif Andersen
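Nice work, Jonathan. If anyone is wondering "why do it this way?", the reason is so that every call to atexit() has the __dso_handle (the shared library handle) available. This supports an obscure compatibility feature for shared libraries: when dlclose() is performed on a shared library, any atexit() handlers registered by that specific library are called before the library is unmapped. This is done by having atexit() call __cxa_atexit() with the extra __dso_handle. That is how glibc does it. Other systems (BSD, Apple) use different techniques to achieve the same goal. - drudru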
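
The mechanism described in the comment above can be sketched as a thin wrapper that is compiled into each executable or shared object. This is an illustration under stated assumptions, not glibc's actual source: the name my_atexit is hypothetical, the declarations of __cxa_atexit and __dso_handle are written by hand, and __dso_handle is assumed to be supplied by the compiler's startup files (crtbegin.o or its equivalent with GCC). The function pointer cast relies on the platform calling convention tolerating an ignored extra argument, which the C standard itself does not guarantee.

```c
#include <stddef.h>

/* Exported by libc.so.6, as the nm --dynamic output in the question shows.
   Declared by hand here; this prototype is an assumption of the sketch. */
extern int __cxa_atexit(void (*func)(void *), void *arg, void *dso_handle);

/* Assumed to be defined once per executable or shared object by the
   compiler's startup files, so its value identifies the calling module. */
extern void *__dso_handle;

/* Same empty handler as in the question's test program. */
void exit_handler(void) {
    return;
}

/* Hypothetical stand-in for the statically linked atexit wrapper:
   forwards to __cxa_atexit and adds the calling module's handle. */
int my_atexit(void (*func)(void)) {
    return __cxa_atexit((void (*)(void *))func, NULL, __dso_handle);
}
```

Because the wrapper is linked statically into whichever module calls it, __dso_handle resolves to that module's own handle, which is what would allow handlers to be run when that module is unloaded with dlclose().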
