Static functions declared in "C" header files

53

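As a rule, I define and declare static functions in source files, that is, in .c files.

In very rare situations, however, I have seen people declare them in a header file. Since static functions have internal linkage, we need to define the function in every file that includes the header declaring it. This looks pretty odd and is far from what we usually want when declaring something static.

On the other hand, if someone tries to use the function without defining it, the compiler will complain. So in some sense, odd as it sounds, doing this is not really unsafe.

My questions are:

  • What is the problem with declaring static functions in header files?
  • What are the risks?
  • What is the impact on compilation time?
  • Is there any risk at run time?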

I would rather ask: what is an acceptable reason for doing that? In my opinion, this is definitely a code smell. - Erich Kitzmueller
7
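Sometimes small static inline functions are used as accessor functions for structure data types, as a (better) alternative to preprocessor macros. Unlike macros, static inline functions provide compile-time type checking and can refer to their parameters more than once (which is problematic with preprocessor macros). - Nominal Animal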
1
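@NominalAnimal Good point. Seeing the source code lets the compiler optimize better (inlining being one such optimization) without help from the linker. - Peter - Reinstate Monica
Can somebody comment on the size impact of defining the same function over and over in many compilation units? I assume the linker (given different TUs) cannot fold the copies into one, so the object code gets multiplied? - Peter - Reinstate Monica
@PeterA.Schneider: For static inline functions that replace macros, keeping them minimal is important. Function calls tend to require some register shuffling (to pass the arguments), which many compilers can avoid by using the registers that already hold the data. It varies, of course, but I would say that if inlining a function increases code size significantly, the function should not be inlined in the first place. - Nominal Animal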
3 Answers

42

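First I'd like to clarify my understanding of the situation you describe: the header contains only a static function declaration, while the C file contains the definition, i.e. the function's source code. For example: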

some.h:

static void f(void);
// potentially more declarations

some.c:

#include <stdio.h>
#include "some.h"
static void f(void) { printf("Hello world\n"); }
// more code, some of it potentially using f()
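If this is the situation you describe, I take issue with your remark:

"Since static functions have internal linkage we need to define it in every file we include the header file where the function is declared."

If you declare the function but do not use it in a given translation unit, I don't think you have to define it. GCC accepts that with a warning; the standard does not seem to forbid it, unless I missed something. This may matter in your scenario, because translation units that include the header with the declaration but do not use the function do not have to provide an unused definition.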
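A minimal sketch of that point, assuming a translation unit that has picked up the declaration (as if from some.h) but never calls f(). The name unrelated_work is hypothetical and only illustrates the idea. GCC is expected to warn that f is declared static but never defined, yet the file should still compile, as long as warnings are not treated as errors:

```c
/* Stands in for `#include "some.h"`: a static declaration, no definition here. */
static void f(void);

/* Hypothetical code in this translation unit that never refers to f(). */
int unrelated_work(int x)
{
    return 2 * x;
}
```

As soon as this file actually called f(), it would also have to define it.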
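Now let's examine the questions:

  • What is the problem with declaring static functions in header files?
    It is somewhat unusual. Typically, static functions are functions needed in only one file. They are declared static to make that explicit by limiting their visibility. Declaring them in a header is therefore somewhat antithetical. If the function is in fact used in several files with identical definitions, it should be made external, with a single definition. If only one translation unit actually uses it, the declaration does not belong in a header.
    One possible scenario, therefore, is to ensure a uniform function signature for different implementations in their respective translation units. The common header leads to a compile-time error for differing return types in C (and C++); differing parameter types cause a compile-time error only in C, and only if the header declaration is a full prototype (not in C++, because of function overloading).
  • What are the risks?
    I do not see risks in your scenario. (As opposed to also putting the function definition in a header, which may violate the encapsulation principle.)
  • What is the impact on compilation time?
    A function declaration is small and simple, so the overhead of one more declaration in a header is probably negligible. But if you create and include an additional header for the declaration in many translation units, the file-handling overhead can be significant (i.e. the compiler spends a lot of time idle, waiting for header I/O).
  • Is there any risk at run time?
    I cannot see any.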

21
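This is not an answer to the stated questions, but hopefully shows why one might implement a static (or static inline) function in a header file. Personally, I can think of only two good reasons to declare some functions static in a header file: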
  1. If the header file completely implements an interface that should only be visible in the current compilation unit

    This is extremely rare, but might be useful in e.g. an educational context, at some point during the development of some example library; or perhaps when interfacing to another programming language with minimal code.

    A developer might choose to do so if the library or interface implementation is trivial or nearly so, and ease of use (to the developer using the header file) is more important than code size. In these cases, the declarations in the header file often use preprocessor macros, allowing the same header file to be included more than once, providing some sort of crude polymorphism in C.

    Here is a practical example: Shoot-yourself-in-the-foot playground for linear congruential pseudorandom number generators. Because the implementation is local to the compilation unit, each compilation unit will get their own copies of the PRNG. This example also shows how crude polymorphism can be implemented in C.

    prng32.h:

    #if defined(PRNG_NAME) && defined(PRNG_MULTIPLIER) && defined(PRNG_CONSTANT) && defined(PRNG_MODULUS)
    #define MERGE3_(a,b,c) a ## b ## c
    #define MERGE3(a,b,c) MERGE3_(a,b,c)
    #define NAME(name) MERGE3(PRNG_NAME, _, name)
    
    static uint32_t NAME(state) = 0U;
    
    static uint32_t NAME(next)(void)
    {
        NAME(state) = ((uint64_t)PRNG_MULTIPLIER * (uint64_t)NAME(state) + (uint64_t)PRNG_CONSTANT) % (uint64_t)PRNG_MODULUS;
        return NAME(state);
    }
    
    #undef NAME
    #undef MERGE3
    #endif
    
    #undef PRNG_NAME
    #undef PRNG_MULTIPLIER
    #undef PRNG_CONSTANT
    #undef PRNG_MODULUS
    

    An example using the above, example-prng32.c:

    #include <stdlib.h>
    #include <stdint.h>
    #include <stdio.h>
    
    #define PRNG_NAME       glibc
    #define PRNG_MULTIPLIER 1103515245UL
    #define PRNG_CONSTANT   12345UL
    #define PRNG_MODULUS    2147483647UL
    #include "prng32.h"
    /* provides glibc_state and glibc_next() */
    
    #define PRNG_NAME       borland
    #define PRNG_MULTIPLIER 22695477UL
    #define PRNG_CONSTANT   1UL
    #define PRNG_MODULUS    2147483647UL
    #include "prng32.h"
    /* provides borland_state and borland_next() */
    
    int main(void)
    {
        int i;
    
        glibc_state = 1U;
        printf("glibc lcg: Seed %u\n", (unsigned int)glibc_state);
        for (i = 0; i < 10; i++)
            printf("%u, ", (unsigned int)glibc_next());
        printf("%u\n", (unsigned int)glibc_next());
    
        borland_state = 1U;
        printf("Borland lcg: Seed %u\n", (unsigned int)borland_state);
        for (i = 0; i < 10; i++)
            printf("%u, ", (unsigned int)borland_next());
        printf("%u\n", (unsigned int)borland_next());
    
        return EXIT_SUCCESS;
    }
    

    The reason for marking both the _state variable and the _next() function static is that this way each compilation unit that includes the header file has their own copy of the variables and the functions -- here, their own copy of the PRNG. Each must be separately seeded, of course; and if seeded to the same value, will yield the same sequence.

    One should generally shy away from such polymorphism attempts in C, because it leads to complicated preprocessor macro shenanigans, making the implementation much harder to understand, maintain, and modify than necessary.

    However, when exploring the parameter space of some algorithm (here, the family of 32-bit linear congruential generators), this lets us use a single implementation for each of the generators we examine, ensuring there are no implementation differences between them. Note that even this case is more of a development tool, and not something you ought to see in an implementation provided for others to use.


  2. If the header implements simple static inline accessor functions

    Preprocessor macros are commonly used to simplify code accessing complicated structure types. static inline functions are similar, except that they also provide type checking at compile time, and can refer to their parameters several times (with macros, that is problematic).
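To make the "refer to their parameters several times" point concrete, here is a small sketch. All names in it (SQUARE_MACRO, square_inline, next_value and the two counting helpers) are hypothetical and not part of the code discussed in this answer. It counts how often an argument with a side effect is evaluated when passed to a macro versus a static inline function:

```c
/* Hypothetical macro accessor: the argument text is pasted twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Hypothetical static inline equivalent: x is evaluated once, as an int. */
static inline int square_inline(int x)
{
    return x * x;
}

static int calls;   /* how many times next_value() has run */

static int next_value(void)
{
    return ++calls;
}

/* Number of times the argument expression runs when passed to the macro. */
int evaluations_with_macro(void)
{
    calls = 0;
    (void)SQUARE_MACRO(next_value());
    return calls;
}

/* Number of times the argument expression runs when passed to the function. */
int evaluations_with_inline(void)
{
    calls = 0;
    (void)square_inline(next_value());
    return calls;
}
```

With the macro, next_value() runs twice because its text is expanded twice; with the static inline function it runs once, and the compiler also checks that the argument converts to int.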

    One practical use case is a simple interface for reading files using low-level POSIX.1 I/O (using <unistd.h> and <fcntl.h> instead of <stdio.h>). I've done this myself when reading very large (dozens of megabytes to gigabytes range) text files containing real numbers (with a custom float/double parser), as the GNU C standard I/O is not particularly fast.

    For example, inbuffer.h:

    #ifndef   INBUFFER_H
    #define   INBUFFER_H
    
    typedef struct {
        unsigned char  *head;       /* Next buffered byte */
        unsigned char  *tail;       /* Next byte to be buffered */
        unsigned char  *ends;       /* data + size */
        unsigned char  *data;
        size_t          size;
        int             descriptor;
        unsigned int    status;     /* Bit mask */
    } inbuffer;
    #define INBUFFER_INIT { NULL, NULL, NULL, NULL, 0, -1, 0 }
    
    int inbuffer_open(inbuffer *, const char *);
    int inbuffer_close(inbuffer *);
    
    int inbuffer_skip_slow(inbuffer *, const size_t);
    int inbuffer_getc_slow(inbuffer *);
    
    static inline int inbuffer_skip(inbuffer *ib, const size_t n)
    {
        if (ib->head + n <= ib->tail) {
            ib->head += n;
            return 0;
        } else
            return inbuffer_skip_slow(ib, n);
    }
    
    static inline int inbuffer_getc(inbuffer *ib)
    {
        if (ib->head < ib->tail)
            return *(ib->head++);
        else
            return inbuffer_getc_slow(ib);
    }
    
    #endif /* INBUFFER_H */
    

    Note that the above inbuffer_skip() and inbuffer_getc() do not check if ib is non-NULL; this is typical for such functions. These accessor functions are assumed to be "in the fast path", i.e. called very often. In such cases, even the function call overhead matters (and is avoided with static inline functions, since they are duplicated in the code at the call site).

    Trivial accessor functions, like the above inbuffer_skip() and inbuffer_getc(), may also let the compiler avoid the register moves involved in function calls, because functions expect their parameters to be located in specific registers or on the stack, whereas inlined functions can be adapted (wrt. register use) to the code surrounding the inlined function.

    Personally, I do recommend writing a couple of test programs using the non-inlined functions first, and comparing the performance and results to the inlined versions. Comparing the results ensures the inlined versions do not have bugs (off-by-one errors are common here!), and comparing the performance and generated binaries (size, at least) tells you whether inlining is worth it in general.
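The header above only declares inbuffer_getc_slow(); its implementation is not shown. As a rough sketch of how such a fast-path/slow-path split might fit together, the following uses a reduced stand-in type. The names minibuf, minibuf_init, minibuf_getc and minibuf_getc_slow, the fixed 4096-byte buffer, and the choice of -1 for both end of input and errors are all assumptions made for illustration, not the original implementation:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>   /* read(), ssize_t */

/* Reduced, hypothetical stand-in for the inbuffer type. */
typedef struct {
    unsigned char *head;        /* Next buffered byte */
    unsigned char *tail;        /* One past the last buffered byte */
    unsigned char  data[4096];  /* Fixed-size storage (assumption) */
    int            descriptor;
} minibuf;

static inline void minibuf_init(minibuf *mb, int descriptor)
{
    mb->head = mb->tail = mb->data;   /* start with an empty buffer */
    mb->descriptor = descriptor;
}

/* Slow path: refill from the descriptor; -1 on end of input or error. */
int minibuf_getc_slow(minibuf *mb)
{
    ssize_t n;

    do {
        n = read(mb->descriptor, mb->data, sizeof mb->data);
    } while (n == -1 && errno == EINTR);   /* retry if interrupted */

    if (n <= 0)
        return -1;

    mb->head = mb->data;
    mb->tail = mb->data + n;
    return *(mb->head++);
}

/* Fast path: one comparison and one increment when data is buffered. */
static inline int minibuf_getc(minibuf *mb)
{
    if (mb->head < mb->tail)
        return *(mb->head++);
    return minibuf_getc_slow(mb);
}
```

Only the fast path needs to live in the header; the slow path is an ordinary external function, so its code exists once no matter how many translation units include the header.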


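In example 1, the reason you give for putting something in a header, "should only be visible in this TU", struck me as odd at first! Headers are usually for the opposite. But I understand your use case: you define a configurable framework that is the same for (i.e. shared by) all TUs that use it, yet each TU can configure the mechanism to its specific needs. - Peter - Reinstate Monica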
1
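@miguelazevedo The canonical way to express gratitude on Stack Overflow is to upvote an answer and/or mark it as the accepted answer ;-). - Peter - Reinstate Monica
@Peter A. Schneider, I tried, but it seems I don't have that privilege. - miguel azevedo
@PeterA.Schneider: Exactly. - Nominal Animal
@miguelazevedo: That is because you are a new member; on Stack Overflow and related sites you gain additional privileges as you earn reputation. For example, upvoting a post requires 15 or more reputation. So don't worry about upvoting for now. Besides, you can always come back later and upvote; after all, each answer can only be upvoted once. - Nominal Animal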

1
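Why would you want both a global and a static function? In C, functions are global by default. You only use static functions when you want to limit access to a function to the file in which it is declared. So by declaring it static you actively restrict access to it...
The only things that need to be implemented in header files are C++ template functions and template class member functions.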

5
The question is tagged "c", so templates are irrelevant. - Keith Thompson
3
"Global" is not a precise term. Some people call static int x; at file scope a global variable. - M.M
