Cross-platform format string for variables of type size_t?

Question

33

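On a cross-platform C/C++ project (Win32, Linux, OSX), I need to use the *printf functions to print some variables of type size_t. In some environments size_t is 8 bytes and in others it is 4. On glibc I can use %zd, and on Win32 I can use %Id. Is there an elegant way to handle this?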

- twk

Note: %zd is standard C99, and Microsoft has been very reluctant to implement it. - Ciro Santilli OurBigBook.com

@CiroSantilli It is also C++11. - L. F.

%zd is now implemented in Visual Studio. See also https://dev59.com/62Uo5IYBdhLWcg3w_DlK - Étienne

10 Answers

14

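There are really two questions here. The first is what the correct printf specifier string is for each of the three platforms. Note that size_t is an unsigned type.

On Windows, use "%Iu".

On Linux and OSX, use "%zu".

The second question is how to support multiple platforms, given that the format string may differ on each one. As others have pointed out, using #ifdef gets ugly quickly.

Instead, write a separate makefile or project file for each target platform. Then refer to the specifier by some macro name in your source files, and define that macro appropriately in each makefile. In particular, both GCC and Visual Studio accept a "D" switch (-D and /D respectively) to define macros on the command line.

If your build system is very complicated (multiple build options, generated sources, etc.), maintaining three separate makefiles may become unwieldy, and you will need some kind of advanced build system such as CMake or the GNU autotools. But the basic principle is the same: use the build system to define platform-specific macros instead of putting platform-detection logic in your source files.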

- jkl

2

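Good point that %Iu (Windows) and %zu (Mac/Linux) are more officially correct than what the question suggests. Windows officially defines the _WIN32 macro, and I personally find small, focused #ifdef clauses based on it easier to work with than per-platform makefiles. That said, I use Visual Studio and Xcode, which effectively have their own versions of makefiles. The difference comes down to minimizing the number of macro definitions versus the number of #ifdef cases. - Tyler

According to 'cppcheck', both formats can be used on Windows. - alcor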

9

The only thing I can think of is the typical:

#ifdef __WIN32__ // or whatever
#define SSIZET_FMT "%ld"
#else
#define SSIZET_FMT "%zd"
#endif

and then take advantage of constant folding (the compiler concatenates adjacent string literals):

fprintf(stream, "Your size_t var has value " SSIZET_FMT ".", your_var);

- ΤΖΩΤΖΙΟΥ

Heh, I was hoping it wouldn't come to this. - twk

Hopefully someone else will come up with something better... - tzot

5

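Dan Saks wrote an article in Embedded Systems Design that covers this issue. According to Dan, %zu is the standard way, but only a few compilers supported it. As an alternative, he recommended using %lu together with an explicit cast of the argument to unsigned long: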

size_t n;
...
printf("%lu", (unsigned long)n);

- Frederico

1

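This is not great on systems that use the LLP64 programming model, such as 64-bit Windows. - bk1e

%zu is a C99 invention. Those compilers are indeed rare. C++ never had this problem in the first place. - MSalters

1

%zu has nothing to do with the compiler; it depends on the standard library... - plinth

%lu and unsigned long are only a problem on 64-bit systems if you want to display values exceeding 2^32-1. Otherwise, always cast to unsigned long long (which works just as well on 32-bit) and use %llu. - paniq

Here is an archived version of Dan Saks's article: Archived version. - Nathan Mills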

3

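Use boost::format. It is type-safe, so it prints size_t correctly with %d. You also don't need to remember to call c_str() on a std::string when using it, and it works even if you pass a number to %s or vice versa.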

- Lev

2

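You just have to find the widest integer type available, cast the value to it, and then use the format string that matches that wider type. Note that this solution works for any type (ptrdiff_t, etc.), not just size_t.

What you want to use are uintmax_t and the format macro PRIuMAX. For Visual C++, you will need to download C99-compatible stdint.h and inttypes.h headers, because Microsoft does not provide them.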

See also:

http://www.embedded.com/columns/technicalinsights/204700432

This article corrects the mistakes in the article cited by Frederico.

- youcantmakemeregister

2

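I don't know of any satisfying solution, but you might consider writing a specialized function that formats size_t items into a string, and then printing that string.

(Alternatively, if you can use it, boost::format handles this kind of thing with ease.)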

- Head Geek

2

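Option 1 (best and most robust answer for cross-platform usage): `inttypes.h` hack using the `PRIuPTR` type specifiers

Since the PRIuPTR (unsigned pointer) printf format string from inttypes.h is also wide enough to hold a size_t on most (if not all?) systems, I recommend using the following defines as size_t printf format strings.

It is important, however, that you verify this works for your particular architecture (compiler, hardware, etc.), because the C and C++ language standards do not enforce it.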

#include <inttypes.h>

// Custom printf format strings for `size_t` variable types, which are nearly
// always the same size as pointers on any given architecture (though this is
// NOT enforced by the standard!)
//
// `size_t` is an unsigned decimal integer (ex: usually 8 bytes on a 64-bit
// system, 4 bytes on a 32-bit system, or 2 bytes on an 8-bit system) so you
// can simply print it as though it was a pointer!:
#define PRIuSZT PRIuPTR  // u = unsigned decimal integer
#define PRIxSZT PRIxPTR  // x = unsigned decimal integer in lower-case hex
#define PRIXSZT PRIXPTR  // X = unsigned decimal integer in upper-case hex
#define PRIoSZT PRIoPTR  // o = unsigned decimal integer in octal

// The above representations make the most sense. Other representations are
// below, though these will interpret the `size_t` type as though it was
// signed, which doesn't make much sense, since it's not:
#define PRIdSZT PRIdPTR  // d = signed decimal integer
#define PRIiSZT PRIiPTR  // i = signed decimal integer

// For `ssize_t` (signed size_t) types, however, the d and i specifiers *do*
// make sense, so you could do this:
#define PRIdSSZT PRIdPTR  // d = signed decimal integer
#define PRIiSSZT PRIiPTR  // i = signed decimal integer

Example usage of the defines above, printing a `size_t` in different representations:

size_t my_variable = 123456789;

// print the `size_t` type as an unsigned decimal integer
printf("%" PRIuSZT "\n", my_variable); 
// print the `size_t` type as an unsigned decimal integer in lower-case hex
printf("0x%" PRIxSZT "\n", my_variable); 
// print the `size_t` type as an unsigned decimal integer in upper-case hex
printf("0X%" PRIXSZT "\n", my_variable); 
// print the `size_t` type as an unsigned decimal integer in octal
printf("0%" PRIoSZT "\n", my_variable); 

// Print the hex values again, this time with leading zero-padding to print a
// full 8-bytes (16 chars) worth of data
printf("\n");
printf("0x%016" PRIxSZT "\n", my_variable); // lower-case unsigned hex 
printf("0X%016" PRIXSZT "\n", my_variable); // upper-case unsigned hex

// Print the octal value again, this time with leading zero-padding to print a
// full 8-bytes (22 chars) worth of data
printf("\n");
printf("0%022" PRIoSZT "\n", my_variable);  // unsigned octal

Sample output on a 64-bit Linux system:

123456789
0x75bcd15
0X75BCD15
0726746425

0x00000000075bcd15
0X00000000075BCD15

00000000000000726746425

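Option 2: use `%zu` where it is supported

On some systems, however, such as STM32 microcontrollers using gcc as the compiler, the %z length specifier is not implemented, and something like printf("%zu\n", my_size_t_num); may simply print a literal %zu (I tested this personally and found it to be true) instead of the value of your size_t variable.

Where it is available, however, simply use the %zu "z" length specifier for size_t types:

Example usage: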

size_t my_variable = 123456789;

printf("%zu\n", my_variable);

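Option 3 (virtually guaranteed to work on all existing architectures, since `uint64_t` is large enough to hold all known `size_t` types): use the `inttypes.h` `PRIu64` specifier, and cast your `size_t` variable to `uint64_t` before printing

If you need this to work on all architectures, or you are unsure about your particular architecture, cast to and print as uint64_t. This is virtually guaranteed to work on all systems, but it requires the extra cast, and on smaller architectures it may needlessly convert to a larger size than required (and thus cost more processor instructions).

Example usage: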

#include <stdint.h>    // for uint64_t
#include <inttypes.h>  // for PRIu64

size_t my_variable = 123456789;

// print the `size_t` type as an unsigned decimal integer
printf("%" PRIu64 "\n", (uint64_t)my_variable);
// print the `size_t` type as an unsigned decimal integer in lower-case hex
printf("0x%" PRIx64 "\n", (uint64_t)my_variable);
// print the `size_t` type as an unsigned decimal integer in upper-case hex
printf("0X%" PRIX64 "\n", (uint64_t)my_variable);

// etc etc. See the Option 1 examples and follow the same patterns.

Full example program demonstrating all 3 options above

From my eRCaGuy_hello_world repo:

print_size_t.c (runs perfectly as both C and C++):

#include <stdio.h>
#include <inttypes.h>

// Custom printf format strings for `size_t` variable types, which are nearly
// always the same size as pointers on any given architecture (though this is
// NOT enforced by the standard!)
//
// `size_t` is an unsigned decimal integer (ex: usually 8 bytes on a 64-bit
// system, 4 bytes on a 32-bit system, or 2 bytes on an 8-bit system) so you
// can simply print it as though it was a pointer!:
#define PRIuSZT PRIuPTR  // u = unsigned decimal integer
#define PRIxSZT PRIxPTR  // x = unsigned decimal integer in lower-case hex
#define PRIXSZT PRIXPTR  // X = unsigned decimal integer in upper-case hex
#define PRIoSZT PRIoPTR  // o = unsigned decimal integer in octal

// The above representations make the most sense. Other representations are
// below, though these will interpret the `size_t` type as though it was
// signed, which doesn't make much sense, since it's not:
#define PRIdSZT PRIdPTR  // d = signed decimal integer
#define PRIiSZT PRIiPTR  // i = signed decimal integer

// For `ssize_t` (signed size_t) types, however, the d and i specifiers *do*
// make sense, so you could do this:
#define PRIdSSZT PRIdPTR  // d = signed decimal integer
#define PRIiSSZT PRIiPTR  // i = signed decimal integer


int main()
{
    printf("Hello World\n");

    size_t my_variable = 123456789;
    printf("sizeof(my_variable) = %" PRIuSZT "\n", sizeof(my_variable));

    // -------------------------------------------------------------------------
    // Option 1 (best and most-robust answer for cross-platform usage):
    // `inttypes.h` hack using `PRIuPTR` type specifiers
    // -------------------------------------------------------------------------
    printf("\n===== Option 1 =====\n");

    // print the `size_t` type as an unsigned decimal integer
    printf("%" PRIuSZT "\n", my_variable);
    // print the `size_t` type as an unsigned decimal integer in lower-case hex
    printf("0x%" PRIxSZT "\n", my_variable);
    // print the `size_t` type as an unsigned decimal integer in upper-case hex
    printf("0X%" PRIXSZT "\n", my_variable);
    // print the `size_t` type as an unsigned decimal integer in octal
    printf("0%" PRIoSZT "\n", my_variable);

    // Print the hex values again, this time with leading zero-padding to print
    // a full 8-bytes (16 chars) worth of data
    printf("\n");
    printf("0x%016" PRIxSZT "\n", my_variable); // lower-case unsigned hex
    printf("0X%016" PRIXSZT "\n", my_variable); // upper-case unsigned hex

    // Print the octal value again, this time with leading zero-padding to print
    // a full 8-bytes (22 chars) worth of data
    printf("\n");
    printf("0%022" PRIoSZT "\n", my_variable);  // unsigned octal


    // -------------------------------------------------------------------------
    // Option 2: use `%zu` where it is supported
    // -------------------------------------------------------------------------
    printf("\n===== Option 2 =====\n");

    printf("%zu\n", my_variable);


    // -------------------------------------------------------------------------
    // Option 3 (virtually guaranteed to work since `uint64_t` is large enough
    // to hold all known `size_t` types on all existing architectures): use the
    // `inttypes.h` `PRIu64` specifier, with the addition of casting your
    // `size_t` variable to `uint64_t` before printing
    // -------------------------------------------------------------------------
    printf("\n===== Option 3 =====\n");

    // print the `size_t` type as an unsigned decimal integer
    printf("%" PRIu64 "\n", (uint64_t)my_variable);
    // print the `size_t` type as an unsigned decimal integer in lower-case hex
    printf("0x%" PRIx64 "\n", (uint64_t)my_variable);
    // print the `size_t` type as an unsigned decimal integer in upper-case hex
    printf("0X%" PRIX64 "\n", (uint64_t)my_variable);

    // etc etc. See the Option 1 examples and follow the same patterns.


    return 0;
}

Sample run and output:

eRCaGuy_hello_world/c$ gcc -Wall -Wextra -Werror -O3 -std=gnu17 print_size_t.c -o bin/a -lm && bin/a
Hello World
sizeof(my_variable) = 8

===== Option 1 =====
123456789
0x75bcd15
0X75BCD15
0726746425

0x00000000075bcd15
0X00000000075BCD15

00000000000000726746425

===== Option 2 =====
123456789

===== Option 3 =====
123456789
0x75bcd15
0X75BCD15

- Gabriel Staples

1

size_t is an unsigned type of at least 16 bits. Widths of 32 and 64 bits are commonly seen.

printf("%zu\n", some_size_t_object); // Standard since C99

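The above is the best way going forward, but if the code also needs to port to pre-C99 platforms, cast the value to some wide type. unsigned long is a reasonable candidate, yet it may not be wide enough.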

// OK, yet insufficient with large sizes > ULONG_MAX
printf("%lu\n", (unsigned long) some_size_t_object);

Or use conditional code:

#ifdef ULLONG_MAX
  printf("%llu\n", (unsigned long long) some_size_t_object); 
#else
  printf("%lu\n", (unsigned long) some_size_t_object); 
#endif

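Lastly, consider double. It is a bit inefficient, but it should handle all ancient and new platforms until about 2030-2040 (going by Moore's law), at which point double may no longer give an exact result.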

printf("%.0f\n", (double) some_size_t_object);

- chux - Reinstate Monica

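@rubenvb A typical double has 53 bits of precision, so gaps would appear at about 9 million GB (2^53 bytes), not at the 32 bits suggested. Even a pedantic reading of the C spec requires 33+ bits of precision. What is the source of your 32-bit precision? - chux - Reinstate Monica

The 32 comes from there being no integer size between 32 and 64, and, as you say, 64 bits is too many for a double. So yes, gaps may be uncommon when this is used to represent sizes, but the question (title) is about how to print a size_t. Using double is not a good general solution, nor will it perform well... - rubenvb

@rubenvb, FWIW, I have used (u)int48_t, though of course that is not a common type. For printing the size of an object, 53 bits is enough for the next decade, and using double is portable for that purpose. But to your point, OP did say "print some variables of type size_t", so the values could exceed the size of a single object as a result of various calculations. So this does not meet OP's precise goal. The performance concern is slight at best and, at worst, simple premature optimization. OP's goal was cross-platform, not speed. Curious: what solution do you think is best for cross-platform? - chux - Reinstate Monica

Use a decent C99 compiler/library that supports the proper printf format strings/macros. If you don't have one, define them yourself. Really, it is Visual Studio that makes this harder than it should be. - rubenvb

1

@rubenvb OK. I take the cross-platform tag to also call for a C89 to C11 solution, without forcing a change of compiler. Agreed that using the latest compliant compiler solves this. Thanks for the feedback; I agree with most of it. - chux - Reinstate Monica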

1

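My choice for this problem is simply to cast the size_t argument to unsigned long and use %lu everywhere. This, of course, only works where values are not expected to exceed 2^32-1. If that is too small for you, you can always cast to unsigned long long and format it as %llu.

Either way, your strings will never be awkward.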

- paniq

- finnw · Accepted Answer

24

The PRIuPTR macro (from <inttypes.h>) defines a decimal format for uintptr_t, which should always be large enough that you can cast a size_t to it without truncation, e.g.:

fprintf(stream, "Your size_t var has value %" PRIuPTR ".", (uintptr_t) your_var);

- finnw

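finnw, you'd better change your PRIuPTR to `PRIuPTR`, and prepend 4 spaces to your "fprintf" line, so that they get formatted as code and nobody confuses PRIuPTR with PRluPTR (as it shows here). - tzot

@ΤΖΩΤΖΙΟY, I added the prefix, but what do the backticks around PRIuPTR do? I can't see any difference. - finnw

Backticks don't seem to work in comments. Not sure why. There is also no preview for comments. - Rhythmic Fistman

4

If this is C++, don't forget to define __STDC_FORMAT_MACROS before including <inttypes.h>. - Sergey Shandar

Visual Studio 2013 now supports inttypes.h. - BSalita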
