How do I print a wchar in the Linux console?

Question

8

Below is my C program. In bash, the program prints "char is" but does not print "Ω". My locale settings are all en_US.utf8.

#include <stdio.h>
#include <wchar.h>
#include <stdlib.h>

int main() {
   int r;
   wchar_t myChar1 = L'Ω';
   r = wprintf(L"char is %c\n", myChar1);
}

- davy

3 Answers

6

Besides the answer that suggests fixing the libc usage, you can also do this:

#include <stdio.h>
#include <wchar.h>
#include <stdlib.h>

// NOTE: *NOT* thread safe, not re-entrant
const char* unicode_to_utf8(wchar_t c)
{
    static unsigned char b_static[5];
    unsigned char* b = b_static; 

    if (c<(1<<7))// 7 bit Unicode encoded as plain ascii
    {
        *b++ = (unsigned char)(c);
    }
    else if (c<(1<<11))// 11 bit Unicode encoded in 2 UTF-8 bytes
    {
        *b++ = (unsigned char)((c>>6)|0xC0);
        *b++ = (unsigned char)((c&0x3F)|0x80);
    }
    else if (c<(1<<16))// 16 bit Unicode encoded in 3 UTF-8 bytes
    {
        *b++ = (unsigned char)(((c>>12))|0xE0);
        *b++ = (unsigned char)(((c>>6)&0x3F)|0x80);
        *b++ = (unsigned char)((c&0x3F)|0x80);
    }
    else if (c<(1<<21))// 21 bit Unicode encoded in 4 UTF-8 bytes
    {
        *b++ = (unsigned char)(((c>>18))|0xF0);
        *b++ = (unsigned char)(((c>>12)&0x3F)|0x80);
        *b++ = (unsigned char)(((c>>6)&0x3F)|0x80);
        *b++ = (unsigned char)((c&0x3F)|0x80);
    }
    *b = '\0';
    return b_static;
}


int main() {
    int r;
    wchar_t myChar1 = L'Ω';
    r = printf("char is %s\n", unicode_to_utf8(myChar1));
    return 0;
}

- RushPL

2

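This answer is kind of silly; the only point of using wchar_t is that it can, in theory, support different output encodings in different locales. If you want to hard-code UTF-8, just use char *myChar1 = "Ω"; and then printf with %s... - R.. GitHub STOP HELPING ICE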

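I consider my answer a workaround, or possibly a solution for some more limited use cases. I like the answer that was selected as the solution, so no argument there. Cheers. - RushPL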

4

Use one of {glib, libiconv, ICU} to convert it to UTF-8 before outputting.
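As an illustration of the conversion step, here is a minimal sketch using the iconv interface. The helper name wchar_to_utf8 is hypothetical, and the sketch assumes an iconv implementation that accepts "WCHAR_T" as an encoding name (glibc and GNU libiconv do; other implementations may not).

```c
#include <iconv.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not part of any library): converts one wchar_t to a
 * NUL-terminated UTF-8 string in `out`. `out_size` must be at least 5.
 * Returns 0 on success, -1 on failure. */
int wchar_to_utf8(wchar_t wc, char *out, size_t out_size)
{
    if (out == NULL || out_size < 5)
        return -1;

    /* Assumption: "WCHAR_T" names the platform's wchar_t encoding. */
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");
    if (cd == (iconv_t)-1)
        return -1;

    char *in = (char *)&wc;          /* iconv works on raw bytes */
    size_t in_left = sizeof wc;
    char *dst = out;
    size_t out_left = out_size - 1;  /* keep room for the terminator */

    size_t rc = iconv(cd, &in, &in_left, &dst, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *dst = '\0';
    return 0;
}
```

After a successful call such as wchar_to_utf8(L'Ω', buf, sizeof buf), the buffer can be printed with printf("char is %s\n", buf). On systems where iconv lives in a separate library, linking with -liconv may be needed.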

- Ignacio Vazquez-Abrams

Thanks. Can I do it without these libraries? - davy
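One library-free possibility, offered here as a sketch rather than something proposed in the thread, is the standard C function wcrtomb, which encodes a wide character in the multibyte encoding of the current locale. The helper name wchar_to_locale_bytes is hypothetical, and the result is UTF-8 only if the active LC_CTYPE locale uses UTF-8.

```c
#include <limits.h>   /* MB_LEN_MAX */
#include <locale.h>   /* setlocale, to be called by the user of the helper */
#include <string.h>   /* memset */
#include <wchar.h>    /* wcrtomb, mbstate_t */

/* Hypothetical helper: encodes `wc` in the current locale's multibyte
 * encoding. `out` must hold at least MB_LEN_MAX + 1 bytes.
 * Returns the number of bytes written (excluding the terminator),
 * or -1 if `wc` cannot be represented in the current locale. */
int wchar_to_locale_bytes(wchar_t wc, char *out)
{
    mbstate_t state;
    memset(&state, 0, sizeof state);   /* start in the initial shift state */

    size_t n = wcrtomb(out, wc, &state);
    if (n == (size_t)-1)
        return -1;                     /* errno is set to EILSEQ */

    out[n] = '\0';
    return (int)n;
}
```

The caller is expected to run setlocale(LC_CTYPE, "") once at startup so that the environment's locale (en_US.utf8 in the question) is in effect; the resulting bytes can then be passed to printf with %s. In the default C locale the call is expected to fail for Ω, since that locale cannot represent it.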

- vstm · Accepted Answer

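This is interesting. Apparently the compiler translates the omega from UTF-8 to Unicode, but somehow the libc messes it up.

First of all: the %c format specifier expects a char (even in the wprintf version), so you have to specify %lc (and, likewise, %ls for strings).

Secondly, if you run your code like that, the locale is set to C (it isn't automatically taken from the environment). You have to call setlocale with an empty string to take the locale from the environment, so that the libc works properly again.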

#include <stdio.h>
#include <wchar.h>
#include <stdlib.h>
#include <locale.h>

int main() {
    int r;
    wchar_t myChar1 = L'Ω';
    setlocale(LC_CTYPE, "");
    r = wprintf(L"char is %lc (%x)\n", myChar1, myChar1);
}
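
The %ls specifier mentioned above works the same way for wide strings. The following is a minimal sketch, with a hypothetical helper name (print_label_and_char) that is not part of the original answer; it takes the stream as a parameter so that the return value can be checked.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper, not from the answer: writes a wide string followed by
 * a wide character to `stream`. Returns fwprintf's result: the number of
 * wide characters written, or a negative value on error. */
int print_label_and_char(FILE *stream, const wchar_t *label, wchar_t c)
{
    /* %ls takes a const wchar_t *, %lc takes a wide character (wint_t). */
    return fwprintf(stream, L"%ls %lc\n", label, (wint_t)c);
}
```

To print to the console, call setlocale(LC_CTYPE, "") first, as in the program above, and then print_label_and_char(stdout, L"char is", L'Ω'). Note that a stream becomes byte-oriented or wide-oriented with its first I/O call, so mixing printf and wprintf on the same stream should be avoided.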