How can I use subscript digits in my C++ program?

Question

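I'm writing a C++ program that involves a fair amount of math. As such, I'm trying to denote certain objects as having subscript numbers in a wstring member variable of their class. However, any attempt to store these characters forces them into their non-subscript counterparts. By contrast, characters pasted directly into the code are preserved as desired. Here are the cases I experimented with: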

setlocale(LC_ALL, "");
wchar_t txt = L'\u2080';
wcout << txt << endl;
myfile << txt << endl;

This outputs "0" to both the file and the console.

setlocale(LC_ALL, "");
wcout << L"x₀₁" << endl;
myfile << L"x₀₁" << endl;

This outputs "x01" to both the file and the console.

setlocale(LC_ALL, "");
wcout << "x₀₁" << endl;
myfile << "x₀₁" << endl;

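This outputs "xâ'?â'?" to the console, which I would like to avoid if at all possible, and "x₀₁" to the file, which is what I want. Ideally the program would output correctly to both the file and the console, but if that is not possible, printing non-subscript characters to the console is preferable.

My code is intended to convert ints into their corresponding subscript characters. How do I manipulate these characters as smoothly as possible without them being converted back? I suspect that character encoding plays a part, but I don't know how to incorporate Unicode encoding into my program.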

- Michael Luger

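Character/string encodings and formatting are not easy. Have fun. - Jesper Juhl

Placing non-ASCII characters inside double quotes in C++ source code is risky. You may be able to type them on your keyboard and put them between the quotes, but what you get at run time may not be what you wanted. - PaulMcKenzie

If you are using VStudio, Unicode is an option in the "Character Set" field of your project settings. Another possibility is that the terminal you're using doesn't support Unicode, which would prevent you from getting the results you want. This question is possibly a duplicate. - alteredinstance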

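Try using UTF16BE (big endian). It looks like the subscript characters are UTF-16, not 32. My reference. - alteredinstance

As it turns out, settings that involve reading the source file don't appear to help all that much. - Michael Luger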

2 Answers

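You should configure the console, and the program you use to open the file, so that each interprets your string in the encoding it was written in (e.g. utf32).

For example, on Windows you can set the console code page with the SetConsoleOutputCP function. To view a file in a different encoding, you can add the file to the VS solution, right-click / Open With / Source Code (Text) With Encoding, and choose your encoding.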
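One way to act on this suggestion is to emit UTF-8 bytes through narrow streams and ask the console to interpret them as UTF-8. The following is only a sketch under assumptions: the helper names `utf8_encode` and `use_utf8_console` are hypothetical and not part of the answer, the code does not reject invalid code points such as surrogates, and whether the console actually renders the glyphs still depends on the font and on the code page being available on the system.

```cpp
#include <string>
#ifdef _WIN32
#include <Windows.h>  // SetConsoleOutputCP, CP_UTF8
#endif

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
// No validation is done (surrogates and values above U+10FFFF are not rejected).
std::string utf8_encode(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: covers U+2080..U+209C
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: on Windows, ask the console to treat narrow output
// as UTF-8. On other platforms this does nothing.
void use_utf8_console() {
#ifdef _WIN32
    SetConsoleOutputCP(CP_UTF8);
#endif
}
```

With these helpers, writing `"x" + utf8_encode(0x2080) + utf8_encode(0x2081)` to a `std::ofstream` or to `std::cout` produces the UTF-8 bytes for x₀₁. A text editor that opens the file as UTF-8 should then show the subscripts.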

- idris

Just a small correction: it looks like the character is UTF-16. - alteredinstance

My registry unfortunately doesn't have a code page for UTF-8 or -16. - Michael Luger

- Ted Lyngmo · Accepted Answer

I find these things tricky, and I'm never sure whether it works on every Windows version and locale, but this does the trick for me:

#include <Windows.h>
#include <io.h>     // _setmode
#include <fcntl.h>  // _O_U16TEXT

#include <clocale>  // std::setlocale 
#include <iostream>

// Unicode UTF-16, little endian byte order (BMP of ISO 10646)
constexpr char CP_UTF_16LE[] = ".1200";

constexpr wchar_t superscript(int v) {
    constexpr wchar_t offset = 0x2070;       // superscript zero as offset
    if (v == 1) return 0x00B9;               // special case
    if (v == 2 || v == 3) return 0x00B0 + v; // special case 2
    return offset + v;
}

constexpr wchar_t subscript(int v) {
    constexpr wchar_t offset = 0x2080; // subscript zero as offset
    return offset + v;
}

int main() {
    // set these before doing any other output:
    std::setlocale(LC_ALL, CP_UTF_16LE);
    _setmode(_fileno(stdout), _O_U16TEXT);

    // subscript
    for (int i = 0; i < 10; ++i)
        std::wcout << L'X' << subscript(i) << L' ';
    std::wcout << L'\n';

    // superscript
    for (int i = 0; i < 10; ++i)
        std::wcout << L'X' << superscript(i) << L' ';
    std::wcout << L'\n';    
}

Output:

X₀ X₁ X₂ X₃ X₄ X₅ X₆ X₇ X₈ X₉
X⁰ X¹ X² X³ X⁴ X⁵ X⁶ X⁷ X⁸ X⁹

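A more convenient way might be to create wstrings directly. Here, wsup and wsub each take a wstring and return a converted wstring. Characters they can't handle are left unchanged.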

#include <Windows.h>
#include <io.h>      // _setmode
#include <fcntl.h>   // _O_U16TEXT

#include <algorithm> // std::transform
#include <clocale>   // std::setlocale
#include <iostream>
#include <string>    // std::wstring

// Unicode UTF-16, little endian byte order (BMP of ISO 10646)
constexpr char CP_UTF_16LE[] = ".1200";

std::wstring wsup(const std::wstring& in) {
    std::wstring rv = in;

    std::transform(rv.begin(), rv.end(), rv.begin(),
        [](wchar_t ch) -> wchar_t {
            // 1, 2 and 3 can be put in any order you like
            // as long as you keep them in the top section
            if (ch == L'1') return 0x00B9;
            if (ch == L'2') return 0x00B2;
            if (ch == L'3') return 0x00B3;

            // ...but this must be here in the middle:
            if (ch >= L'0' && ch <= L'9') return 0x2070 + (ch - L'0');

            // put the below in any order you like,
            // in the bottom section
            if (ch == L'i') return 0x2071;
            if (ch == L'+') return 0x207A;
            if (ch == L'-') return 0x207B;
            if (ch == L'=') return 0x207C;
            if (ch == L'(') return 0x207D;
            if (ch == L')') return 0x207E;
            if (ch == L'n') return 0x207F;

            return ch; // no change
        });
    return rv;
}

std::wstring wsub(const std::wstring& in) {
    std::wstring rv = in;

    std::transform(rv.begin(), rv.end(), rv.begin(),
        [](wchar_t ch) -> wchar_t {
            if (ch >= L'0' && ch <= L'9') return 0x2080 + (ch - L'0');
            if (ch == L'+') return 0x208A;
            if (ch == L'-') return 0x208B;
            if (ch == L'=') return 0x208C;
            if (ch == L'(') return 0x208D;
            if (ch == L')') return 0x208E;
            if (ch == L'a') return 0x2090;
            if (ch == L'e') return 0x2091;
            if (ch == L'o') return 0x2092;
            if (ch == L'x') return 0x2093;
            if (ch == 0x0259) return 0x2094; // small letter schwa: ə
            if (ch == L'h') return 0x2095;
            if (ch >= L'k' && ch <= L'n') return 0x2096 + (ch - L'k');
            if (ch == L'p') return 0x209A;
            if (ch == L's') return 0x209B;
            if (ch == L't') return 0x209C;

            return ch; // no change
        });
    return rv;
}

int main() {
    std::setlocale(LC_ALL, CP_UTF_16LE);
    if (_setmode(_fileno(stdout), _O_U16TEXT) == -1) return 1;

    auto pstr = wsup(L"0123456789 +-=() ni");
    auto bstr = wsub(L"0123456789 +-=() aeoxə hklmnpst");

    std::wcout << L"superscript:   " << pstr << L'\n';
    std::wcout << L"subscript:     " << bstr << L'\n';

    std::wcout << L"an expression: x" << wsup(L"(n-1)") << L'\n';
}

Output:

superscript:   ⁰¹²³⁴⁵⁶⁷⁸⁹ ⁺⁻⁼⁽⁾ ⁿⁱ
subscript:     ₀₁₂₃₄₅₆₇₈₉ ₊₋₌₍₎ ₐₑₒₓₔ ₕₖₗₘₙₚₛₜ
an expression: x⁽ⁿ⁻¹⁾

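My console didn't manage to display the subscript versions of hklmnpst - but apparently the transformation was correct, since it shows up fine here after copy/pasting.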
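Since the question asks about converting ints, and `subscript(int)` above only covers a single digit, a whole-number variant might look like the sketch below. The name `to_subscript` is hypothetical and not part of the answer. It relies on the same code points used in `wsub` (U+2080 to U+2089 for digits, U+208B for the minus sign).

```cpp
#include <string>

// Hypothetical helper: render an integer, including multi-digit and
// negative values, with Unicode subscript characters.
std::wstring to_subscript(long long value) {
    std::wstring text = std::to_wstring(value);  // e.g. L"-120"
    for (wchar_t& ch : text) {
        if (ch >= L'0' && ch <= L'9')
            ch = static_cast<wchar_t>(0x2080 + (ch - L'0'));  // subscript digit
        else if (ch == L'-')
            ch = static_cast<wchar_t>(0x208B);                // subscript minus
    }
    return text;
}
```

The result can be appended to a wstring member, for example `L"x" + to_subscript(12)`, and printed through `std::wcout` once the stream has been set up as in the programs above.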