C++: cannot print Unicode with "wcout" while keeping "cout" working

10

I can't print Unicode strings in multiple code pages while keeping "cout" working. Please help me get these 3 lines to work together.

std::wcout<<"abc "<<L'\u240d'<<" defg "<<L'א'<<" hijk"<<std::endl;
std::cout<<"hello world from cout! \n";
std::wcout<<"hello world from wcout! \n";

Output:

abc hello world from cout!

I tried:

#include <io.h> 
#include <fcntl.h>
_setmode(_fileno(stdout), _O_U8TEXT);

Problem:

"wcout" failed.

Tried:

std::locale mylocale("");
std::wcout.imbue(mylocale);

and:

SetConsoleOutputCP(1251);

and

setlocale(LC_ALL, "");

and

SetConsoleCP(CP_UTF8)

Nothing worked.


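@Deduplicator: I don't see how this question is specific to Windows. This is a fact of the C++ and C standards. - Lightness Races in Orbit
@LightnessRacesinOrbit: You're right, sorry. I was too quick on the trigger because some Windows-only functions are used. Windows also has extra trouble with Unicode. - Deduplicator
1
Windows-related: https://web.archive.org/web/20111005003105/http://blogs.msdn.com/b/michkap/archive/2010/10/07/10072032.aspx - Mgetz
@LightnessRacesinOrbit This is not a complete solution, but at least it keeps the cout/wcout cursor from freezing after printing Unicode: std::locale mylocale(""); std::wcout.imbue(mylocale); - user1438233
3 Answers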

11

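Windows needs a little non-standard setup with _setmode() before wcout and wcin can be used. This sample code is more verbose than it needs to be to illustrate the point, but it works on clang++, g++ and MSVC++.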

#include <iostream>
#include <locale>
#include <locale.h>
#include <stdlib.h>

#ifndef MS_STDLIB_BUGS // Allow overriding the autodetection.
/* The Microsoft C and C++ runtime libraries that ship with Visual Studio, as
 * of 2017, have a bug that neither stdio, iostreams or wide iostreams can
 * handle Unicode input or output.  Windows needs some non-standard magic to
 * work around that.  This includes programs compiled with MinGW and Clang
 * for the win32 and win64 targets.
 */
#  if ( _MSC_VER || __MINGW32__ || __MSVCRT__ )
    /* This code is being compiled either on MS Visual C++, or MinGW, or
     * clang++ in compatibility mode for either, or is being linked to the
     * msvcrt.dll runtime.
     */
#    define MS_STDLIB_BUGS 1
#  else
#    define MS_STDLIB_BUGS 0
#  endif
#endif

#if MS_STDLIB_BUGS
#  include <io.h>
#  include <fcntl.h>
#endif

#if !HAS_CPP17_FILESYSTEM && !HAS_TS_FILESYSTEM && __has_include(<filesystem>)
#  include <filesystem> /* MSVC has this header, but not the standard API. */
#  if __cpp_lib_filesystem >= 201703
#    define HAS_CPP17_FILESYSTEM 1
#  endif
#endif

#if !HAS_CPP17_FILESYSTEM && __has_include(<experimental/filesystem>)
#  include <experimental/filesystem>
/* Microsoft screws this one up, too, by not defining the feature-test
 * macro specified by the standard.
 */
#  if __cpp_lib_experimental_filesystem >= 201406 || MS_STDLIB_BUGS
#    define HAS_TS_FILESYSTEM 1
/* With g++6, this requires -lstdc++fs, AFTER this source file on the
 * command line.
 */
#  endif
#endif

#if HAS_CPP17_FILESYSTEM
  using std::filesystem::absolute;
  using std::filesystem::current_path;
  using std::filesystem::directory_entry;
  using std::filesystem::directory_iterator;
  using std::filesystem::is_directory;
  using std::filesystem::exists;
  using std::filesystem::path;
#elif HAS_TS_FILESYSTEM
  using std::experimental::filesystem::absolute;
  using std::experimental::filesystem::current_path;
  using std::experimental::filesystem::directory_entry;
  using std::experimental::filesystem::directory_iterator;
  using std::experimental::filesystem::is_directory;
  using std::experimental::filesystem::exists;
  using std::experimental::filesystem::path;
#else
#  error "This library has neither <filesystem> nor <experimental/filesystem>."
#endif

void init_locale(void)
// Does magic so that wcout can work.
{
#if MS_STDLIB_BUGS
  // Windows needs a little non-standard magic.
  constexpr char cp_utf16le[] = ".1200"; // UTF-16 little-endian locale.
  setlocale( LC_ALL, cp_utf16le );
  _setmode( _fileno(stdout), _O_WTEXT );
  /* Repeat for _fileno(stdin), if needed. */
#else
  // The correct locale name may vary by OS, e.g., "en_US.utf8".
  constexpr char locale_name[] = "";
  setlocale( LC_ALL, locale_name );
  std::locale::global(std::locale(locale_name));
  std::wcin.imbue(std::locale());
  std::wcout.imbue(std::locale());
#endif
}

using std::endl;

int main( const int argc, const char * const argv[] )
{
  init_locale();

  const path cwd = (argc > 1) ? absolute(path( argv[1], std::locale() ))
                              : absolute(current_path());

  if (exists(cwd)) {
    std::wcout << cwd.wstring() << endl;
  } else {
    std::wcerr << "Path does not exist.\n";
    return EXIT_FAILURE;
  }

  if (is_directory(cwd)) {
    for ( const directory_entry &f : directory_iterator(cwd) )
      std::wcout << f.path().filename().wstring() << endl;
  }

  return EXIT_SUCCESS;
}

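This is probably more complicated than it needs to be: as of 2018, std::filesystem was not yet supported everywhere, and <experimental/filesystem> is not going to be around forever.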
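For readers who only need the header selection, the detection logic can be reduced to a single namespace alias. This is a minimal sketch, not the answer's own code: it assumes a compiler in C++17 mode that supports __has_include, and the alias fs and the helper leaf_name are hypothetical names introduced here for illustration.

```cpp
#include <string>

// Pick whichever filesystem header this toolchain ships.
// Assumption: compiled as C++17 or later with __has_include support.
#if __has_include(<filesystem>)
#  include <filesystem>
namespace fs = std::filesystem;               // C++17 standard library
#elif __has_include(<experimental/filesystem>)
#  include <experimental/filesystem>
namespace fs = std::experimental::filesystem; // Filesystem TS fallback
#else
#  error "Neither <filesystem> nor <experimental/filesystem> is available."
#endif

// Hypothetical helper: the last component of a path as a wide string,
// ready to be written to std::wcout.
std::wstring leaf_name(const fs::path& p)
{
  return p.filename().wstring();
}
```

With the alias in place, the rest of a program can write fs::path or fs::directory_iterator without a separate using-declaration per name. Older g++ releases may still need -lstdc++fs at link time, as the comment in the code above notes.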

Here is a simplified version that contains only the boilerplate needed to enable wcout:

#include <iostream>
#include <locale>
#include <locale.h>

#ifndef MS_STDLIB_BUGS
#  if ( _MSC_VER || __MINGW32__ || __MSVCRT__ )
#    define MS_STDLIB_BUGS 1
#  else
#    define MS_STDLIB_BUGS 0
#  endif
#endif

#if MS_STDLIB_BUGS
#  include <io.h>
#  include <fcntl.h>
#endif

void init_locale(void)
{
#if MS_STDLIB_BUGS
  constexpr char cp_utf16le[] = ".1200";
  setlocale( LC_ALL, cp_utf16le );
  _setmode( _fileno(stdout), _O_WTEXT );
#else
  // The correct locale name may vary by OS, e.g., "en_US.utf8".
  constexpr char locale_name[] = "";
  setlocale( LC_ALL, locale_name );
  std::locale::global(std::locale(locale_name));
  std::wcin.imbue(std::locale());
  std::wcout.imbue(std::locale());
#endif
}

10

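C++ says:

[C++11: 27.4.1/3]: Mixing operations on corresponding wide- and narrow-character streams follows the same semantics as mixing such operations on FILEs, as specified in Amendment 1 of the ISO C standard.

And the referenced document says:

The definition of a stream was changed to include the concept of an orientation for both text and binary streams. After a stream is associated with a file, but before any operations are performed on the stream, the stream is without orientation. If a wide-character input or output function is applied to a stream without orientation, the stream becomes wide-oriented. Likewise, if a byte input or output operation is applied to a stream without orientation, the stream becomes byte-oriented. Thereafter, only the fwide() or freopen() functions can alter the orientation of a stream.

Byte input/output functions shall not be applied to a wide-oriented stream, and wide-character input/output functions shall not be applied to a byte-oriented stream.

By my interpretation this means, in short: do not mix std::cout and std::wcout.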
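The orientation rule can be observed directly with fwide(), which reports a stream's orientation when called with a mode of 0. The sketch below is an illustration only: FirstWrite and orientation_after are hypothetical names, it probes a scratch file from std::tmpfile() rather than stdout, and it assumes a C library that actually tracks orientation (glibc does; Microsoft's runtime documents fwide as unimplemented, so the results there will differ).

```cpp
#include <cstdio>
#include <cwchar>

// Which kind of write, if any, the probe performs first (hypothetical name).
enum class FirstWrite { None, Byte, Wide };

// Opens a scratch stream, optionally performs one write, and returns
// fwide(f, 0): > 0 for wide-oriented, < 0 for byte-oriented, 0 for a
// stream that has no orientation yet. Also returns 0 if no scratch file
// could be created.
int orientation_after(FirstWrite kind)
{
  std::FILE* f = std::tmpfile();
  if (!f) return 0;

  if (kind == FirstWrite::Byte) std::fputs("narrow", f);  // byte output function
  if (kind == FirstWrite::Wide) std::fputws(L"wide", f);  // wide output function

  const int result = std::fwide(f, 0);  // mode 0 only queries, never changes
  std::fclose(f);
  return result;
}
```

stdout behaves the same way on such a library: whichever of std::cout and std::wcout writes first fixes the orientation, and output through the other one is then not allowed by the rule quoted above.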


2
wcout fails because the Unicode characters cannot be represented in the code page.
std::wcout<<"abc "<<L'\u240d'<<" defg "<<L'א'<<" hijk"<<std::endl;
if(std::wcout.fail()){
    std::cout<<"\nConversion didn't succeed\n";
    std::wcout << "This statement has no effect on the console";
    std::wcout.clear();
    std::wcout<<"hello world from wcout! \n";
}
std::cout<<"hello world from cout! \n";
std::wcout<<"hello world from wcout again! \n";
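Since the failure comes from converting wide characters to the console code page, one way to sidestep it is to do the encoding by hand and send plain bytes through std::cout only, which also avoids mixing narrow and wide output on stdout. This is a sketch of an alternative technique, not something the answers here propose: to_utf8 is a hypothetical helper, it does not validate its input, and it assumes the terminal interprets output as UTF-8 (on a Windows console that would mean something like the SetConsoleOutputCP(CP_UTF8) call from the question).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode a wide string as UTF-8 bytes.
// No validation is done; unpaired surrogates are encoded as they are.
std::string to_utf8(const std::wstring& wide)
{
  std::string out;
  auto put = [&out](std::uint32_t byte) { out += static_cast<char>(byte); };

  for (std::size_t i = 0; i < wide.size(); ++i) {
    std::uint32_t cp = static_cast<std::uint32_t>(wide[i]);

    // Where wchar_t is 16 bits wide, a character outside the BMP arrives
    // as a surrogate pair; combine both halves into one code point.
    if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF && i + 1 < wide.size()) {
      const std::uint32_t low = static_cast<std::uint32_t>(wide[i + 1]);
      if (low >= 0xDC00 && low <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        ++i;
      }
    }

    if (cp < 0x80) {                       // 1 byte: plain ASCII
      put(cp);
    } else if (cp < 0x800) {               // 2 bytes
      put(0xC0 | (cp >> 6));
      put(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
      put(0xE0 | (cp >> 12));
      put(0x80 | ((cp >> 6) & 0x3F));
      put(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
      put(0xF0 | (cp >> 18));
      put(0x80 | ((cp >> 12) & 0x3F));
      put(0x80 | ((cp >> 6) & 0x3F));
      put(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

The first line from the question would then become std::cout << to_utf8(L"abc \u240d defg \u05d0 hijk") << std::endl; and the program would never touch std::wcout.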

Content provided by Stack Overflow.