How to translate a Python double-UTF-8 decoder into Lua

Question

I have this legacy code, which decodes double-encoded UTF-8 text back into normal UTF-8:

# Run with python3!
import codecs
import sys
s=codecs.open('doubleutf8.dat', 'r', 'utf-8').read()
sys.stdout.write(
                s
                .encode('raw_unicode_escape')
                .decode('utf-8')
        )

I need to translate it to Lua, imitating all possible decoding side effects (if any).
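To make "side effects" concrete, here is a minimal sketch of what the snippet does to in-memory bytes. The helper names `double_encode` and `fix_double_utf8` are hypothetical, and the sketch assumes the data was corrupted by misreading UTF-8 bytes as Latin-1 (other code pages, such as Windows-1252, would behave differently). It relies on the fact that `raw_unicode_escape` writes code points below U+0100 as single Latin-1 bytes and everything else as a `\uXXXX` escape.

```python
def double_encode(text: str) -> bytes:
    """Simulate the corruption: UTF-8 bytes misread as Latin-1, then re-encoded as UTF-8."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


def fix_double_utf8(data: bytes) -> str:
    """Same steps as the legacy snippet, applied to bytes instead of a file."""
    return data.decode("utf-8").encode("raw_unicode_escape").decode("utf-8")


original = "нижний новгород"
assert fix_double_utf8(double_encode(original)) == original

# Side effect: a code point above U+00FF is not a Latin-1 byte, so
# raw_unicode_escape emits a literal backslash escape instead of failing.
print("€".encode("raw_unicode_escape"))  # b'\\u20ac'
```

So any character in the input that was not double-encoded and lies above U+00FF comes out as literal `\uXXXX` text, which a faithful Lua port would have to reproduce.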

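Restrictions: I can use any available Lua module for handling UTF-8, but preferably a stable one with LuaRocks support. I will not use Lupa or any other Lua-Python bridge, and I will not call os.execute() to invoke Python.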

- Alexander Gladysh

1 Answer

- Michal Kottman · Accepted Answer

You can use lua-iconv, the Lua binding to the iconv library. With it you can convert between character encodings as you like.

It is also available in LuaRocks. Edit: using this answer, I have been able to decode the data correctly with the following Lua code:

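local iconv = require 'iconv'
-- iconv.new(to, from): convert from utf8 to latin1
local decoder = iconv.new('latin1', 'utf8')
-- read the whole file in binary mode so the bytes are left untouched
local file = assert(io.open('doubleutf8.dat', 'rb'))
local data = file:read('*a')
file:close()
-- decodedData is encoded in utf8
local decodedData, err = decoder:iconv(data)
-- err is an error code if the conversion failed, nil otherwise
assert(not err, 'iconv conversion failed')
-- if your terminal understands utf8, prints "нижний новгород"
-- if not, you can further convert it from utf8 to any encoding, like KOI8-R
print(decodedData)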
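For comparison with the question's Python, a UTF-8 to Latin-1 conversion corresponds roughly to a strict `encode('latin-1')`. The sketch below (the helper name `utf8_to_latin1` is hypothetical) shows that it repairs the same double-encoded sample, but rejects characters above U+00FF instead of writing `\uXXXX` escapes as `raw_unicode_escape` does. Whether a given iconv build also reports an error for such input, or substitutes something, is an assumption to verify; this only illustrates where the two approaches can diverge.

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Python analogue of a UTF-8 to Latin-1 conversion with strict error handling."""
    return data.decode("utf-8").encode("latin-1")


# Build a double-encoded sample: UTF-8 bytes misread as Latin-1, re-encoded as UTF-8.
double_encoded = "нижний новгород".encode("utf-8").decode("latin-1").encode("utf-8")

# The converted bytes are the original UTF-8 text.
print(utf8_to_latin1(double_encoded).decode("utf-8"))

# A character outside Latin-1 cannot be converted in strict mode.
try:
    utf8_to_latin1("€".encode("utf-8"))
except UnicodeEncodeError as exc:
    print("not representable in Latin-1:", exc.reason)
```

If the input is known to contain only cleanly double-encoded text, the two approaches give the same result; the difference matters only for stray characters that were never double-encoded.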