Java - Problems with "ü/ä/ö" after SCP

Question

3

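I wrote a program that can load local or remote log files. If I load a local file, there is no error. However, if I first copy the file to my local machine with SCP (using this code: http://www.jcraft.com/jsch/examples/ScpFrom.java.html) and then read it, the text is garbled and shows up as "ü/ä/ö". How can I fix this?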

Remote: Linux server. Local: Windows PC.

SCP code:

http://www.jcraft.com/jsch/examples/ScpFrom.java.html

Reading code:

protected void openTempRemoteFile() throws IOException {

        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream( lfile )));
        String strLine;

        DefaultTableModel dtm = new DefaultTableModel(0, 0);
        String header[] = new String[]{ "Timestamp", "Session-ID", "Log" };
        dtm.setColumnIdentifiers(header);
        table.setModel(dtm);

        while ((strLine = reader.readLine()) != null) {     

            String[] sparts = strLine.split(" ");
            String[] bparts = strLine.split("   : ");

            String Timestamp = sparts[0] + " " + sparts[1];
            String SessionID = sparts[4];
            String Log = bparts[1];

            dtm.addRow(new Object[] {Timestamp, SessionID, Log});
        }
        reader.close();
}

Edit:

Encoding of the local file: UTF-8

Encoding of the remote file transferred from the Linux server via SCP: WINDOWS-1252

- Drextor

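Which encoding do you use on the local and remote systems? - Kayaman

This is an encoding error. Which systems are involved? - Thorbjørn Ravn Andersen

Remote system: Ubuntu server. Local system: Windows. - Drextor

3 Answers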

3

To solve your problem, you have at least two options:

You can specify the file's encoding directly in your code by updating it as follows (replace "UTF8" with whichever encoding the file actually uses):

BufferedReader reader = new BufferedReader(
    new InputStreamReader(
        new FileInputStream( lfile ),
        "UTF8"
    )
);

Or set the default file encoding when starting the JVM:

java -Dfile.encoding=UTF-8 … com.example.Main

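I definitely prefer the first approach, and you can parameterize the "UTF8" value if needed. With the second approach, you can still run into the same problem whenever you forget to pass the flag.

You can replace the encoding with any one you like (see https://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html for the supported encodings). On Windows, "Cp1252" is usually the default encoding.

Keep in mind that you can always query the file.encoding property or call Charset.defaultCharset() to find your application's current default encoding. A reader created without an explicit charset reports the same default, e.g.: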
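To see why the wrong charset produces exactly this kind of garbage, here is a minimal sketch that decodes the UTF-8 bytes of the umlauts as windows-1252. The class and method names (`MojibakeDemo`, `misdecode`) are made up for illustration, and the sketch assumes the JDK ships the "windows-1252" charset, which standard JDK builds do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Hypothetical helper: encodes `text` as UTF-8, then decodes those bytes
    // with another charset, mimicking a reader that assumes the wrong encoding.
    static String misdecode(String text, Charset assumedCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, assumedCharset);
    }

    public static void main(String[] args) {
        // The umlauts "ü/ä/ö", written as escapes so the encoding of this
        // source file does not affect the demonstration.
        String umlauts = "\u00fc/\u00e4/\u00f6";

        // Each umlaut is two bytes in UTF-8; windows-1252 turns every byte
        // into its own character, so each umlaut becomes two characters.
        System.out.println(misdecode(umlauts, Charset.forName("windows-1252")));

        // Decoding with the charset that was used for encoding round-trips cleanly.
        System.out.println(misdecode(umlauts, StandardCharsets.UTF_8));
    }
}
```

Note that what the console shows also depends on the console's own encoding, so comparing strings in code is more reliable than eyeballing the output.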

byte[] byteArray = "blablabla".getBytes(); // any bytes will do; only the reader's charset matters here
InputStream inputStream = new ByteArrayInputStream(byteArray);
InputStreamReader reader = new InputStreamReader(inputStream);
String defaultEncoding = reader.getEncoding();
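
The two lookups mentioned above can also be done directly, without creating a reader. A small sketch (the class name `DefaultEncodingCheck` and the helper `currentDefault` are only illustrative):

```java
import java.nio.charset.Charset;

public class DefaultEncodingCheck {
    // Returns the charset used by readers and writers that are created
    // without an explicit charset argument.
    static Charset currentDefault() {
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println("Charset.defaultCharset(): " + currentDefault());
        // The property may use a different name for the same charset,
        // e.g. "Cp1252" where the Charset prints "windows-1252".
        System.out.println("file.encoding property:   " + System.getProperty("file.encoding"));
    }
}
```

Running this on both machines is a quick way to confirm whether the defaults differ.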

- David

2

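Dealing with encodings is tricky. If your system regularly consumes files like this from different environments, you should first detect the charset and then read the file using the detected charset. I had a similar problem; I used juniversalchardet to detect the charset and InputStreamReader(stream, Charset) to read. In your case it would look like this: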

protected void openTempRemoteFile() throws IOException {
        String encoding = UniversalDetector.detectCharset(lfile);
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream( lfile ), Charset.forName(encoding)));
        ....

If it is just a one-off job, open the file in a text editor (e.g. Notepad++), save it in the encoding you want, and then use it in your program.
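
If adding a detection library is not an option, a much cruder stdlib-only heuristic is sometimes enough: try a strict UTF-8 decode and fall back to a second charset when the bytes are not valid UTF-8. This is not what juniversalchardet does, only a sketch under the assumption that the files are either UTF-8 or one known legacy encoding (windows-1252 here). The names `Utf8OrFallback`, `decode`, and `readFile` are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8OrFallback {
    // Decodes strictly as UTF-8; if the bytes are not valid UTF-8,
    // decodes them with the given fallback charset instead.
    static String decode(byte[] bytes, Charset fallback) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return new String(bytes, fallback);
        }
    }

    // Reads the whole file into memory, so this suits modest file sizes only.
    static String readFile(Path file, Charset fallback) throws IOException {
        return decode(Files.readAllBytes(file), fallback);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        String text = "\u00fc/\u00e4/\u00f6"; // "ü/ä/ö" as escapes

        // Both encodings of the same text decode back to the original string.
        System.out.println(decode(text.getBytes(StandardCharsets.UTF_8), cp1252).equals(text));
        System.out.println(decode(text.getBytes(cp1252), cp1252).equals(text));
    }
}
```

The heuristic works because bytes of non-ASCII windows-1252 text rarely form valid UTF-8 sequences; it cannot tell two single-byte encodings apart, which is where a real detector earns its keep.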

- aios

"detectCharset" gives me an error: The method detectCharset(File) is undefined for the type UniversalDetector. I am using juniversalchardet downloaded from the Maven repository. - Drextor

Which version are you using? It is in com.github.albfernandez:juniversalchardet:2.0.0. - aios

Oh, I used the wrong version from Google (1.0.3). Thanks, this works in my program :) - Drextor

- Andrew Sklyarevsky · Accepted Answer

Provide the appropriate Charset in the InputStreamReader constructor, e.g.:

import java.nio.charset.StandardCharsets;

...

BufferedReader reader = new BufferedReader(
    new InputStreamReader(
        new FileInputStream( lfile ),
        StandardCharsets.UTF_8)); // try also ISO_8859_1 if UTF_8 doesn't help.
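
For completeness, here is a self-contained sketch of the same idea using try-with-resources and Files.newBufferedReader, which takes the charset as an argument. The class name `ExplicitCharsetRead`, the helper `readLines`, and the temp-file setup are assumptions for illustration; in the question's program the path would be the downloaded `lfile`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ExplicitCharsetRead {
    // Reads all lines of `file`, decoding with the given charset
    // rather than the platform default.
    static List<String> readLines(Path file, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Charset cp1252 = Charset.forName("windows-1252");
        // Stand-in for the file copied via SCP: a temp file written as windows-1252.
        Path tmp = Files.createTempFile("remote-log", ".txt");
        try {
            Files.write(tmp, "Gr\u00fc\u00dfe\n".getBytes(cp1252));
            System.out.println(readLines(tmp, cp1252));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

One difference from InputStreamReader worth knowing: a reader from Files.newBufferedReader throws an IOException on malformed input instead of silently substituting replacement characters, so a wrong charset guess tends to fail loudly.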