I'm having a problem downloading and parsing a UTF-8 web page. I use the function below to fetch the page's HTML:
```
static String getString(String url, ProgressDialog loading) {
    String s = "", html = "";
    HttpURLConnection conn = null;
    try {
        conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Content-Type", "text/plain; charset=utf-8");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        conn.connect();
        DataInputStream dis = new DataInputStream(conn.getInputStream());
        loading.setTitle("Descargando...");
        loading.setMax( 32000 );
        while ((s = dis.readLine()) != null) {
            html += s;
            loading.setProgress(html.length());
        }
    } catch (Exception e) {
        Log.e("CC", "Error al descargar: " + e.getMessage());
    } finally {
        if (conn != null)
            conn.disconnect();
    }
    return html;
}
```
The web page contains:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
But in the app, Spanish characters such as ¡ ¿ á é í ó ú are displayed incorrectly. I tried using readUTF(), but ran into length problems...
Any suggestions? Thanks!
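
One likely cause, sketched here under assumptions: `DataInputStream.readLine()` is deprecated because it does not properly convert bytes to characters, so multi-byte UTF-8 sequences such as á come out garbled. `readUTF()` does not help either, since it expects the length-prefixed modified UTF-8 format produced by `writeUTF()`, not a plain HTTP body. A minimal sketch of an alternative is to wrap the stream in an `InputStreamReader` with an explicit charset. The class and method names (`Utf8Reader`, `readUtf8`) are hypothetical, and the code assumes the server really sends UTF-8 and that `StandardCharsets` is available (Java 7+; on older platforms the string `"UTF-8"` can be passed instead).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class Utf8Reader {
    // Hypothetical helper: reads the whole stream, decoding its bytes as UTF-8.
    // Assumes the response body is actually UTF-8 encoded.
    static String readUtf8(InputStream in) throws IOException {
        StringBuilder html = new StringBuilder();
        // InputStreamReader performs the byte-to-char decoding that
        // DataInputStream.readLine() skips.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Unlike the loop in the question, this keeps line breaks.
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }
}
```

In `getString`, this would replace the `DataInputStream` loop, for example `html = Utf8Reader.readUtf8(conn.getInputStream());`. Using a `StringBuilder` also avoids the repeated string copying of `html += s`.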
Element#text(). However, whenever you are going to redisplay it in an HTML page, Element#html() should work perfectly. - BalusC