How to read the HTML content of a URL into a String in Java

Question

java html url file-io

10

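I have an HTML file stored on the server, at a URL like: <https://localhost:9443/genesis/Receipt/Receipt.html>

I want to read the contents of that HTML file from the URL, including the tags and source code.

How can I do that? This is server-side code, so there is no browser object to use, and I am not sure whether URLConnection is a good choice.

What is the best solution at this point?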

- Peyush Goel

String content = org.apache.commons.io.IOUtils.toString(new URL("https://localhost:9443/genesis/Receipt/Receipt.html"), "UTF-8");

- Klitos Kyriacou

5 Answers

3

Solved this with Spring by adding a bean to the Spring configuration file:

  <bean id = "receiptTemplate" class="org.springframework.core.io.ClassPathResource">
    <constructor-arg value="/WEB-INF/Receipt/Receipt.html"></constructor-arg>
  </bean>

Then read it in my method:

        // look up the resource bean defined in the Spring configuration
        ClassPathResource fileResource =
            (ClassPathResource) context.getApplicationContext().getBean("receiptTemplate");
        StringBuilder sb = new StringBuilder();

        // read the contents line by line; readLine() strips the line break,
        // so append one to keep the original line structure
        // (FileReader decodes with the platform default charset)
        try (BufferedReader br = new BufferedReader(new FileReader(fileResource.getFile()))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();

- Peyush Goel

1

import java.net.*;
import java.io.*;

//...

URL url = new URL("https://localhost:9443/genesis/Receipt/Receipt.html");
// openStream() is shorthand for openConnection().getInputStream(),
// so a separate openConnection() call is not needed
InputStream in = url.openStream();

- Chris Dargis

0

For example:

        URL url = new URL("https://localhost:9443/genesis/Receipt/Receipt.html");
        URLConnection con = url.openConnection();
        BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
        String l;
        while ((l=in.readLine())!=null) {
            System.out.println(l);
        }

You can use the input stream in other ways, not just print it.
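To end up with a String, as the question asks, the loop above can append each line to a StringBuilder instead of printing it. The following is a minimal sketch, assuming the page is encoded as UTF-8; the class name `UrlText` and the method `readAll` are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any library.
class UrlText {

    // Reads everything the URL returns and decodes it with the given charset.
    static String readAll(URL url, Charset charset) throws IOException {
        URLConnection con = url.openConnection();
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), charset))) {
            String line;
            while ((line = in.readLine()) != null) {
                // readLine() strips the line terminator, so put one back
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java UrlText <url>");
            return;
        }
        // UTF-8 is an assumption; use the charset the server actually sends
        System.out.print(readAll(new URL(args[0]), StandardCharsets.UTF_8));
    }
}
```

Passing the charset explicitly avoids depending on the platform default, which may differ between the development machine and the server.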

Of course, if you have the path of the local file, you can also do this:

  InputStream in = new FileInputStream(new File(yourPath));
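When the local path is known, the whole file can also be read into a String in one step with `java.nio.file.Files`. This is only a sketch, assuming Java 7 or later and UTF-8 content; `LocalHtml` and `read` are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of any library.
class LocalHtml {

    // Loads the entire file into memory and decodes it as UTF-8 (an assumption).
    static String read(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LocalHtml <path>");
            return;
        }
        System.out.print(read(Paths.get(args[0])));
    }
}
```

This keeps line breaks exactly as they are in the file, which suits a template such as a receipt page that is small enough to hold in memory.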

- Denys Séguret

1

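If you know where the local file is, you can also use FileInputStream (see the extended answer). - Denys Séguret

The file is on the web server, and the code I am working on will be deployed on the web server too. Authentication and connection management won't be a problem. The only problem is that I cannot read the content. I am trying the URLConnection approach, but I am not sure it is the best one. - Peyush Goel

If the file is local, you can use FileInputStream as long as you have the full path. - Chris Dargis

1

You need to know the root directory of your HTML files in order to use FileInputStream. - Denys Séguret

1

From the URL, you can read it with URLConnection. But it is better to work out where your file is and read it with FileInputStream. - Denys Séguret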

0

In my opinion, the simplest way is to use IOUtils:

import com.amazonaws.util.IOUtils;
...

String uri = "https://localhost:9443/genesis/Receipt/Receipt.html";
String fileContents = IOUtils.toString(new URL(uri).openStream());
System.out.println(fileContents);
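If adding a dependency just for this is undesirable, a similar one-liner is possible with the JDK alone. The sketch below assumes Java 9 or later (for `InputStream.readAllBytes`) and UTF-8 content; `UrlBytes` and `fetch` are hypothetical names, and this is a swapped-in JDK technique rather than what the IOUtils call does internally.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any library.
class UrlBytes {

    // Reads all bytes from the URL, closes the stream, and decodes as UTF-8.
    static String fetch(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java UrlBytes <url>");
            return;
        }
        System.out.println(fetch(new URL(args[0])));
    }
}
```

The try-with-resources block closes the stream even when reading fails, which the shorter snippet above leaves to the caller.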

- Haris Bouchlis

- Vaibs · Accepted Answer

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
 
public class URLContent {
    public static void main(String[] args) {
        try {
            // get URL content
            
            String a = "http://localhost:8080//TestWeb/index.jsp";
            URL url = new URL(a);
            URLConnection conn = url.openConnection();
 
            // open the stream and put it into BufferedReader
            BufferedReader br = new BufferedReader(
                               new InputStreamReader(conn.getInputStream()));
 
            String inputLine;
            while ((inputLine = br.readLine()) != null) {
                System.out.println(inputLine);
            }
            br.close();
 
            System.out.println("Done");

        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}