How do I get a website's HTML with Retrofit on Android?

9

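How can I get a website's HTML source using Retrofit?

For example, I have this link and need to get its HTML, and I would also like to know how to load more content.

Here is my code: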

MainActivity.java:

public class MainActivity extends AppCompatActivity {
    TextView txt;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        txt = (TextView) findViewById(R.id.txt);

        OkHttpClient okHttpClient = new OkHttpClient().newBuilder()
                .build();

        Retrofit retrofit = new Retrofit.Builder()
                .baseUrl("https://www.instagram.com/elde0596/")
                .addConverterFactory(GsonConverterFactory.create())
                .client(okHttpClient)
                .build();

        final Interface_Web request = retrofit.create(Interface_Web.class);
        Call<ResponseBody> call = request.getHtml();
        call.enqueue(new Callback<ResponseBody>() {
            @Override
            public void onResponse(Call<ResponseBody> call, Response<ResponseBody> response) {
                txt.setText(response.body().source().toString());
                Log.i("SDADASDAWEQ", "A " + response.body().toString());
            }

            @Override
            public void onFailure(Call<ResponseBody> call, Throwable t) {
                Log.i("SDADASDAWEQ", "B " + t.getMessage());
            }
        });

    }
}

Interface_Web.java :

public interface Interface_Web {
    @GET("/")
    Call<ResponseBody> getHtml();
}

It only shows me:

[size=9500 text=<!DOCTYPE html>\n<!--[if lt IE 7]>      <html lang="en" class="no…]

But I need to see all of the HTML.


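Please explain what you have tried and what you are missing, and post some code here. - Pratik Vyas
I wanted to get all of the HTML, but this approach needs a tag name in order to extract a fragment. Here is my code --> String html = response.body().string(); Document document = Jsoup.parse(html); Elements elements = document.select("name of the tag you want"); for (Element element : elements) { if (element.attr("name of the attribute you want to check").equals("attribute value")) { } } - Ashish Patel
Thanks, this helped me a lot. Could you tell me how to print the entire HTML? With your solution it prints only one line instead of the whole HTML. Thanks! - Gyan Swaroop Awasthi
2 Answers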

4

This solved my problem:

public class SecondClass extends AppCompatActivity {
    @Override
    protected void onCreate(@Nullable Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.class_second);
        Dispatcher dispatcher = new Dispatcher(Executors.newFixedThreadPool(20));
        dispatcher.setMaxRequests(20);
        dispatcher.setMaxRequestsPerHost(1);

        OkHttpClient okHttpClient = new OkHttpClient.Builder()
                .dispatcher(dispatcher)
                .connectionPool(new ConnectionPool(100, 30, TimeUnit.SECONDS))
                .build();

        Retrofit retrofit = new Retrofit.Builder()
                .baseUrl(HttpUrl.parse("https://www.x.x/x/"))
                .addConverterFactory(PageAdapter.FACTORY)
                .client(okHttpClient) // attach the client configured above; otherwise it is never used
                .build();

        PageService requestAddress = retrofit.create(PageService.class);
        Call<Page> pageCall = requestAddress.get(HttpUrl.parse("https://www.x.x/x/"));
        pageCall.enqueue(new Callback<Page>() {
            @Override
            public void onResponse(Call<Page> call, Response<Page> response) {
                Log.i("ADASDASDASD", response.body().content);
            }
            @Override
            public void onFailure(Call<Page> call, Throwable t) {
                // Log the failure instead of silently ignoring it
                Log.e("ADASDASDASD", "Request failed", t);
            }
        });
    }

    static class Page {
        String content;

        Page(String content) {
            this.content = content;
        }
    }

    static final class PageAdapter implements Converter<ResponseBody, SecondClass.Page> {
        static final Converter.Factory FACTORY = new Converter.Factory() {
            @Override
            public Converter<ResponseBody, ?> responseBodyConverter(Type type, Annotation[] annotations, Retrofit retrofit) {
                if (type == SecondClass.Page.class) return new SecondClass.PageAdapter();
                return null;
            }
        };

        @Override
        public SecondClass.Page convert(ResponseBody responseBody) throws IOException {
            Document document = Jsoup.parse(responseBody.string());
            Element value = document.select("script").get(1);
            String content = value.html();
            return new SecondClass.Page(content);
        }
    }

    interface PageService {
        @GET
        Call<SecondClass.Page> get(@Url HttpUrl url);
    }
}
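
A note on logging the result: Logcat truncates very long messages, so a single Log.i call with a whole page may show only part of it. Below is a minimal sketch of one way around that, splitting the text into pieces before logging. The class name LogChunker, the method split, and the 4000-character limit are assumptions made for illustration (the real per-entry limit is roughly 4 KB and varies by platform); none of them come from the answer above.

```java
import java.util.ArrayList;
import java.util.List;

final class LogChunker {
    // Assumed per-entry size; chosen to stay under Logcat's approximate 4 KB limit.
    static final int MAX_CHUNK = 4000;

    /** Splits text into consecutive pieces of at most chunkSize characters. */
    static List<String> split(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            int end = Math.min(text.length(), start + chunkSize);
            chunks.add(text.substring(start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Build a sample string longer than one chunk to stand in for a page.
        StringBuilder html = new StringBuilder("<!DOCTYPE html>\n<html><body>");
        for (int i = 0; i < 1000; i++) {
            html.append("<p>row ").append(i).append("</p>");
        }
        html.append("</body></html>");

        for (String chunk : split(html.toString(), MAX_CHUNK)) {
            // On Android this line would be something like Log.i(TAG, chunk).
            System.out.println(chunk);
        }
    }
}
```

In onResponse, the same loop could be run over response.body().content. The split is by character count only, so a chunk boundary can fall in the middle of a tag or a surrogate pair; that is fine for reading the log but the pieces should be rejoined before parsing.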

3

I think this will help you. If you declare your call as Call<ResponseBody>, you can get the HTML like this:

Call<ResponseBody> call = Interface_Web.getJSONSignIn(...);
call.enqueue(new Callback<ResponseBody>() {
    // Callback signatures as used by stable Retrofit 2 releases (and by the question's code)
    @Override
    public void onResponse(Call<ResponseBody> call, Response<ResponseBody> response) {
        // access response code with response.code()
        // access string of the response with response.body().string()
    }

    @Override
    public void onFailure(Call<ResponseBody> call, Throwable t) {
        t.printStackTrace();
    }
});
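
To illustrate why string() matters here: the "[size=9500 text=…]" output in the question looks like a summary of the underlying buffer, which is what calling toString() on a stream-like object tends to produce, whereas string() reads the whole body into memory and decodes it. The sketch below shows the same distinction using only java.io, with no Retrofit or OkHttp involved. BodyReader and readAll are hypothetical names, and the sample markup is made up for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BodyReader {
    /** Drains the stream completely and decodes the bytes with the given charset. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "<!DOCTYPE html>\n<html><body>Hello</body></html>"
                .getBytes(StandardCharsets.UTF_8);
        InputStream body = new ByteArrayInputStream(bytes);

        // toString() describes the stream object; it does not return the markup.
        System.out.println(body.toString());

        // Reading the stream to the end yields the full markup.
        System.out.println(readAll(body, StandardCharsets.UTF_8));
    }
}
```

With Retrofit the equivalent step is simply response.body().string(). Keep in mind that the body can be consumed only once, so store the result in a variable if it is needed for both the TextView and the log.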

Content provided by Stack Overflow.