Unable to fetch a PDF file as binary data

Question

java android http-headers httprequest httpurlconnection

7

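I am trying to fetch a PDF file from the following address:

URL : https://domain_name/xyz/_id/download/

The URL does not point directly at a PDF file; each unique file is identified by its own <_id> field, which the server has to resolve.

When I paste this link into the browser's address bar, the PDF downloads immediately, but when I fetch it through HttpsURLConnection, its Content-Type comes back as 'text/html' when it should be 'application/pdf'.

I also tried calling 'setRequestProperty' with 'application/pdf' before connecting, but the file is always downloaded as 'text/html'.

The request method I am using is 'GET'.

1) Do I need to use HttpClient instead of HttpsURLConnection?

2) Are links of this kind used to increase security?

3) Please point out my mistakes.

4) How can I find out the name of the file as it exists on the server?

Here is the main code I have implemented: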

    URL url = new URL(sb.toString());

    //created new connection
    HttpsURLConnection urlConnection = (HttpsURLConnection) url.openConnection();

    //set the request method; a GET has no request body, so setDoOutput(true) is not needed
    urlConnection.setRequestMethod("GET");
    //"Accept" asks for a PDF response; "Content-Type" would describe a request body
    urlConnection.setRequestProperty("Accept", "application/pdf");

    //and connecting!
    urlConnection.connect();

    //log the response headers once connected
    Log.e("Content Type--->", urlConnection.getContentType()+"   "+ urlConnection.getResponseCode()+"  "+ urlConnection.getResponseMessage()+"              "+urlConnection.getHeaderField("Content-Type"));

    //setting the path where we want to save the file
    //in this case, going to save it on the root directory of the
    //sd card.
    File SDCardRoot = Environment.getExternalStorageDirectory();

    //created a new file, specifying the path, and the filename

    File file = new File(SDCardRoot,"example.pdf");

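    //only write if external storage is mounted read-write
    if (!Environment.MEDIA_MOUNTED.equals(Environment.getExternalStorageState())) {
        throw new IOException("External storage is not writable");
    }

    //writing the downloaded data into the file we created
    FileOutputStream fileOutput = new FileOutputStream(file);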

    //this will be used in reading the data from the internet
    InputStream inputStream = urlConnection.getInputStream();

    //this is the total size of the file (-1 if the server does not report it)
    int totalSize = urlConnection.getContentLength();
    Log.e("Total File Size ---->", ""+totalSize);

    //variable to store total downloaded bytes
    int downloadedSize = 0;

    //create a buffer...
    byte[] buffer = new byte[1024];
    int bufferLength = 0; //used to store a temporary size of the buffer

    //Reading through the input buffer and write the contents to the file;
    //read() returns -1 at the end of the stream
    while ( (bufferLength = inputStream.read(buffer)) != -1 ) {

        //add the data in the buffer to the file in the file output stream (the file on the sd card)
        fileOutput.write(buffer, 0, bufferLength);

        //adding up the size
        downloadedSize += bufferLength;

        //reporting the progress:
        Log.e("This much downloaded---->",""+ downloadedSize);

    }
    //closed both streams
    fileOutput.close();
    inputStream.close();

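I have searched a lot but found nothing that works. If possible, please explain my mistakes in as much detail as you can, since this is the first time I am implementing something like this.

I also tried fetching direct PDF links, for example http://labs.google.com/papers/bigtable-osdi06.pdf. Those download without trouble, and their "Content-Type" is "application/pdf".

Thanks.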

- abhy

Have you checked the MIME type of the server's response? - Al Sutton

2 Answers

1

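Theory 1: The server is returning an incorrect content type in the response. If you wrote and deployed the server code, check that.

Theory 2: The URL returns an HTML page containing some JavaScript that redirects the page to the URL of the actual PDF file.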
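One way to tell these theories apart is to look at what actually arrived rather than trusting the Content-Type header alone. The sketch below is only an illustration: the class name `ResponseCheck` and both method names are made up for this example. It checks whether a body begins with the PDF signature `%PDF-`, and it pulls a file name out of a `Content-Disposition` header value if the server happens to send one (which would also address the question about the server-side file name). It handles only the plain `filename=` form, not the encoded `filename*=` variant.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting a download response.
final class ResponseCheck {
    // Every PDF file starts with these ASCII bytes.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Matches filename=report.pdf or filename="report.pdf" (plain form only).
    private static final Pattern FILENAME =
            Pattern.compile("filename\\s*=\\s*\"?([^\";]+)\"?", Pattern.CASE_INSENSITIVE);

    /** Returns true if the first bytes of the body are the PDF signature. */
    static boolean startsWithPdfMagic(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the file name from a Content-Disposition value, or null if none is present. */
    static String fileNameFrom(String contentDisposition) {
        if (contentDisposition == null) {
            return null;
        }
        Matcher m = FILENAME.matcher(contentDisposition);
        return m.find() ? m.group(1).trim() : null;
    }
}
```

With a connection in hand, the header value would come from `getHeaderField("Content-Disposition")` and the first few bytes from the response stream. A body that starts with `<html` or `<!DOCTYPE` instead of `%PDF-` suggests the server sent a page (a login form, an error, or a redirect script) rather than the file.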

- Nishan

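The URL I am trying to open does some inline PDF rendering, where it shows the PDF file embedded in a web page. Do you think that could be the problem? When I use Firefox, it renders the PDF inside the web page, but when I open the link in Chrome, it downloads the file. So is there any way to get the PDF directly as binary instead of receiving 'text/html', or does something need to change on the server side? I did not deploy the server code. - abhy

@al-sutton @nishan I checked with FireBug, and it shows up as an application/pdf object. So do I need to change something to access a PDF embedded in a web page? - abhy

Also, I am able to download exactly the file size of the PDF, but I receive it as 'text/html' rather than 'application/pdf', so it shows "Cannot open the text/html file type". - abhy

Hi, the problem was that I was not sending the cookie, so authentication failed and an HTML page was downloaded instead. I am marking Nishan's answer as accepted because it made me look at all the mistakes I might have made and finally find the one I did make. Thank you all for the replies. - abhy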

- Predders · Accepted Answer

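This thread led me to the solution to my problem! When you try to download a streamed PDF from a WebView using HttpURLConnection, you also need to pass along the cookies from inside the WebView.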

    String cookie = CookieManager.getInstance().getCookie(url.toString());
    if (cookie != null) connection.setRequestProperty("cookie", cookie);
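For context, here is a fuller sketch of how the cookie might fit into the whole download. It is plain Java, not the code from either poster: the class name `CookieDownload` and its methods are invented for illustration, and the cookie string is assumed to come from somewhere like `CookieManager.getInstance().getCookie(...)` on Android. It sends the `Cookie` header, refuses to save anything unless the response is HTTP 200 with an `application/pdf` content type, and copies the body until `read()` returns -1.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Hypothetical helper; names are illustrative only.
final class CookieDownload {

    /** Copies the stream until end-of-stream and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** GETs the URL with an optional Cookie header and saves the body only if it is a PDF. */
    static long download(URL url, String cookie, File target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        if (cookie != null) {
            // Without the session cookie the server may answer with an HTML login page.
            conn.setRequestProperty("Cookie", cookie);
        }
        try {
            int status = conn.getResponseCode();
            String type = conn.getContentType();
            boolean isPdf = type != null
                    && type.toLowerCase(Locale.ROOT).startsWith("application/pdf");
            if (status != HttpURLConnection.HTTP_OK || !isPdf) {
                throw new IOException(
                        "Expected a PDF but got HTTP " + status + " with Content-Type " + type);
            }
            try (InputStream in = conn.getInputStream();
                 OutputStream out = new FileOutputStream(target)) {
                return copy(in, out);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

Failing fast on an unexpected content type keeps an HTML error page from being saved under a `.pdf` name, which is the symptom described in the question.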