Pdf.js: rendering a PDF file from a base64 source instead of a URL

83
I'm trying to render a page from a PDF using pdf.js.
Normally, with a URL, I can do this:
PDFJS.getDocument("http://www.server.com/file.pdf").then(function getPdfHelloWorld(pdf) {
  //
  // Fetch the first page
  //
  pdf.getPage(1).then(function getPageHelloWorld(page) {
    var scale = 1.5;
    var viewport = page.getViewport(scale);

    //
    // Prepare canvas using PDF page dimensions
    //
    var canvas = document.getElementById('the-canvas');
    var context = canvas.getContext('2d');
    canvas.height = viewport.height;
    canvas.width = viewport.width;

    //
    // Render PDF page into canvas context
    //
    page.render({canvasContext: context, viewport: viewport});
  });
});

But in this case I have the file as base64 rather than as a URL:

data:application/pdf;base64,JVBERi0xLjUKJdDUxdgKNSAwIG9iaiA8PAovTGVuZ3RoIDE2NjUgICAgICAKL0ZpbHRlciAvRmxhdGVEZWNvZGUKPj4Kc3RyZWFtCnjarVhLc9s2...

How can this be done?

3 Answers

113

From the source code at http://mozilla.github.com/pdf.js/build/pdf.js:

/**
 * This is the main entry point for loading a PDF and interacting with it.
 * NOTE: If a URL is used to fetch the PDF data a standard XMLHttpRequest(XHR)
 * is used, which means it must follow the same origin rules that any XHR does
 * e.g. No cross domain requests without CORS.
 *
 * @param {string|TypedArray|object} source Can be a URL to where a PDF is
 * located, a typed array (Uint8Array) already populated with data, or
 * a parameter object with the following possible fields:
 *  - url   - The URL of the PDF.
 *  - data  - A typed array with PDF data.
 *  - httpHeaders - Basic authentication headers.
 *  - password - For decrypting password-protected PDFs.
 *
 * @return {Promise} A promise that is resolved with {PDFDocumentProxy} object.
 */

A standard XMLHttpRequest (XHR) is used to retrieve the document. The problem is that XMLHttpRequest does not support data: URIs (e.g. data:application/pdf;base64,JVBERi0xLjUK...).

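However, it is possible to pass a typed JavaScript array to the function instead. The only thing you need to do is convert the base64 string to a Uint8Array. You can use this function, found at https://gist.github.com/1032746: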

var BASE64_MARKER = ';base64,';

function convertDataURIToBinary(dataURI) {
  var base64Index = dataURI.indexOf(BASE64_MARKER) + BASE64_MARKER.length;
  var base64 = dataURI.substring(base64Index);
  var raw = window.atob(base64);
  var rawLength = raw.length;
  var array = new Uint8Array(new ArrayBuffer(rawLength));

  for(var i = 0; i < rawLength; i++) {
    array[i] = raw.charCodeAt(i);
  }
  return array;
}

tl;dr:

var pdfAsDataUri = "data:application/pdf;base64,JVBERi0xLjUK..."; // shortened
var pdfAsArray = convertDataURIToBinary(pdfAsDataUri);
PDFJS.getDocument(pdfAsArray)
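
To connect this back to the rendering code in the question, here is a minimal sketch. It assumes the same promise-style API the question uses (getDocument(...).then, getPage, getViewport(scale), render); newer pdf.js releases have changed some of these call shapes, so check the version you are on. The function name renderFirstPage and its parameters are made up for illustration and are not part of pdf.js.

```javascript
// Hypothetical helper, not part of pdf.js.
// pdfjs:    the PDFJS global used in the question
// pdfBytes: a Uint8Array, e.g. the result of convertDataURIToBinary(...)
// canvas:   the <canvas> element to draw into
// scale:    optional zoom factor; defaults to 1.5 as in the question
function renderFirstPage(pdfjs, pdfBytes, canvas, scale) {
  return pdfjs.getDocument(pdfBytes).then(function (pdf) {
    // Fetch the first page
    return pdf.getPage(1).then(function (page) {
      var viewport = page.getViewport(scale || 1.5);

      // Size the canvas to the PDF page dimensions
      canvas.height = viewport.height;
      canvas.width = viewport.width;

      // Render the PDF page into the canvas context
      return page.render({
        canvasContext: canvas.getContext('2d'),
        viewport: viewport
      });
    });
  });
}
```

In the browser this would be called as renderFirstPage(PDFJS, pdfAsArray, document.getElementById('the-canvas')).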

1
Is it possible to get the binary of a PDF with pdf.js and display it in the PDF viewer? - Dakait
1
Nicely done. But what if the source is a PDF retrieved through a RESTful call and converted to an arraybuffer or blob? I posted a question here: https://dev59.com/GGAf5IYBdhLWcg3wyVF1 - witttness
2
If you are in "strict mode", don't forget to put "var" before "i = 0"! I wasted an hour on that. :) - kube
I've solved this problem. My answer is here - toddmo
I like arrays :-) var base64 = dataURI.split(BASE64_MARKER)[1]; - ddlab

5
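According to the examples, base64 encoding is directly supported, although I haven't tested it myself. Take your base64 string (derived from a file or obtained any other way: POST/GET, WebSockets, etc.), convert it to binary with atob, then pass that binary to getDocument on the PDFJS API, e.g. PDFJS.getDocument({data: base64PdfData});. Codetoffel's answer also works fine for me.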
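
The steps described here could look roughly like the sketch below. It decodes with atob and then copies the result into a Uint8Array, since a typed array is the form the quoted doc comment lists for the data field. The helper name base64ToUint8Array is made up for illustration; it accepts either a bare base64 string or a full data URI.

```javascript
// Hypothetical helper, not part of pdf.js: decode a base64 string,
// with or without a "data:...;base64," prefix, into a Uint8Array.
function base64ToUint8Array(input) {
  var marker = ';base64,';
  var markerIndex = input.indexOf(marker);
  var base64 = markerIndex === -1
    ? input
    : input.substring(markerIndex + marker.length);

  // atob returns a "binary string": one character per decoded byte
  var raw = atob(base64);
  var bytes = new Uint8Array(raw.length);
  for (var i = 0; i < raw.length; i++) {
    bytes[i] = raw.charCodeAt(i);
  }
  return bytes;
}

// "JVBERi0xLjUK" is the (shortened) start of the data URI in the question.
var header = base64ToUint8Array('data:application/pdf;base64,JVBERi0xLjUK');
var headerText = String.fromCharCode.apply(null, header); // "%PDF-1.5\n"

// With the full string, the result would then be handed to pdf.js:
// PDFJS.getDocument({data: base64ToUint8Array(fullBase64String)});
```

Checking that the decoded bytes start with %PDF is a cheap sanity test before handing the data to pdf.js.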

1
I tested this with the Node.js package, using PDFJS.getDocument({data: Buffer.from(pdf_base64, 'base64')}) - efeder
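
For the Node.js route in this comment, a short sketch of the decoding step on its own. Buffer is a Node global and is not available in browsers. The variable names are illustrative, and the base64 string is the shortened start of the data URI from the question.

```javascript
// Node.js only: decode base64 without atob or a manual loop.
var pdfBase64 = 'JVBERi0xLjUK'; // shortened; a real PDF is much longer
var buffer = Buffer.from(pdfBase64, 'base64');

// Buffer is a subclass of Uint8Array, so it is already a typed array.
var isTypedArray = buffer instanceof Uint8Array;

// If a plain Uint8Array is preferred, this creates a view over the
// same bytes without copying them.
var bytes = new Uint8Array(buffer.buffer, buffer.byteOffset, buffer.byteLength);

// The comment above then passes the decoded data as the data field:
// PDFJS.getDocument({data: buffer});
```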

0

I used the accepted answer to check for IE and convert the data URI to a Uint8Array, a form that PDFJS accepts:

        // Only IE needs the conversion; other browsers keep the data URI as-is
        if (Ext.isIE) {
          pdfAsDataUri = me.convertDataURIToBinary(pdfAsDataUri);
        }

        convertDataURIToBinary: function(dataURI) {
          var BASE64_MARKER = ';base64,',
            base64Index = dataURI.indexOf(BASE64_MARKER) + BASE64_MARKER.length,
            base64 = dataURI.substring(base64Index),
            raw = window.atob(base64),
            rawLength = raw.length,
            array = new Uint8Array(new ArrayBuffer(rawLength));

          for (var i = 0; i < rawLength; i++) {
            array[i] = raw.charCodeAt(i);
          }
          return array;
        },

