Converting HTML with images to PDF using iText

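I have searched existing questions but have not found a solution to my specific problem. I need to convert an HTML file that contains images and CSS styles to PDF. I am using iText 5 and have managed to get the styles into the generated PDF, but I still cannot get the images in. My code is attached below. Images referenced by an absolute path appear in the generated PDF, while images referenced by a relative path do not. I know I need to implement AbstractImageProvider, but I don't know how. Any help is much appreciated.
Java file: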
public class Converter {

    static String in = "C:/Users/APPS/Desktop/Test_Html/index.htm";
    static String out = "C:/Users/APPS/Desktop/index.pdf";
    static String css = "C:/Users/APPS/Desktop/Test_Html/style.css";

    public static void main(String[] args) {
        try {
            convertHtmlToPdf();
        } catch (DocumentException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void convertHtmlToPdf() throws DocumentException, IOException {
        Document document = new Document();
        PdfWriter pdfWriter = PdfWriter.getInstance(document, new FileOutputStream(out));
        document.open();
        XMLWorkerHelper.getInstance().parseXHtml(pdfWriter, document, new FileInputStream(in), new FileInputStream(css));
        document.close();
        System.out.println("PDF Created!");
    }

    /**
     * Not sure how to implement this
     * @author APPS
     *
     */
    public class myImageProvider extends AbstractImageProvider {

        @Override
        public String getImageRootPath() {
            // TODO Auto-generated method stub
            return null;
        }

    }

}

HTML file:

<!DOCTYPE html>
<html lang="en">

<head>
    <title>HTML to PDF</title>
    <link href="style.css" rel="stylesheet" type="text/css" />
</head>

<body>
    <h1>HTML to PDF</h1>
    <p>
        <span class="itext">itext</span> 5.4.2
        <span class="description"> converting HTML to PDF</span>
    </p>
    <table>
        <tr>
            <th class="label">Title</th>
            <td>iText - Java HTML to PDF</td>
        </tr>
        <tr>
            <th>URL</th>
            <td>http://wwww.someurl.com</td>
        </tr>
    </table>
    <div class="center">
        <h2>Here is an image</h2>
        <div>
            <img src="images/Vader_TFU.jpg" />
        </div>
        <div>
            <img src="https://www.w3schools.com/images/picture.jpg" alt="Mountain" />
        </div>
    </div>
</body>
</html>

CSS file:

h1 {
    color: #ccc;
}

table tr td {
    text-align: center;
    border: 1px solid gray;
    padding: 4px;
}

table tr th {
    background-color: #84C7FD;
    color: #fff;
    width: 100px;
}

.itext {
    color: #84C7FD;
    font-weight: bold;
}

.description {
    color: gray;
}

.center {
    text-align: center;
}
1 Answer

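The following is based on iText 5, version 5.5.12.
Suppose you have the following directory structure:
(screenshot of the directory structure)
Use this code with the latest iText 5: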
package converthtmltopdf;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.net.FileRetrieve;
import com.itextpdf.tool.xml.net.FileRetrieveImpl;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.AbstractImageProvider;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;
import com.itextpdf.tool.xml.pipeline.html.LinkProvider;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

/**
 *
 * @author george.mavrommatis
 */
public class ConvertHtmlToPdf {
    public static final String HTML = "C:\\Users\\zzz\\Desktop\\itext\\index.html";
    public static final String DEST = "C:\\Users\\zzz\\Desktop\\itext\\index.pdf";
    public static final String IMG_PATH = "C:\\Users\\zzz\\Desktop\\itext\\";
    public static final String RELATIVE_PATH = "C:\\Users\\zzz\\Desktop\\itext\\";
    public static final String CSS_DIR = "C:\\Users\\zzz\\Desktop\\itext\\";

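    /**
     * Converts the HTML file at {@link #HTML} to a PDF, picking up the CSS
     * from {@link #CSS_DIR} and the images from {@link #IMG_PATH}.
     * @param file path of the PDF file to create
     * @throws IOException
     * @throws DocumentException
     */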
    public void createPdf(String file) throws IOException, DocumentException {
        // step 1
        Document document = new Document();
        // step 2
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
        // step 3
        document.open();
        // step 4

        // CSS
        CSSResolver cssResolver =
                XMLWorkerHelper.getInstance().getDefaultCssResolver(false);
        FileRetrieve retrieve = new FileRetrieveImpl(CSS_DIR);
        cssResolver.setFileRetrieve(retrieve);

        // HTML
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
        htmlContext.setImageProvider(new AbstractImageProvider() {
            // Root folder that relative <img src> values are resolved against
            @Override
            public String getImageRootPath() {
                return IMG_PATH;
            }
        });
        htmlContext.setLinkProvider(new LinkProvider() {
            public String getLinkRoot() {
                return RELATIVE_PATH;
            }
        });

        // Pipelines
        PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
        HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
        CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

        // XML Worker
        XMLWorker worker = new XMLWorker(css, true);
        XMLParser p = new XMLParser(worker);
        p.parse(new FileInputStream(HTML));

        // step 5
        document.close();
    }
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws IOException, DocumentException {
        new ConvertHtmlToPdf().createPdf(DEST);
    }

}

This is the final result:

(screenshot of the generated PDF)

This example uses code from: https://developers.itextpdf.com/examples/xml-worker-itext5/xml-worker-examples

Hope this helps.
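
A note on why the image root matters, offered as a sketch rather than a description of iText's internals: as far as I can tell, XML Worker loads an `http(s)` `src` as is and, for anything else, prepends the value returned by `getImageRootPath()` to the `src`. That would explain why the root in the code above ends with a path separator. The snippet below uses only the JDK to mimic that assumed rule, so you can check where a relative image is expected to live, and it derives the root from the HTML file's location instead of hardcoding it. The names `ImageRootSketch`, `imageRootFor`, `isRemote` and `resolve` are made up for illustration and are not part of iText.

```java
import java.io.File;

/** Illustration only: mimics how a relative img src is combined with an image root. */
public class ImageRootSketch {

    /** Returns the directory that contains htmlFile, with a trailing separator. */
    public static String imageRootFor(String htmlFile) {
        File parent = new File(htmlFile).getAbsoluteFile().getParentFile();
        if (parent == null) {
            return ""; // htmlFile is a filesystem root, nothing to prepend
        }
        String path = parent.getPath();
        return path.endsWith(File.separator) ? path : path + File.separator;
    }

    /** True for sources that are fetched over the network rather than from disk. */
    public static boolean isRemote(String src) {
        return src.startsWith("http://") || src.startsWith("https://");
    }

    /** Assumed rule: remote URLs are used as is, everything else is root + src. */
    public static String resolve(String root, String src) {
        return isRemote(src) ? src : root + src;
    }

    public static void main(String[] args) {
        // Path of the HTML file; defaults to index.html in the working directory
        String html = args.length > 0 ? args[0] : "index.html";
        String root = imageRootFor(html);

        // The two src values from the question's HTML
        String[] sources = {
            "images/Vader_TFU.jpg",
            "https://www.w3schools.com/images/picture.jpg"
        };
        for (String src : sources) {
            String resolved = resolve(root, src);
            String status;
            if (isRemote(src)) {
                status = "remote";
            } else {
                status = new File(resolved).isFile() ? "found" : "MISSING";
            }
            System.out.println(src + " -> " + resolved + " [" + status + "]");
        }
    }
}
```

If the relative image is reported as MISSING, it is worth fixing the folder layout before looking at the iText code. With a helper like this, `getImageRootPath()` could return `imageRootFor(HTML)` so the converter keeps working when the HTML folder is moved.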


@jdubicki, glad I could help. Don't forget that if an answer helped you, you can upvote and accept it. Thanks. - MaVRoSCy
I have run into another problem that I need help with. To be able to read and render nested unordered lists, I had to make some modifications. I noticed that my header tags are not being parsed in the PDF. Can someone help me correct this? Below are the changes I had to make. - jdubicki
PdfWriter.getInstance(document, new FileOutputStream(file));
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlPipelineContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
document.open();
for (Element e : elements) { document.add(e); }
document.add(Chunk.NEWLINE);
- jdubicki
I have posted a new question: https://stackoverflow.com/questions/46980365/itext-not-rendering-html-header-tags-properly-when-converting-to-pdf - jdubicki
