如何给PDF文件添加页码?

4
我整个早上都试图给一个pdf文档添加页码,但我实在想不出该怎么做。我希望能使用Python,并结合pyPdf或reportlab来实现这一目标。请问是否有任何想法?
2个回答

5

这是我用Python写的向PDF文件添加页码的代码。 我同时使用了pyPdf2和reportlab。

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

helpDoc = '''
Add Page Number to PDF file with Python 
Python 给 PDF 添加 页码
usage:
    python addPageNumberToPDF.py [PDF path] 
require:
    pip install reportlab pypdf2
    Support both Python2/3, But more recommend Python3

tips:
    * output file will save at pdfWithNumbers/[PDF path]_page.pdf 
    * only support A4 size PDF
    * tested on Python2/Python3@ubuntu
    * more large size of PDF require more RAM 
    * if segmentation fault, plaese try use Python 3
    * if generate PDF document is damaged, plaese try use Python 3

Author:
    Lei Yang (ylxx@live.com)

GitHub: 
    https://gist.github.com/DIYer22/b9ede6b5b96109788a47973649645c1f
'''
print(helpDoc)

import reportlab
from reportlab.lib.units import mm
from reportlab.pdfgen import canvas

from PyPDF2 import PdfFileWriter, PdfFileReader

def createPagePdf(num, tmp):
    c = canvas.Canvas(tmp)
    for i in range(1,num+1): 
        c.drawString((210//2)*mm, (4)*mm, str(i))
        c.showPage()
    c.save()
    return 
    with open(tmp, 'rb') as f:
        pdf = PdfFileReader(f)
        layer = pdf.getPage(0)
    return layer


if __name__ == "__main__":
    pass
    import sys,os
    path = 'MLDS17f.pdf'
#    path = '1.pdf'
    if len(sys.argv) == 1:
        if not os.path.isfile(path):
            sys.exit(1)
    else:
        path = sys.argv[1]
    base = os.path.basename(path)


    tmp = "__tmp.pdf"

    batch = 10
    batch = 0
    output = PdfFileWriter()
    with open(path, 'rb') as f:
        pdf = PdfFileReader(f,strict=False)
        n = pdf.getNumPages()
        if batch == 0:
            batch = -n
        createPagePdf(n,tmp)
        if not os.path.isdir('pdfWithNumbers/'):
            os.mkdir('pdfWithNumbers/')
        with open(tmp, 'rb') as ftmp:
            numberPdf = PdfFileReader(ftmp)
            for p in range(n):
                if not p%batch and p:
                    newpath = path.replace(base, 'pdfWithNumbers/'+ base[:-4] + '_page_%d'%(p//batch) + path[-4:])
                    with open(newpath, 'wb') as f:
                        output.write(f)
                    output = PdfFileWriter()
#                sys.stdout.write('\rpage: %d of %d'%(p, n))
                print('page: %d of %d'%(p, n))
                page = pdf.getPage(p)
                numberLayer = numberPdf.getPage(p)

                page.mergePage(numberLayer)
                output.addPage(page)
            if output.getNumPages():
                newpath = path.replace(base, 'pdfWithNumbers/' + base[:-4] + '_page_%d'%(p//batch + 1)  + path[-4:])
                with open(newpath, 'wb') as f:
                    output.write(f)

        os.remove(tmp)

请在我的GitHub gist中检查最新版本


1
你想做的与给PDF加水印非常相似,只是你要在每一页上放置不同的水印。 pdfrw(免责声明:我是作者。)将执行水印功能(并提供水印示例)。你可以使用reportlab来编程创建一个PDF,其中仅包含您想要的页码 - 目标文档中的每一页都有一个页面,然后使用pdfrw在原始文档的顶部覆盖该文档的每一页。当使用pdfrw时,您可能希望重用原始PDF概述以保留书签等内容。如果您查看pdfrw水印示例,它将向您展示如何做到这一点。
由于这些都是python,因此您可以从同一程序中使用它们,并(例如)使用pdfrw来确定需要从reportlab生成多少页。

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接