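There is a very "hacky" way to extract the data, but it only works with older versions of Ghostscript, such as 8.51 or 8.62. In those older versions, the PDF operators are defined in /lib/pdf_ops.ps. Newer versions handle this differently.
A tested 8.62 build is available here: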
http://sourceforge.net/projects/ghostscript/files/GPL%20Ghostscript/8.62/gs862w32.exe/download
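The text you need is the operand passed to /Tj {} def and /TJ {} def. To print it, add dup == at the beginning of each definition: dup copies the operand on the stack and == prints that copy to standard output, leaving the original for the operator to consume. (This could be made more sophisticated.) I also didn't worry about the font warning messages, but if the data is written to a file, they would be filtered out.
Some words are split into several pieces and single letters because kerning is being applied. With some time, this could be filtered as well.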
Modified /Tj from pdf_ops.ps:
/Tj { dup ==    % print the string operand before it is shown
  0 0 moveto Show settextposition
} bdef
Modified /TJ from pdf_ops.ps:
/TJ { dup ==    % print the whole array (strings and kerning numbers)
  0 0 moveto {
    dup type /stringtype eq {
      Show
    } { -1000 div
      currentfont /ScaleMatrix .knownget { 0 get mul } if
      0 Vexch rmoveto
    } ifelse
  } forall settextposition
} bdef
Output:
(Help a neighbor within your county each month by contributing to The Salvation )
(Army's Project SHARE and Georgia Power will match your gift. To help, simply check )
($1, $2, $5, or $10 on the return portion of this bill. Starting next month, your pledge )
(amount will be included on your monthly bill.)
(Our business offices will be closed on December 24 and 25 for Christmas and January )
(1 for New Year's Day. In case of an emergency, please call us at the number on your )
(bill 24 hours a day, 7 days a week.)
(PLEASE KEEP THIS PORTION FOR YOUR RECORDS.)
(PLEASE RETURN THIS PORTION WITH YOUR PAYMENT, MAKING SURE THE RETURN ADDRESS SHOWS IN THE ENVELOPE WINDOW.)
(Account Number)
(Mail To:)
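If you want to stitch the kerned fragments back together, one option is a small post-processing script run over the captured output. The following is a rough Python sketch and not part of the original hack. It assumes that == prints each TJ operand on one line as an array such as [(He) 20 (llo) -250 (w)], that parentheses and backslashes inside strings come out escaped, and that an adjustment of -200 or less marks a word gap. The function names and the threshold are made up for illustration and would need tuning against real output.

```python
import re

# A PostScript string literal (with backslash escapes) or a plain number.
TOKEN = re.compile(r"\((?:\\.|[^\\()])*\)|-?(?:\d+\.?\d*|\.\d+)")

_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f"}


def unescape(body):
    """Undo PostScript string escapes such as \\( \\) \\\\ and \\101."""
    def repl(match):
        esc = match.group(1)
        if esc[0] in "01234567":
            return chr(int(esc, 8))      # octal character code
        return _ESCAPES.get(esc, esc)    # \( \) \\ map to themselves
    return re.sub(r"\\([0-7]{1,3}|.)", repl, body)


def join_fragments(line, gap=-200.0):
    """Concatenate the strings on one printed Tj/TJ line.

    Numbers are TJ position adjustments in thousandths of a text-space
    unit. Negative values widen the gap, so a value at or below the
    (hypothetical) `gap` threshold is treated as a space.
    """
    parts = []
    for token in TOKEN.findall(line):
        if token.startswith("("):
            parts.append(unescape(token[1:-1]))
        elif float(token) <= gap:
            parts.append(" ")
    return "".join(parts)


def extract_text(lines):
    """Keep only lines that look like printed operands, skipping warnings."""
    return [join_fragments(l) for l in lines if l.lstrip().startswith(("(", "["))]


print(join_fragments("[(He) 20 (llo) -250 (w) 15 (orld)]"))
```

Passing the lines of the captured file to extract_text would drop anything that does not start with ( or [, which should also discard the font warnings, provided they do not begin with either character.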
Isn't PostScript fun?