How do I use a Pandoc filter in Hakyll?

Question

Tags: haskell, filter, pandoc, hakyll

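Sorry to ask a question like this, but I'm really new to Haskell. I searched the Internet for a whole day and didn't find a single example.

I have a pandoc filter written in Python (tikzcd.py), and I want to use it to process my blog posts.

I guess I need to use unixFilter or pandocCompilerWithTransform, but my knowledge of Haskell is really not enough to find a solution on my own.

So, could someone provide an example?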

-----------Update---------------

~~@Michael gave a solution using pandocCompilerWithTransformM and unixFilter. It works, but there is a problem.~~

~~When using a filter from the command line, I would do this:~~

pandoc -t json -READEROPTIONS input.markdown | ./filter.py | pandoc -f JSON -WRITEROPTIONS -o output.html

~~or equivalently~~
pandoc --filter ./filter.py -READEROPTIONS -WRITEROPTIONS -o html
~~This command is shorter but it doesn't show the procedures.~~

~~But with pandocCompilerWithTransformM, it does something like this:~~

pandoc -t html -READEROPTIONS -WRITEROPTIONS input.markdown | pandoc -t JSON | ./filter.py | pandoc -f JSON -WRITEROPTIONS -o output.html

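The difference is in the text passed to filter.py: in one case it is produced directly from the Markdown, and in the other it is produced from HTML that was itself generated from the Markdown. As you know, converting back and forth always causes unexpected problems. So I think there may be a better solution.

P.S. I have started learning Haskell. I hope I can solve this myself someday. Thank you!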

- Fang Hung-chien

1 Answer

- Michael · Accepted Answer

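In the end, I think you will use both. Taking https://github.com/listx/listx_blog/blob/master/blog.hs as a template, the following has the same shape as the transformer there. In that file the transformer is used for 'posts' on lines 69-80, that is, as the third argument to pandocCompilerWithTransformM, which is a (Pandoc -> Compiler Pandoc). Here you need to supply the absolute path to your Python filter (or just its name, if it is on $PATH) along with reader and writer options (e.g. defaultHakyllReaderOptions and defaultHakyllWriterOptions).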

import Text.Pandoc
import Hakyll

type Script = String 

transformer
  :: Script         -- e.g. "/absolute/path/filter.py"
  -> ReaderOptions  -- e.g.  defaultHakyllReaderOptions
  -> WriterOptions  -- e.g.  defaultHakyllWriterOptions
  -> (Pandoc -> Compiler Pandoc)
transformer script reader_opts writer_opts pandoc = 
    do let input_json = writeJSON writer_opts pandoc
       output_json <- unixFilter script [] input_json
       return $ 
          -- either (error . show) id $  -- uncomment on pandoc versions where readJSON returns an Either
          readJSON reader_opts output_json
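
For orientation, here is a minimal sketch of how a function of type (Pandoc -> Compiler Pandoc), such as the transformer above once it is fully applied, might be plugged into a Hakyll rule. The name postRules, the "posts/*" pattern, and the .html route are assumptions for illustration and are not taken from the linked blog.hs; adapt them to your own site.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
import Text.Pandoc (Pandoc)

-- Hypothetical rule for blog posts. The pattern "posts/*" and the
-- ".html" route are assumptions, not part of the original answer.
postRules :: (Pandoc -> Compiler Pandoc) -> Rules ()
postRules filterStep =
    match "posts/*" $ do
        route (setExtension "html")
        compile $
            -- The third argument is applied to the Pandoc document
            -- after reading and before writing.
            pandocCompilerWithTransformM
                defaultHakyllReaderOptions
                defaultHakyllWriterOptions
                filterStep
                >>= relativizeUrls
```

Inside your `hakyll $ do ...` block this could then be called as `postRules (transformer "/usr/local/bin/myfilter.py" defaultHakyllReaderOptions defaultHakyllWriterOptions)`.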

Similarly, you could use (transformer "/usr/local/bin/myfilter.py" defaultHakyllReaderOptions defaultHakyllWriterOptions) in place of (return . pandocTransform) on line 125 of this example snippet. For debugging, you can outsource everything to unixFilter:

transform :: Script -> String -> Compiler String
transform script md = do json0 <- unixFilter pandoc input_args md
                         json1 <- unixFilter script [] json0
                         unixFilter pandoc output_args json1
 where
   pandoc = "pandoc"
   input_args = words "-f markdown -t json" -- add others
   output_args = words "-f json -t html"    -- add others

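The three lines of the do block correspond to the stages of the Unix pipeline pandoc -t json | filter.py | pandoc -f json, to which you can add whatever other arguments you need.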
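
To make the correspondence with a pipeline explicit, the same three stages can be written with Kleisli composition (>=>). This is only a sketch: Runner, pipeline, traceRunner and dryRun are names made up for this illustration. It is generic over the monad, so a fake runner can show the stage order without invoking pandoc or the filter, while in Hakyll you would pass unixFilter as the runner.

```haskell
import Control.Monad ((>=>))
import Data.Functor.Identity (Identity (..))

-- Hypothetical alias for "run this command with these arguments on this input".
-- Hakyll's unixFilter has this shape with m = Compiler.
type Runner m = String -> [String] -> String -> m String

-- The three stages of the do block, chained left to right like a shell pipe.
pipeline :: Monad m => Runner m -> String -> String -> m String
pipeline run script =
        run "pandoc" (words "-f markdown -t json")
    >=> run script []
    >=> run "pandoc" (words "-f json -t html")

-- A fake runner for dry runs: it executes nothing and only records each stage.
traceRunner :: Runner Identity
traceRunner cmd args input = Identity (input ++ " | " ++ unwords (cmd : args))

-- Evaluates to:
-- "input.markdown | pandoc -f markdown -t json | ./filter.py | pandoc -f json -t html"
dryRun :: String
dryRun = runIdentity (pipeline traceRunner "./filter.py" "input.markdown")
```

In a Hakyll site, `pipeline unixFilter script` should behave like the transform function above, with the extra pandoc arguments added to the two words lists.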

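I think you may be right that there is an extra pandoc round trip here. The pandocCompilerWithTransform(M) functions are meant for a direct Pandoc -> Pandoc function, which is applied to the Pandoc that Hakyll comes up with. I think we should dispense with that and use the Pandoc library directly. A use of unixFilter might look something like this.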

transformXLVI :: Script -> ReaderOptions -> WriterOptions -> String  -> Compiler Html
transformXLVI script ropts wopts = fmap fromJSON . unixFilter script [] . toJSON 
  where 
    toJSON   = writeJSON wopts 
    --           . either (error . show) id -- for pandoc > 1.14
               . readMarkdown ropts 
    fromJSON = writeHtml wopts
    --           . either (error . show) id
               . readJSON ropts

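I hope the principles emerge from these variations! This should be pretty much the same as the preceding transform; we are using the pandoc library in place of calls to the pandoc executable.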