Concatenating string fragments in a Pandoc Lua filter

Question

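I'm trying to create a pandoc filter that will help me summarize data. I've seen some filters that create a table of contents, but I'd like to organize the index based on content found in the headers.

For example, in the sample below I'd like to provide a summary of content based on the dates tagged in the headers (some headers will not contain dates...)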

[nwatkins@sapporo foo]$ cat test.md
# 1 May 2018
some info

# not a date
some data

# 2 May 2018
some more info

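I started by trying to look at the contents of the headers. The intent was to apply simple regular expressions to match different date/time patterns.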

[nwatkins@sapporo foo]$ cat test.lua
function Header(el)
  return pandoc.walk_block(el, {
    Str = function(el)
      print(el.text)
    end })
end

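Unfortunately, this runs the print statement once for each whitespace-separated string, rather than giving me a concatenation that would let me analyze the entire header content.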

[nwatkins@sapporo foo]$ pandoc --lua-filter test.lua test.md
1
May
2018
not
...

Is there a canonical way to do this in a filter? I haven't seen any helper functions for this in the Lua filter documentation.

- Noah Watkins

You need to match on Header, not Str. See https://pandoc.org/lua-filters.html... for more information. - mb21

1 Answer

- tarleb · Accepted Answer

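Update: The development version now provides the new functions pandoc.utils.stringify and pandoc.utils.normalize_date. They will be part of the next pandoc release (probably 2.0.6). With these functions, you can test whether a header contains a date using the following code: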

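function Header (el)
  -- Flatten the header's inline elements into a single plain string.
  local content_str = pandoc.utils.stringify(el.content)
  -- normalize_date returns nil when the string cannot be parsed as a date.
  if pandoc.utils.normalize_date(content_str) ~= nil then
    print 'header contains a date'
  else
    print 'not a date'
  end
end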
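
To get from this check to the date-based summary the question asks about, one option is to collect the matching headers while the document is traversed and list them afterwards. The following is only a minimal sketch, assuming a pandoc version that ships pandoc.utils.stringify and pandoc.utils.normalize_date; the table name dated_headings and the choice to write the list to stderr are illustrative assumptions, not part of the original answer.

```lua
-- Sketch: remember every header whose text parses as a date,
-- then list the collected entries once the whole document was seen.
-- `dated_headings` is a hypothetical name introduced for this example.
local dated_headings = {}

function Header (el)
  local text = pandoc.utils.stringify(el.content)
  local date = pandoc.utils.normalize_date(text)
  if date ~= nil then
    dated_headings[#dated_headings + 1] = {date = date, title = text}
  end
  -- Returning nothing leaves the header unchanged.
end

function Pandoc (doc)
  -- Block-level functions such as Header run before Pandoc,
  -- so the table is already filled at this point.
  for _, h in ipairs(dated_headings) do
    io.stderr:write(h.date, '  ', h.title, '\n')
  end
end
```

Writing to stderr keeps the summary separate from the converted document on stdout; building a new block list and returning a modified document from Pandoc would be the alternative if the summary should end up in the output itself.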

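There is no helper function yet, but we plan to provide a pandoc.utils.tostring function in the near future.

In the meantime, the following snippet (taken from this discussion) should help you get what you need: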

--- convert a list of Inline elements to a string.
function inlines_tostring (inlines)
  local strs = {}
  for i = 1, #inlines do
    strs[i] = tostring(inlines[i])
  end
  return table.concat(strs)
end

-- Add a `__tostring` method to all Inline elements. Linebreaks
-- are converted to spaces.
for k, v in pairs(pandoc.Inline.constructor) do
  v.__tostring = function (inln)
    return ((inln.content and inlines_tostring(inln.content))
        or (inln.caption and inlines_tostring(inln.caption))
        or (inln.text and inln.text)
        or " ")
  end
end

function Header (el)
  header_text = inlines_tostring(el.content)
end
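
Once the header text is available as a single string, the simple pattern matching mentioned in the question can be applied to it. Below is a rough sketch using plain Lua string patterns (Lua has no full regular expressions); match_date is a hypothetical helper, and its pattern only covers the "1 May 2018" style from the sample file, so other date formats would need their own patterns.

```lua
-- Hypothetical helper: returns day, month name and year as strings
-- when `text` looks like "1 May 2018", and nil otherwise.
local function match_date (text)
  return text:match('^(%d%d?) (%a+) (%d%d%d%d)$')
end

-- Example calls with the header texts from the question's test.md.
local day, month, year = match_date('1 May 2018')
print(day, month, year)
print(match_date('not a date'))
```

Inside the filter, the same call could be made on header_text in the Header function above.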