Vim syntax: match only when between other matches

I am trying to create a syntax file for my log files. They have the following format:

[time] LEVEL filepath:line - message

My syntax file looks like this:

:syn region logTime start=+^\[+ end=+\] +me=e-1
:syn keyword logCritical CRITICAL skipwhite nextgroup=logFile
:syn keyword logError ERROR skipwhite nextgroup=logFile
:syn keyword logWarn WARN skipwhite nextgroup=logFile
:syn keyword logInfo INFO skipwhite nextgroup=logFile
:syn keyword logDebug DEBUG skipwhite nextgroup=logFile
:syn match logFile " \S\+:" contained nextgroup=logLineNumber
:syn match logLineNumber "\d\+" contained

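The problem is that if the message contains a string such as ERROR or DEBUG, it gets highlighted, and I don't want that. I want a keyword to be highlighted only when it comes immediately after the time and immediately before the file path.
How can I do this?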
1 Answer

Using a test file that looks like this:
[01:23:45] ERROR /foo/bar:42 - this is a log message
[01:23:45] ERROR /foo/bar:42 - this is a ERROR log message
[01:23:45] CRITICAL /foo/bar:42 - this is a log message
[01:23:45] CRITICAL /foo/bar:42 - this is a CRITICAL log message

This syntax file works for me and does not highlight those keywords in the message part.
" Match the beginning of a log entry. This match is a superset which
" contains other matches (those named in the "contains" parameter).
"
"     ^                   Beginning of line
"     \[                  Opening square bracket of timestamp
"         [^\[\]]\+       A class that matches anything that isn't '[' or ']'
"                             Inside a class, ^ means "not"
"                             So this matches 1 or more non-bracket characters
"                             (in other words, the timestamp itself)
"                             The \+ following the class means "1 or more of these"
"     \]                  Closing square bracket of timestamp
"     \s\+                Whitespace character (1 or more)
"     [A-Z]\+             Uppercase letter (1 or more)
"
" So, this matches the timestamp and the entry type (ERROR, CRITICAL...)
"
syn match logBeginning "^\[[^\[\]]\+\]\s\+[A-Z]\+" contains=logTime,logCritical,logError,logWarn,logInfo,logDebug

" A region that will match the timestamp. It starts with a bracket and
" ends with a bracket. "contained" means that it is expected to be contained
" inside another match (and above, logBeginning notes that it contains logTime).
" The "me" parameter e-1 means that the syntax match will be offset by 1 character
" at the end. This is usually done when the highlighting goes a character too far.
syn region logTime start=+^\[+ end=+\] +me=e-1 contained

" A list of keywords that define which types we expect (ERROR, WARN, etc.)
" These are all marked contained because they are a subset of the first
" match rule, logBeginning.
syn keyword logCritical CRITICAL contained
syn keyword logError ERROR contained
syn keyword logWarn WARN contained
syn keyword logInfo INFO contained
syn keyword logDebug DEBUG contained

" Now that we have taken care of the timestamp and log type we move on
" to the filename and the line number. This match will catch both of them.
"
" (space)      A literal space that precedes the file path
" \S\+         NOT whitespace (1 or more) - matches the filename
" :            Matches a literal colon character
" \d\+         Digit (1 or more) - matches the line number
syn match logFileAndNumber " \S\+:\d\+" contains=logFile,logLineNumber

" This will match only the log filename so we can highlight it differently
" than the line number.
syn match logFile " \S\+:" contained

" Match only the line number.
syn match logLineNumber "\d\+" contained
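
The groups above only classify text; colors show up once each group is linked to a highlight group. If your syntax file does not already do this, a minimal sketch could look like the following. The target groups are an assumption chosen for illustration and are not part of the original answer.

```vim
" Hypothetical highlight links: the right-hand groups are standard Vim
" highlight groups, but which one suits each log field is a matter of taste.
" 'def' keeps any link the user has already set for the same group.
hi def link logTime       Comment
hi def link logCritical   Error
hi def link logError      Error
hi def link logWarn       WarningMsg
hi def link logInfo       Identifier
hi def link logDebug      Debug
hi def link logFile       Directory
hi def link logLineNumber Number
```

The container groups logBeginning and logFileAndNumber need no link of their own, since they exist only to hold the contained groups. Assuming the file is saved as ~/.vim/syntax/log.vim (a hypothetical name), `:set syntax=log` applies it to the current buffer.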

(Screenshot of the Vim highlighting)

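You might wonder why I used contained matches rather than a set of independent matches. Some patterns, such as \d\+, are too generic: if they were allowed to match anywhere in the line, they would often match the wrong thing. With contained matches they can be grouped into larger patterns that are more likely to be correct. In an earlier revision of this syntax file some of the example lines were highlighted wrongly because, for example, "ERROR" was highlighted when it appeared later in the log entry text. In this definition those keywords match only when they sit next to the timestamp, which appears only at the start of the line. So containers are a way to match more precisely while still keeping the length and complexity of each regular expression under control.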
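
To see the effect of "contained" in isolation, here is a minimal sketch. The demo* group names are hypothetical and exist only for this illustration.

```vim
" Defined like this, without 'contained', every run of digits in the
" buffer would be highlighted:
"   syn match demoNumber "\d\+"

" Defined as 'contained', demoNumber is tried only inside a group that
" lists it in contains=, here a path:line pair such as /foo/bar:42.
syn match demoLocation "\S\+:\d\+" contains=demoNumber
syn match demoNumber   "\d\+"      contained

" Assumed link target, so the result is visible.
hi def link demoNumber Number
```

With these rules, the 42 in "/foo/bar:42" is highlighted, while a bare number elsewhere in the line is left alone. Any digits inside the demoLocation match are eligible, which is why the answer adds a separate logFile match to cover the path portion.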

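Update: based on the example lines you provided (shown below), I have improved the regular expression in the first match above, and in my testing it now works correctly.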

[2015-10-05 13:02:27,619] ERROR /home/admusr/autobot/WebManager/wm/operators.py:2371 - Failed to fix py rpc info: [Errno 2] No such file or directory: '/opt/.djangoserverinfo'
[2015-10-05 13:02:13,147] ERROR /home/admusr/autobot/WebManager/wm/operators.py:3223 - Failed to get field "{'_labkeys': ['NTP Server'], 'varname': 'NTP Server', 'displaygroup': 'Lab Info'}" value from lab info: [Errno 111] Connection refused
[2015-10-05 13:02:38,012] ERROR /home/admusr/autobot/WebManager/wm/operators.py:3838 - Failed to add py rpc info: [Errno 2] No such file or directory: '/opt/.djangoserverinfo'
[2015-10-05 12:39:22,835] DEBUG /home/admusr/autobot/WebManager/wm/operators.py:749 - no last results get: [Errno 2] No such file or directory: u'/home/admusr/autobot/admin/branches/Wireless_12.2.0_ewortzman/.lastresults'

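This works almost perfectly, but there are some lines where the file and line number are not highlighted. Here are some of those lines. The time and level are highlighted fine. - ewok
What I noticed is that these lines all have a number right next to the closing bracket. I don't see any line like that which is highlighted correctly. - ewok
I updated the regular expression in the first match. Can you try it? It works for me with the example lines you provided in the pastebin. - Dan Lowe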
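@ewok I updated my answer to include an explanation of the syntax matches. I don't have time to work on your new format right now, but hopefully the explanation helps you solve it. - Dan Lowe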
@ewok I updated the answer again and added some comments explaining why I used contained matches to group certain things together. - Dan Lowe