如何通过函数或自定义函数在Excel中获得正则表达式支持?

28

看起来 Excel 不支持正则表达式,除非通过 VBA。是否如此,如果是,有没有任何“开源”的自定义 VBA 函数支持正则表达式?在这种情况下,我要从字符串中提取复杂的模式,任何在函数本身内部公开支持正则表达式的自定义 VBA 函数的实现都将很有用。如果您知道类似 IS 函数的半相关函数,请随意评论,但我真的在寻找完全公开了正则表达式实现的函数。

另外,提醒一下,我正在 Windows 7 上使用 Office 2010;在一个似乎是一个很好的建议后,添加了这些信息,结果发现在 Office 2010 上不起作用。


另一个用户(@user500414)发布了http://ramblings.mcpher.com/Home/excelquirks/regular-expressions作为答案,但它被删除为评论,这是一个不错的实现,具有通用正则表达式VBA接口的完整代码。 - Orbling
还有一个用C++编写的Add-In,因此需要安装.xll文件,但是代码可用:http://xllregex.codeplex.com/ - Orbling
6个回答

40

Excel中没有内置的功能。VBScript有内置支持,并且可以从VBA调用。更多信息在此处提供。您可以在VBA中使用后期绑定调用对象。我最近编写了一些函数,这应该至少能让您入门。请注意,这些函数未经充分测试可能存在一些错误,但它们相当简单。

这应该至少能让您开始:

'---------------------------------------------------------------------------------------vv
' Procedure : RegEx
' Author    : Mike
' Date      : 9/1/2010
' Purpose   : Perform a regular expression search on a string and return the first match
'               or the null string if no matches are found.
' Usage     : If Len(RegEx("\d{1,2}[/-]\d{1,2}[/-]\d{2,4}", txt)) = 0 Then MsgBox "No date in " & txt
'           : TheDate = RegEx("\d{1,2}[/-]\d{1,2}[/-]\d{2,4}", txt)
'           : CUSIP = Regex("[A-Za-z0-9]{8}[0-9]",txt)
'---------------------------------------------------------------------------------------
'^^
Function RegEx(Pattern As String, TextToSearch As String) As String 'vv
    Dim RE As Object, REMatches As Object

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .MultiLine = False
        .Global = False
        .IgnoreCase = False
        .Pattern = Pattern
    End With

    Set REMatches = RE.Execute(TextToSearch)
    If REMatches.Count > 0 Then
        RegEx = REMatches(0)
    Else
        RegEx = vbNullString
    End If
End Function '^^

'---------------------------------------------------------------------------------------
' Procedure : RegExReplace
' Author    : Mike
' Date      : 11/4/2010
' Purpose   : Attempts to replace text in the TextToSearch with text and back references
'               from the ReplacePattern for any matches found using SearchPattern.
' Notes     - If no matches are found, TextToSearch is returned unaltered.  To get
'               specific info from a string, use RegExExtract instead.
' Usage     : ?RegExReplace("(.*)(\d{3})[\)\s.-](\d{3})[\s.-](\d{4})(.*)", "My phone # is 570.555.1234.", "$1($2)$3-$4$5")
'             My phone # is (570)555-1234.
'---------------------------------------------------------------------------------------
'
Function RegExReplace(SearchPattern As String, TextToSearch As String, ReplacePattern As String, _
                      Optional GlobalReplace As Boolean = True, _
                      Optional IgnoreCase As Boolean = False, _
                      Optional MultiLine As Boolean = False) As String
Dim RE As Object

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .MultiLine = MultiLine
        .Global = GlobalReplace
        .IgnoreCase = IgnoreCase
        .Pattern = SearchPattern
    End With

    RegExReplace = RE.Replace(TextToSearch, ReplacePattern)
End Function

'---------------------------------------------------------------------------------------
' Procedure : RegExExtract
' Author    : Mike
' Date      : 11/4/2010
' Purpose   : Extracts specific information from a string.  Returns empty string if not found.
' Usage     : ?RegExExtract("(.*)(\d{3})[\)\s.-](\d{3})[\s.-](\d{4})(.*)", "My phone # is 570.555.1234.", "$2$3$4")
'             5705551234
'             ?RegExExtract("(.*)(\d{3})[\)\s.-](\d{3})[\s.-](\d{4})(.*)", "My name is Mike.", "$2$3$4")
'
'             ?RegExReplace("(.*)(\d{3})[\)\s.-](\d{3})[\s.-](\d{4})(.*)", "My name is Mike.", "$2$3$4")
'             My name is Mike.
'---------------------------------------------------------------------------------------
'
Function RegExExtract(SearchPattern As String, TextToSearch As String, PatternToExtract As String, _
                      Optional GlobalReplace As Boolean = True, _
                      Optional IgnoreCase As Boolean = False, _
                      Optional MultiLine As Boolean = False) As String
Dim MatchFound As Boolean

    MatchFound = Len(RegEx(SearchPattern, TextToSearch)) > 0
    If MatchFound Then
        RegExExtract = RegExReplace(SearchPattern, TextToSearch, PatternToExtract, _
                                    GlobalReplace, IgnoreCase, MultiLine)
    Else
        RegExExtract = vbNullString
    End If
End Function

+1 很棒,谢谢 —— 在我尝试使用这段代码之前,有一个简短的问题(看起来很好,谢谢)—— 你在什么系统上使用了它?我用的是 Windows 7,Office 2010,64 位。再次感谢! - blunders
1
Windows 7 64位,Office XP。vbscript.regexp对象随Windows一起提供,而不是Office,因此它应该可以很好地为您工作。 - mwolfe02
这个正则表达式引擎似乎随Internet Explorer的某些版本一起发布。据我所知,我已经在我的排序插件LAselect中使用它,并且它似乎可以很好地用于对街道名称、人名、邮政编码等进行排序,而无需分割单元格。+1 - Avan
1
您还可以使用早期绑定调用它,只需添加对VBScript正则表达式库的引用即可。 - Zev Spitz
+1 是为了 CreateObject("vbscript.regexp"),这样就不需要手动添加支持了,通过 工具 -> 引用 -> Microsoft VBScript Regular Expressions 5.5 - Andrew

9

1
+1 谢谢,看起来只兼容 Excel 95 到 2007 版本——我正在使用 Win7 上的 2010 版本;本以为任何解决方案都可以在我的平台上运行,但看来不是这样。我会在我的问题中添加这些信息。 - blunders
2
只是一个更新,问题似乎与Office2010-Win7上的64位有关 http://www.mrexcel.com/forum/showthread.php?t=467397 目前真的不想仅因为这个问题而切换到32位Office 2010,但我认为我应该把这些信息添加到您的答案中,以防其他人正在研究它。 - blunders

8

在OpenOffice/LibreOffice Calc中,包含了在函数中使用正则表达式的功能。要启用此功能,请转到“工具”>“选项”>“Calc”>“计算”:Y = 在公式中启用正则表达式。


1
OpenOffice文档:Calc中的正则表达式 - piotr_cz

2

我尝试了几种解决方案,由于对VBA缺乏专业知识,大多数方案对我来说都太麻烦了。

我发现最容易的方法是使用SeoTools for Excel (http://nielsbosma.se/projects/seotools/)。

对我来说非常有效。


1
是的,这个工具真的很棒 ;) - Niels Bosma

0

--- 2014年2月 ---

为了提供另一种选择,Open OfficeLibre Office的Calc软件(即它们的电子表格软件名称)都允许在搜索功能中使用正则表达式。


不清楚你的回答为什么与主题相关或者为什么你的回答中有“FEB 2014”。请解释一下,谢谢。 - blunders
2
正如我的回答所述,这是一个更新的替代方案,而不是四年前所述和接受的功能。我加粗了月份和年份,以便那些浏览答案但不想搜索答案年龄的人能够快速找到。 - Robert Brisita

-1

我最近也有同样的问题。在尝试创造自己的工具并使它们正确运行后,我找到了一个非常易于使用的在线ADDIN。

这是创作者的摘录

在过去几个月的实习中,我一直在市场科学部门工作,而我的一部分工作就是将数据导入MS Access并生成报告。这包括从各种数据源获取潜在客户列表。这通常是一个相当简单的任务,涉及到一些基本的SQL查询。然而,有时我会收到一些地址等数据,这些数据不符合IT使用的任何标准格式。在最糟糕的情况下,数据是以pdf形式提供的,这意味着我只能将其导出到非定界文本文件中。我发现我真的需要一些通用的正则表达式函数来解析字段以导入MS Access。我在网上找到了一些.xla示例,但我真的想要一个更易于使用、更广泛和可移植的库。我还想包含一些基本模式,以便每次都不必重新发明轮子。

因此,我创建了一个名为Excel Add-In Regular Expressions.xla的简单Excel插件,它添加了几个自定义函数来实现标准的VBScript正则表达式。

这里是网站

我已经成功地使用它来使用正则表达式提取有用的文本。

以下是插件中的代码:

' Regular Expressions.xla
'
' ? 2010 Malcolm Poindexter
' This is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License
' as published by the Free Software Foundation, version 3.
' This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
' without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
' See the GNU General Public License for more details. http://www.gnu.org/licenses/gpl.html
'
' I would appreciate if you would notify me of any modifications or re-distributions of this code at contact@malcolmp.com
' and appropriately attribute me as the original author in the header.
' ------------------------------------------------------------------------------------------------------------------------
'
' This file provides an Excel Add-In containing regular expression processing functions and several pre-defined regular expressions.
' The regular expressions provided are not necessarially exhaustive, but are intended to cover the most common cases.
'
' Regular Expressions Syntax: http://msdn.microsoft.com/en-us/library/1400241x%28VS.85%29.aspx

' -----------------------------
' NAME: xREPLACE
' DESCRIPTION: Replace all portions of the search text matching the pattern with the replacement text.
' -----------------------------
Function xREPLACE(pattern As String, searchText As String, replacementText As String, Optional ignoreCase As Boolean = True) As String
    On Error Resume Next
    Dim RegEx As New RegExp
    RegEx.Global = True
    RegEx.MultiLine = True
    RegEx.pattern = pattern
    RegEx.ignoreCase = ignoreCase
    xREPLACE = RegEx.Replace(searchText, replacementText)
End Function

' -----------------------------
' NAME: xMATCHES
' DESCRIPTION: Find and return the number of matches to a pattern in the search text.
' -----------------------------
Function xMATCHES(pattern As String, searchText As String, Optional ignoreCase As Boolean = True) As String
    On Error Resume Next
    Dim RegEx As New RegExp
    RegEx.Global = True
    RegEx.MultiLine = True
    RegEx.pattern = pattern
    RegEx.ignoreCase = ignoreCase
    Dim matches As MatchCollection
    Set matches = RegEx.Execute(searchText)
    xMATCHES = matches.Count
End Function

' -----------------------------
' NAME: xMATCH
' DESCRIPTION: Find and return an instance of a match to the pattern in the search text. MatchIndex may be used in the case of multiple matches.
' -----------------------------
Function xMATCH(pattern As String, searchText As String, Optional matchIndex As Integer = 1, _
                Optional ignoreCase As Boolean = True) As String
    On Error Resume Next
    Dim RegEx As New RegExp
    RegEx.Global = True
    RegEx.MultiLine = True
    RegEx.pattern = pattern
    RegEx.ignoreCase = ignoreCase
    Dim matches As MatchCollection
    Set matches = RegEx.Execute(searchText)
    Dim i As Integer
    i = 1
    For Each Match In matches
        If i = matchIndex Then
            xMATCH = Match.Value
        End If
        i = i + 1
    Next
End Function

' -----------------------------
' NAME: xMATCHALL
' DESCRIPTION: Find and return a comma-separated list of all matches to the pattern in the search text.
' -----------------------------
Function xMATCHALL(pattern As String, searchText As String, Optional ignoreCase As Boolean = True) As String
    On Error Resume Next
    Dim RegEx As New RegExp
    RegEx.Global = True
    RegEx.MultiLine = True
    RegEx.pattern = pattern
    RegEx.ignoreCase = ignoreCase
    Dim matches As MatchCollection
    Set matches = RegEx.Execute(searchText)
    Dim i As Integer
    i = 1
    Dim returnMatches As String
    returnMatches = ""
    For Each Match In matches
        If i = 1 Then
            returnMatches = Match.Value
        Else
            returnMatches = returnMatches + "," + Match.Value
        End If
        i = i + 1
    Next
    xMATCHALL = returnMatches
End Function

' -----------------------------
' NAME: xGROUP
' DESCRIPTION: Find and return a group from within a matched pattern.
' -----------------------------
Function xGROUP(pattern As String, searchText As String, group As Integer, Optional matchIndex As Integer = 1, _
                Optional ignoreCase As Boolean = True) As String
    On Error Resume Next
    If group <> 0 Then
        group = group - 1
    End If
    Dim RegEx As New RegExp
    RegEx.Global = True
    RegEx.MultiLine = True
    RegEx.pattern = pattern
    RegEx.ignoreCase = ignoreCase
    Dim matches As MatchCollection
    Set matches = RegEx.Execute(searchText)
    Dim i As Integer
    i = 1
    For Each Match In matches
        If i = matchIndex Then
            xGROUP = Match.SubMatches(group)
        End If
        i = i + 1
    Next
End Function

' -----------------------------
' NAME: xSTARTSWITH
' DESCRIPTION: Returns true or false if the search text starts with the pattern.
' -----------------------------
Function xSTARTSWITH(pattern As String, searchText As String, Optional ignoreCase As Boolean = True) As String
    On Error Resume Next
    Dim RegEx As New RegExp
    RegEx.Global = True
    RegEx.MultiLine = True
    RegEx.pattern = "^" + pattern
    RegEx.ignoreCase = ignoreCase
    Dim matches As MatchCollection
    Set matches = RegEx.Execute(searchText)
    xSTARTSWITH = matches.Count > 0
End Function

' -----------------------------
' NAME: xENDSWITH
' DESCRIPTION: Returns true or false if the search text ends with the pattern.
' -----------------------------
Function xENDSWITH(pattern As String, searchText As String, Optional ignoreCase As Boolean = True) As String
    On Error Resume Next
    Dim RegEx As New RegExp
    RegEx.Global = True
    RegEx.MultiLine = True
    RegEx.pattern = pattern + "$"
    RegEx.ignoreCase = ignoreCase
    Dim matches As MatchCollection
    Set matches = RegEx.Execute(searchText)
    xENDSWITH = matches.Count > 0
End Function

' ************************************
' Regular Expression Definitions
' ************************************

' -----------------------------
' NAME: xxEMAIL
' DESCRIPTION: Pattern to match an email address.
' -----------------------------
Function xxEMAIL() As String
    xxEMAIL = "\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b"
End Function

' -----------------------------
' NAME: xxUSZIP
' DESCRIPTION: Pattern to match an US Zip code.
' -----------------------------
Function xxUSZIP() As String
    xxUSZIP = "\b(?!0{5})(\d{5})(?!-0{4})(-\d{4})?\b"
End Function

' -----------------------------
' NAME: xxPHONE
' DESCRIPTION: Pattern to match a phone number.
' -----------------------------
Function xxPHONE() As String
    xxPHONE = "\b[01]?[- .]?\(?[2-9]\d{2}\)?\s?[- .]?\s?\d{3}\s?[- .]?\s?\d{4}(\s*(x|(ext))[\.]?\s*\d{1,6})?\b"
End Function

' -----------------------------
' NAME: xxURL
' DESCRIPTION: Pattern to match a url.
' -----------------------------
Function xxURL() As String
    xxURL = "\b((ftp)|(https?))\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}\b"
End Function


' ************************************
'   Insert Function Dialog Category Setup
' ************************************
Sub AddCategoryDescription()
    Application.MacroOptions Macro:="xREPLACE", _
        Description:="Replace all portions of the search text matching the pattern with the replacement text.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xMATCHES", _
        Description:="Find and return the number of matches to a pattern in the search text.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xMATCH", _
        Description:="Find and return an instance of a match to the pattern in the search text. MatchIndex may be used in the case of multiple matches.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xMATCHALL", _
        Description:="Find and return a comma-separated list of all matches to the pattern in the search text.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xGROUP", _
        Description:="Find and return a group from within a matched pattern.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xSTARTSWITH", _
        Description:="Returns true or false if the search text starts with the pattern.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xENDSWITH", _
        Description:="Returns true or false if the search text ends with the pattern.", _
        Category:="Regular Expressions"

    '**** Regular Expressions ****

    Application.MacroOptions Macro:="xxEMAIL", _
        Description:="Pattern to match an email address.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xxUSZIP", _
        Description:="Pattern to match an US Zip code.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xxPHONE", _
        Description:="Pattern to match a phone number.", _
        Category:="Regular Expressions"

    Application.MacroOptions Macro:="xxURL", _
        Description:="Pattern to match a url.", _
        Category:="Regular Expressions"
End Sub

我在这里添加了代码,以使其看起来不像广告,这样您就可以看到它是一个有用且易于使用的库。 - Vijay

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接