Matching a Scala list against a regular expression

Question

6

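I have a list of strings and a regular expression pattern. I want to filter out the items that don't match the regex. I'm using the following code, but it doesn't seem to work: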

val matching = token.filter(x => regex.pattern.matcher(x).matches)

where token is the list of strings and regex is the pattern I want to match.

- princess of persia

5

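I think this question is perfectly appropriate for SO, and I'm surprised it was closed without any comment. All it needs is a sample string/regex to be a perfect, concise question. - Ken Williams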

4 Answers

3

Another option, for completeness:

val words = List("alpha", "bravo", "charlie", "alphie")
words.filter(_.matches("a.*"))

res0: List[java.lang.String] = List(alpha, alphie)

- 7zark7

This re-parses the regular expression for every element of the list. - Jean-Philippe Pellet
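One way to address the comment above, as a minimal sketch: compile the pattern once with `.r` and reuse its underlying `java.util.regex.Pattern` for every element. The names `startsWithA` and `matching` are illustrative and do not come from the original answer.

```scala
import scala.util.matching.Regex

val words = List("alpha", "bravo", "charlie", "alphie")

// Compile the pattern once, outside the filter.
val startsWithA: Regex = "a.*".r

// matcher(...).matches requires the whole string to match the pattern.
val matching = words.filter(w => startsWithA.pattern.matcher(w).matches)
```

With this input, `matching` should contain the same two words as the `String.matches` version, without recompiling the pattern for each element.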

1

Have you tried it like this:

val list = List("abc","efg","")
val p = java.util.regex.Pattern.compile(".*")

val matching = list filter { p.matcher(_).matches }

- korefn

1

When using Scala's regex engine, I ran into the issue that .matches tries to match the entire string rather than looking for a match in any possible substring.

In many regex engines, the following code would evaluate as a match:

"alphie".match(/a/)

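In Scala, .matches fails here: it tries to match the pattern "a" against the entire string "alphie". If the regex were /a.*/ it would succeed, because .* matches zero or more of any character.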

If adding repetition symbols to the regex isn't an option, the findAllIn method may be useful:

val words = List("alpha", "bravo", "charlie", "alphie")

val regex = "a.".r

// yields one (word, matched text) tuple for each match found in each word
for {
    word <- words
    matches <- regex.findAllIn(word)
} yield (word,matches)

Note: findAllIn may match a given string more than once if it contains multiple matches.
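The note above can be illustrated with a small sketch. The word list and the pattern "an" are made up for the example; `findFirstIn` is shown as one possible way to get at most one result per word.

```scala
val words = List("banana", "cherry", "mango")
val regex = "an".r

// findAllIn yields every match, so "banana" contributes two tuples.
val allPairs = for {
  word  <- words
  found <- regex.findAllIn(word)
} yield (word, found)

// findFirstIn returns an Option, so each word contributes at most one tuple.
val firstPairs = for {
  word  <- words
  found <- regex.findFirstIn(word)
} yield (word, found)
```

Here `allPairs` lists "banana" twice, while `firstPairs` lists each matching word once.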

- nimda

Yes, matches means matches! Not search or find. - Randall Schulz

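I think @nimda already realizes that, but it's surprising that something as conceptually simple as regex.matchesSomewhere(_) doesn't exist in the current API. @nimda, one option is to use the .unanchored method. - Ken Williams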
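A rough sketch of the "matches somewhere" idea from the comment above, shown two ways: `Matcher.find` on the underlying Java pattern, and `.unanchored` used as an extractor. The pattern "ph" and the names `viaFind`, `anywhere`, and `viaUnanchored` are assumptions made for illustration.

```scala
val words = List("alpha", "bravo", "charlie", "alphie")
val regex = "ph".r

// Option 1: Matcher.find succeeds if the pattern occurs anywhere in the string.
val viaFind = words.filter(w => regex.pattern.matcher(w).find())

// Option 2: an unanchored Regex matches substrings when used as an extractor.
val anywhere = regex.unanchored
val viaUnanchored = words.filter {
  case anywhere() => true
  case _          => false
}
```

Both filters should keep "alpha" and "alphie", since only those words contain "ph".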


- user500592 · Accepted Answer

Your code should work. Are you sure your regex is correct?

val regex = "a.c".r
val tokens = List("abc", "axc", "abd", "azc")
tokens filter (x => regex.pattern.matcher(x).matches)
//result: List[String] = List(abc, axc, azc)

Edit:

Given your regex, make sure the following examples behave as you expect:

val regex = """\b[b-df-hj-np-tv-z]*[aeiou]+[b-df-hj-np-tv-z]*\b""".r

regex.pattern.matcher("good").matches
//res3: Boolean = true

regex.pattern.matcher("no good deed").matches
//res4: Boolean = false

The matches method tries to match the entire string.
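Because matches tests the whole string, one possible approach is to split a sentence into tokens first and test each token on its own. This is only a sketch: the sample sentence and the whitespace split are assumptions, not part of the original question.

```scala
val regex = """\b[b-df-hj-np-tv-z]*[aeiou]+[b-df-hj-np-tv-z]*\b""".r

// Split the sentence into tokens first, then test each token as a whole.
val tokens = "no good banana deed".split("\\s+").toList

// "banana" is rejected: it has more than one vowel group, so the pattern
// cannot cover the whole token.
val matching = tokens.filter(t => regex.pattern.matcher(t).matches)
```

With this input, `matching` keeps "no", "good", and "deed" and drops "banana".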