JavaScript regex: loop through all matches

Question

40

I'm trying to do something similar to what Stack Overflow's rich text editor does. Given this text:

[Text Example][1]

[1][http://www.example.com]

I want to loop through each [string][int] that is found, which I do this way:

var Text = "[Text Example][1]\n[1][http://www.example.com]";
// Find resource links
var arrMatch = null;
var rePattern = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi"
);
while (arrMatch = rePattern.exec(Text)) {
  console.log("ok");
}

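This works well; it logs "ok" for every [string][int]. What I need to do, though, is for each match found, replace the initial match with components of the second match.

So in the loop, $2 would represent the int part that was originally matched, and I would run this regexp (pseudocode):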

while (arrMatch = rePattern.exec(Text)) {
    var FindIndex = $2; // This would be 1 in our example
    new RegExp("\\[" + FindIndex + "\\]\\[(.+?)\\]", "g")

    // Replace original match now with hyperlink
}

This would match

[1][http://www.example.com]

The end result for the first example would be:

<a href="http://www.example.com" rel="nofollow">Text Example</a>

Edit

I've gotten as far as this now:

var Text = "[Text Example][1]\n[1][http://www.example.com]";
// Find resource links
reg = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi");
var result;
while ((result = reg.exec(Text)) !== null) {
  var LinkText = result[1];
  var Match = result[0];
  Text = Text.replace(new RegExp(Match, "g"), '<a href="#">" + LinkText + "</a>');
}
console.log(Text);

- Tom Gullen

7 Answers

33

I managed to do it in the end with this:

var Text = "[Text Example][1]\n[1][http://www.example.com]";
// Find resource links
reg = new RegExp(
  "\\[(.+?)\\]\\[([0-9]+)\\]",
  "gi");
var result;
while (result = reg.exec(Text)) {
  var LinkText = result[1];
  var Match = result[0];
  var LinkID = result[2];
  var FoundURL = new RegExp("\\[" + LinkID + "\\]\\[(.+?)\\]", "g").exec(Text);
  Text = Text.replace(Match, '<a href="' + FoundURL[1] + '" rel="nofollow">' + LinkText + '</a>');
}
console.log(Text);

- Tom Gullen

6

Here we use the exec method, which, combined with a while loop, retrieves every match along with the position of each matched string.

    var input = "A 3 numbers in 333";
    var regExp = /\b(\d+)\b/g, match;
    while (match = regExp.exec(input))
      console.log("Found", match[1], "at", match.index);
    // → Found 3 at 2
    //   Found 333 at 15
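
On engines that support ES2020, String.prototype.matchAll is an alternative to the exec loop. The following is a minimal sketch reusing the input from this answer; the `found` array is an illustrative name and not part of the original answer.

```javascript
// Same input as the exec example; matchAll requires the g flag on the pattern.
const input = "A 3 numbers in 333";
const found = [];

for (const match of input.matchAll(/\b(\d+)\b/g)) {
  // Each item has the same shape as an exec result: groups plus .index.
  found.push({ value: match[1], index: match.index });
  console.log("Found", match[1], "at", match.index);
}
```

Because matchAll works on an internal copy of the pattern, the loop does not depend on or modify the regex's lastIndex.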

- Vasyl Gutnyk

This was really helpful. - PeterT

1

Use a backreference to restrict the match, so that the code matches when your text is:

[Text Example][1]\n[1][http://www.example.com]

and does not match when your text is:

[Text Example][1]\n[2][http://www.example.com]

var re = /\[(.+?)\]\[([0-9]+)\s*.*\s*\[(\2)\]\[(.+?)\]/gi;
var str = '[Text Example][1]\n[1][http://www.example.com]';
var subst = '<a href="$4">$1</a>';

var result = str.replace(re, subst);
console.log(result);

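\number is used inside the regex to refer to the text matched by a numbered group, and $number is used in the same way inside the replacement string passed to replace.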
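
To see the two notations side by side on a smaller input, here is a minimal sketch that collapses accidentally doubled words. The sample string and variable names are illustrative assumptions, not taken from the answer above.

```javascript
// \1 inside the pattern must equal whatever group 1 captured;
// $1 inside the replacement string inserts that captured text.
const doubled = "this is is a test test";
const deduped = doubled.replace(/\b(\w+) \1\b/g, "$1");

console.log(deduped); // → this is a test
```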

- Ruslan López

0

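Another way to iterate over all matches without relying on the details of exec and match is to use the string replace function, with the regex as the first argument and a function as the second. Used this way, the function receives the whole match as its first argument, the group matches as the following arguments, and then the index of the match and the full input string:

var text = "[Text Example][1]\n[1][http://www.example.com]";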
// Find resource links
var arrMatch = null;
var rePattern = new RegExp("\\[(.+?)\\]\\[([0-9]+)\\]", "gi");
text.replace(rePattern, function(match, g1, g2, index){
    // Do whatever
})

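You can even iterate over all the groups of each match using the arguments object, which is available inside any non-arrow function (it is not a global variable). Skip the first entry (the whole match) and the trailing entries (the index and the input string).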
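
In modern code, rest parameters can play the role of arguments. The sketch below assumes a pattern without named groups, so the last two callback arguments are the offset and the input string; `sample`, `pattern`, and `seen` are illustrative names.

```javascript
const sample = "[Text Example][1]\n[1][http://www.example.com]";
const pattern = /\[(.+?)\]\[([0-9]+)\]/g;
const seen = [];

sample.replace(pattern, (match, ...rest) => {
  // Without named groups, rest = [group1, ..., groupN, offset, inputString].
  const groups = rest.slice(0, -2);
  const offset = rest[rest.length - 2];
  seen.push({ match, groups, offset });
  return match; // return the match unchanged; we only want to iterate
});

console.log(seen);
```

If the pattern uses named groups, the callback receives one extra trailing argument (the groups object), so the slice would need to drop three entries instead of two.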

- Mario Vázquez

0

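This format is based on Markdown. There are several JavaScript ports available. If you don't want the whole syntax, I recommend borrowing just the portions related to links.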

- Jason McCreary

1

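Thanks. I'd still like to learn how to do this, though; I've been Googling how to loop over each match and run another match inside it. - Tom Gullen

Fair enough. Looks like the other answers provided code. - Jason McCreary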

-3

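I know this post is a bit old, but since I stumbled upon it, I'd like to clear things up.

First, your way of thinking about this problem is too complicated. When something that should be simple becomes overly complicated, it is time to stop and think about what went wrong. Second, your solution is very inefficient: you first find what needs to be replaced, and then search the same text again for the referenced link information. The computational complexity therefore ends up being O(n^2).

It is disappointing to see so many upvotes on something wrong, because most people who come here learn from the accepted solution, assume it is a legitimate answer, and use the concept in their own projects, which results in a very poor implementation.

The solution to this problem is quite simple. All you need to do is find all the referenced links in the text, save them in a dictionary, and then use that dictionary to look up the placeholders to replace. That's it. It's that simple! And in this case you only need O(n) complexity.

So, the solution goes like this: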

const text = `
 [2][https://en.wikipedia.org/wiki/Scientific_journal][5][https://en.wikipedia.org/wiki/Herpetology]

The Wells and Wellington affair was a dispute about the publication of three papers in the Australian Journal of [Herpetology][5] in 1983 and 1985. The publication was established in 1981 as a [peer-reviewed][1] [scientific journal][2] focusing on the study of [3][https://en.wikipedia.org/wiki/Amphibian][amphibians][3] and [reptiles][4] ([herpetology][5]). Its first two issues were published under the editorship of Richard W. Wells, a first-year biology student at Australia's University of New England. Wells then ceased communicating with the journal's editorial board for two years before suddenly publishing three papers without peer review in the journal in 1983 and 1985. Coauthored by himself and high school teacher Cliff Ross Wellington, the papers reorganized the taxonomy of all of Australia's and New Zealand's [amphibians][3] and [reptiles][4] and proposed over 700 changes to the binomial nomenclature of the region's herpetofauna.
[1][https://en.wikipedia.org/wiki/Academic_peer_review]    
[4][https://en.wikipedia.org/wiki/Reptile]          
`;

const linkRefs = {};
const linkRefPattern = /\[(?<id>\d+)\]\[(?<link>[^\]]+)\]/g;
const linkPlaceholderPattern = /\[(?<text>[^\]]+)\]\[(?<refid>\d+)\]/g;

const parsedText = text
    .replace(linkRefPattern, (...[,,,,,ref]) => (linkRefs[ref.id] = ref.link, ''))
    .replace(linkPlaceholderPattern, (...[,,,,,placeholder]) => `<a href="${linkRefs[placeholder.refid]}">${placeholder.text}</a>`)
    .trim();

console.log(parsedText);

- Slavik Meltser

- s4y · Accepted Answer

I agree with Jason that it would be faster/safer to use an existing Markdown library, but you're looking for String.prototype.replace (also, use RegExp literals!):

var Text = "[Text Example][1]\n[1][http://www.example.com]";
var rePattern = /\[(.+?)\]\[([0-9]+)\]/gi;

console.log(Text.replace(rePattern, function(match, text, urlId) {
  // return an appropriately-formatted link
  return `<a href="${urlId}">${text}</a>`;
}));
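
The callback above puts the reference id, not the URL, into href. One way to resolve the id to its URL is sketched below. It is not the answerer's code, and `renderLinks` is a hypothetical helper. It assumes every [id][url] definition sits on its own line, and it does no HTML escaping.

```javascript
const source = "[Text Example][1]\n[1][http://www.example.com]";

// Hypothetical helper: collect the definitions first, then substitute references.
function renderLinks(input) {
  const urls = new Map();

  // Pass 1: record each "[id][url]" definition line and remove it from the text.
  const withoutDefs = input.replace(
    /^\[(\d+)\]\[([^\]]+)\]\s*$/gm,
    (match, id, url) => {
      urls.set(id, url);
      return "";
    }
  );

  // Pass 2: turn "[text][id]" into a link; unknown ids are left untouched.
  return withoutDefs
    .replace(/\[(.+?)\]\[(\d+)\]/g, (match, text, urlId) =>
      urls.has(urlId) ? `<a href="${urls.get(urlId)}">${text}</a>` : match
    )
    .trim();
}

const rendered = renderLinks(source);
console.log(rendered); // → <a href="http://www.example.com">Text Example</a>
```

Collecting the definitions before substituting means each reference is resolved with a single Map lookup, rather than a fresh regex search over the whole text.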