What is the best regular expression to check if a string is a valid URL?

Question


Tags: regex, url, language-agnostic

1038

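How can I check whether a given string is a valid URL?

My knowledge of regular expressions is basic, and it doesn't let me choose among the hundreds of regular expressions I've already seen on the web.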

- vitorsilva

45

Any URL, or only HTTP? For example, does mailto:me@example.com count as a URL? What about something like an AIM chat link? - Mecki

6

If a URL doesn't start with "http" (or similar), how can you tell it apart from any other arbitrary string that contains dots? For example "MyClass.MyProperty.MyMethod" or "I sometimes miss the spacebar. Is this a problem?" - Tomalak

15

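Microsoft has a regular expressions page that includes an expression for URLs. It's a good starting point: http://msdn.microsoft.com/en-us/library/ff650303.aspx. Note: that page has been retired, but the expressions in its table are still largely valid for reference. The recommended URL expression (which has also worked very well for me) is: "^(ht|f)tp(s?)://0-9a-zA-Z(:(0-9))(/?)([a-zA-Z0-9-.?,'/\+&%$#_])?$" - CMH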

65 Answers


- kash · Answer 1

Here is a ready-made Java version taken from the Android source code. It's the best one I've found.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static final Matcher WEB  = Pattern.compile(new StringBuilder()                 
.append("((?:(http|https|Http|Https|rtsp|Rtsp):")                      
.append("\\/\\/(?:(?:[a-zA-Z0-9\\$\\-\\_\\.\\+\\!\\*\\'\\(\\)")                         
.append("\\,\\;\\?\\&\\=]|(?:\\%[a-fA-F0-9]{2})){1,64}(?:\\:(?:[a-zA-Z0-9\\$\\-\\_")                         
.append("\\.\\+\\!\\*\\'\\(\\)\\,\\;\\?\\&\\=]|(?:\\%[a-fA-F0-9]{2})){1,25})?\\@)?)?")                         
.append("((?:(?:[a-zA-Z0-9][a-zA-Z0-9\\-]{0,64}\\.)+")   // named host                            
.append("(?:")   // plus top level domain                         
.append("(?:aero|arpa|asia|a[cdefgilmnoqrstuwxz])")                         
.append("|(?:biz|b[abdefghijmnorstvwyz])")                         
.append("|(?:cat|com|coop|c[acdfghiklmnoruvxyz])")                         
.append("|d[ejkmoz]")                         
.append("|(?:edu|e[cegrstu])")                         
.append("|f[ijkmor]")                         
.append("|(?:gov|g[abdefghilmnpqrstuwy])")                         
.append("|h[kmnrtu]")                         
.append("|(?:info|int|i[delmnoqrst])")                         
.append("|(?:jobs|j[emop])")                         
.append("|k[eghimnrwyz]")                         
.append("|l[abcikrstuvy]")                         
.append("|(?:mil|mobi|museum|m[acdghklmnopqrstuvwxyz])")                         
.append("|(?:name|net|n[acefgilopruz])")                         
.append("|(?:org|om)")                         
.append("|(?:pro|p[aefghklmnrstwy])")                         
.append("|qa")                         
.append("|r[eouw]")                         
.append("|s[abcdeghijklmnortuvyz]")                         
.append("|(?:tel|travel|t[cdfghjklmnoprtvwz])")                         
.append("|u[agkmsyz]")                         
.append("|v[aceginu]")                         
.append("|w[fs]")                         
.append("|y[etu]")                         
.append("|z[amw]))")                         
.append("|(?:(?:25[0-5]|2[0-4]") // or ip address                                                 
.append("[0-9]|[0-1][0-9]{2}|[1-9][0-9]|[1-9])\\.(?:25[0-5]|2[0-4][0-9]")                             
.append("|[0-1][0-9]{2}|[1-9][0-9]|[1-9]|0)\\.(?:25[0-5]|2[0-4][0-9]|[0-1]")                         
.append("[0-9]{2}|[1-9][0-9]|[1-9]|0)\\.(?:25[0-5]|2[0-4][0-9]|[0-1][0-9]{2}")                         
.append("|[1-9][0-9]|[0-9])))")                         
.append("(?:\\:\\d{1,5})?)") // plus option port number                             
.append("(\\/(?:(?:[a-zA-Z0-9\\;\\/\\?\\:\\@\\&\\=\\#\\~")  // plus option query params                         
.append("\\-\\.\\+\\!\\*\\'\\(\\)\\,\\_])|(?:\\%[a-fA-F0-9]{2}))*)?")                         
.append("(?:\\b|$)").toString()                 
).matcher("");

- ridgerunner · Answer 2

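I have been writing an in-depth article on URI validation using regular expressions. It is based on RFC 3986.

Regular Expression URI Validation

Although the article is not yet finished, I have come up with a PHP function that does a pretty good job of validating HTTP and FTP URLs. Here is the current version: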

// function url_valid($url) { Rev:20110423_2000
//
// Return associative array of valid URI components, or FALSE if $url is not
// RFC-3986 compliant. If the passed URL begins with: "www." or "ftp.", then
// "http://" or "ftp://" is prepended and the corrected full-url is stored in
// the return array with a key name "url". This value should be used by the caller.
//
// Return value: FALSE if $url is not valid, otherwise array of URI components:
// e.g.
// Given: "http://www.jmrware.com:80/articles?height=10&width=75#fragone"
// Array(
//    [scheme] => http
//    [authority] => www.jmrware.com:80
//    [userinfo] =>
//    [host] => www.jmrware.com
//    [IP_literal] =>
//    [IPV6address] =>
//    [ls32] =>
//    [IPvFuture] =>
//    [IPv4address] =>
//    [regname] => www.jmrware.com
//    [port] => 80
//    [path_abempty] => /articles
//    [query] => height=10&width=75
//    [fragment] => fragone
//    [url] => http://www.jmrware.com:80/articles?height=10&width=75#fragone
// )
function url_valid($url) {
    if (strpos($url, 'www.') === 0) $url = 'http://'. $url;
    if (strpos($url, 'ftp.') === 0) $url = 'ftp://'. $url;
    if (!preg_match('/# Valid absolute URI having a non-empty, valid DNS host.
        ^
        (?P<scheme>[A-Za-z][A-Za-z0-9+\-.]*):\/\/
        (?P<authority>
          (?:(?P<userinfo>(?:[A-Za-z0-9\-._~!$&\'()*+,;=:]|%[0-9A-Fa-f]{2})*)@)?
          (?P<host>
            (?P<IP_literal>
              \[
              (?:
                (?P<IPV6address>
                  (?:                                                (?:[0-9A-Fa-f]{1,4}:){6}
                  |                                                ::(?:[0-9A-Fa-f]{1,4}:){5}
                  | (?:                          [0-9A-Fa-f]{1,4})?::(?:[0-9A-Fa-f]{1,4}:){4}
                  | (?:(?:[0-9A-Fa-f]{1,4}:){0,1}[0-9A-Fa-f]{1,4})?::(?:[0-9A-Fa-f]{1,4}:){3}
                  | (?:(?:[0-9A-Fa-f]{1,4}:){0,2}[0-9A-Fa-f]{1,4})?::(?:[0-9A-Fa-f]{1,4}:){2}
                  | (?:(?:[0-9A-Fa-f]{1,4}:){0,3}[0-9A-Fa-f]{1,4})?::   [0-9A-Fa-f]{1,4}:
                  | (?:(?:[0-9A-Fa-f]{1,4}:){0,4}[0-9A-Fa-f]{1,4})?::
                  )
                  (?P<ls32>[0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}
                  | (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
                       (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
                  )
                |   (?:(?:[0-9A-Fa-f]{1,4}:){0,5}[0-9A-Fa-f]{1,4})?::   [0-9A-Fa-f]{1,4}
                |   (?:(?:[0-9A-Fa-f]{1,4}:){0,6}[0-9A-Fa-f]{1,4})?::
                )
              | (?P<IPvFuture>[Vv][0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&\'()*+,;=:]+)
              )
              \]
            )
          | (?P<IPv4address>(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
                               (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))
          | (?P<regname>(?:[A-Za-z0-9\-._~!$&\'()*+,;=]|%[0-9A-Fa-f]{2})+)
          )
          (?::(?P<port>[0-9]*))?
        )
        (?P<path_abempty>(?:\/(?:[A-Za-z0-9\-._~!$&\'()*+,;=:@]|%[0-9A-Fa-f]{2})*)*)
        (?:\?(?P<query>       (?:[A-Za-z0-9\-._~!$&\'()*+,;=:@\\/?]|%[0-9A-Fa-f]{2})*))?
        (?:\#(?P<fragment>    (?:[A-Za-z0-9\-._~!$&\'()*+,;=:@\\/?]|%[0-9A-Fa-f]{2})*))?
        $
        /mx', $url, $m)) return FALSE;
    switch ($m['scheme']) {
    case 'https':
    case 'http':
        if ($m['userinfo']) return FALSE; // HTTP scheme does not allow userinfo.
        break;
    case 'ftps':
    case 'ftp':
        break;
    default:
        return FALSE;   // Unrecognized URI scheme. Default to FALSE.
    }
    // Validate host name conforms to DNS "dot-separated-parts".
    if ($m['regname']) { // If host regname specified, check for DNS conformance.
        if (!preg_match('/# HTTP DNS host name.
            ^                      # Anchor to beginning of string.
            (?!.{256})             # Overall host length is less than 256 chars.
            (?:                    # Group dot separated host part alternatives.
              [A-Za-z0-9]\.        # Either a single alphanum followed by dot
            |                      # or... part has more than one char (63 chars max).
              [A-Za-z0-9]          # Part first char is alphanum (no dash).
              [A-Za-z0-9\-]{0,61}  # Internal chars are alphanum plus dash.
              [A-Za-z0-9]          # Part last char is alphanum (no dash).
              \.                   # Each part followed by literal dot.
            )*                     # Zero or more parts before top level domain.
            (?:                    # Explicitly specify top level domains.
              com|edu|gov|int|mil|net|org|biz|
              info|name|pro|aero|coop|museum|
              asia|cat|jobs|mobi|tel|travel|
              [A-Za-z]{2})         # Country codes are exactly two alpha chars.
              \.?                  # Top level domain can end in a dot.
            $                      # Anchor to end of string.
            /ix', $m['host'])) return FALSE;
    }
    $m['url'] = $url;
    for ($i = 0; isset($m[$i]); ++$i) unset($m[$i]);
    return $m; // return TRUE == array of useful named $matches plus the valid $url.
}

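This function uses two regexes: one to match a subset of valid generic URIs (absolute ones with a non-empty host), and a second to validate the DNS "dot-separated-parts" host name. Although the function currently validates only the HTTP and FTP schemes, it is structured so that it can be easily extended to handle other schemes.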
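
The two-stage structure described above (a generic URI check, then a DNS host-name check) can be sketched in Python. This is not a port of the PHP function: it swaps the first regex for the standard library's `urllib.parse.urlsplit`, which is far more lenient than an RFC 3986 pattern, and the names `check_url` and `ALLOWED_SCHEMES`, along with the generic 2-to-63-letter TLD rule, are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# Stage 2 pattern: dot-separated DNS labels followed by an alphabetic TLD.
# Each label is 1-63 chars, alphanumeric at both ends, dashes allowed inside.
# Assumption: any 2-63 letter TLD is accepted instead of an explicit list.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_HOST_RE = re.compile(rf"^(?!.{{256}})(?:{_LABEL}\.)+[A-Za-z]{{2,63}}\.?$")

# Assumption: the same four schemes the PHP function accepts.
ALLOWED_SCHEMES = {"http", "https", "ftp", "ftps"}


def check_url(url):
    """Return a dict of URL components, or None if the URL is rejected."""
    # Stage 1: split into components (stands in for the generic URI regex).
    try:
        parts = urlsplit(url)
        port = parts.port  # ValueError if the port is not a valid number
    except ValueError:
        return None
    if parts.scheme not in ALLOWED_SCHEMES:
        return None
    # Mirror the PHP rule: userinfo is rejected for HTTP(S).
    if parts.scheme in ("http", "https") and parts.username is not None:
        return None
    # Stage 2: the host must look like a DNS name.
    host = parts.hostname
    if not host or not _HOST_RE.match(host):
        return None
    return {
        "scheme": parts.scheme,
        "host": host,
        "port": port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }
```

Keeping the host check separate means a new scheme only needs an entry in `ALLOWED_SCHEMES` plus any scheme-specific rule, which is the same extension point the PHP `switch` provides.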

- Mikael Engver · Answer 3

I use this regex:

((https?:)?//)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,63}(:[\d]+)?(/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?

It supports both of these:

http://stackoverflow.com
https://stackoverflow.com

And:

//stackoverflow.com
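
One way to check these claims is to run the pattern through Python's `re.fullmatch`. The sketch below is only illustrative: the names `URL_RE` and `matches` are invented here, the pattern is split across lines purely for readability, and the hex-digit classes are written in the conventional form `[a-fA-F\d]`.

```python
import re

# The pattern from this answer, split into its logical parts.
URL_RE = re.compile(
    r"((https?:)?//)?"                                               # optional scheme or bare //
    r"(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?"  # optional user[:password]@
    r"([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,63}"                     # host labels and TLD
    r"(:[\d]+)?"                                                     # optional port
    r"(/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*"                           # path segments
    r"(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?"                    # optional query
    r"(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?"                           # optional fragment
)


def matches(candidate):
    """True if the whole string matches the pattern."""
    return URL_RE.fullmatch(candidate) is not None


print(matches("http://stackoverflow.com"))   # True
print(matches("//stackoverflow.com"))        # True
print(matches("http://t.co"))                # False
```

The last result follows from the host part of the pattern: every label before a dot needs at least two characters, so single-character labels such as the one in `t.co` are not accepted.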

- Shantonu · Answer 4

This one works very well for me:

(https?|ftp)://(www\d?|[a-zA-Z0-9]+)?\.[a-zA-Z0-9-]+(\:|\.)([a-zA-Z0-9.]+|(\d+)?)([/?:].*)?

- Ewan · Answer 5

For Python, this is the actual URL validation regex used in Django 1.5.1:

import re
regex = re.compile(
        r'^(?:http|ftp)s?://'  # http:// or https://
        r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|'  # domain...
        r'localhost|'  # localhost...
        r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|'  # ...or ipv4
        r'\[?[A-F0-9]*:[A-F0-9:]+\]?)'  # ...or ipv6
        r'(?::\d+)?'  # optional port
        r'(?:/?|[/?]\S+)$', re.IGNORECASE)

It handles both IPv4 and IPv6 addresses, as well as ports and GET parameters.

The code was found here, at line 44.
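
As a usage sketch, the compiled pattern can be wrapped in a small helper and tried on a few inputs. The helper name `is_valid_url` and the sample URLs are made up for this illustration; this is not Django's own validator API.

```python
import re

# Same pattern as above, repeated so the example runs on its own.
URL_RE = re.compile(
    r'^(?:http|ftp)s?://'  # scheme: http, https, ftp or ftps
    r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|'  # domain...
    r'localhost|'  # localhost...
    r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|'  # ...or ipv4
    r'\[?[A-F0-9]*:[A-F0-9:]+\]?)'  # ...or ipv6
    r'(?::\d+)?'  # optional port
    r'(?:/?|[/?]\S+)$', re.IGNORECASE)


def is_valid_url(value):
    """True if the string matches the pattern from its first character to its end."""
    return URL_RE.match(value) is not None


print(is_valid_url("http://example.com"))               # True
print(is_valid_url("https://localhost:8000/path?x=1"))  # True
print(is_valid_url("http://[::1]:8080/"))               # True
print(is_valid_url("example.com"))                      # False (no scheme)
print(is_valid_url("http://example.com/has space"))     # False (whitespace)
```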

- maxspan · Answer 6

There are several options for matching URLs, depending on your needs. Here are a few.

_(^|[\s.:;?\-\]<\(])(https?://[-\w;/?:@&=+$\|\_.!~*\|'()\[\]%#,☺]+[\w/#](\(\))?)(?=$|[\s',\|\(\).:;?\-\[\]>\)])_i

#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#iS

Here is a link that compares more than 10 different ways to validate a URL.

https://mathiasbynens.be/demo/url-regex

- Rahul Desai · Answer 7

I found the following regex for URLs, tested successfully against 500+ URLs:

/\b(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:\/[^\s]*)?\b/gi

I know it looks ugly, but the good thing is that it works. :) Explanation and demo with 581 random URLs on regex101. Source: In search of the perfect URL validation regex

- miphe · Answer 8

For convenience, here is a one-liner regex for URLs that also matches localhost, where you are more likely to have a port than a TLD such as .com.

(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}(\.[a-z]{2,6}|:[0-9]{3,4})\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)
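
To see how the port alternative behaves, the one-liner can be tried with Python's `re.fullmatch`. This is a minimal sketch; the names `ONE_LINER` and `matches` and the sample inputs are assumptions made for the example.

```python
import re

# The one-liner above, split in two only to keep the source line short.
ONE_LINER = re.compile(
    r"(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}"
    r"(\.[a-z]{2,6}|:[0-9]{3,4})\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)"
)


def matches(candidate):
    """True if the whole string matches the one-liner."""
    return ONE_LINER.fullmatch(candidate) is not None


print(matches("http://localhost:8080/api"))  # True (port instead of TLD)
print(matches("example.com"))                # True (TLD, no scheme)
print(matches("localhost"))                  # False (neither TLD nor port)
print(matches("localhost:80"))               # False (port must be 3-4 digits)
```

Note that the host must end in either a 2-to-6-letter lowercase TLD or a 3-to-4-digit port, so bare `localhost` and short ports such as `:80` are not matched.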

- Eli O. · Answer 9

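An updated, internationalized, and modernized solution for 2023 and beyond

Covers 99%+ of the URLs supported by modern browsers, including:

Emojis in domains and paths
Enforced TLD requirements: the first character must be a letter but may be followed by digits, with a minimum of 2 characters and a theoretical maximum of 63, while allowing underscores, accented characters, and international characters
Letters, characters, and diacritics from the world's most common languages in paths and domains, including English (of course), French, Spanish, Portuguese, German, Italian, Bengali, Devanagari, Marathi, about 99% of Chinese (CJK), Japanese, Korean, Taiwanese, Vietnamese, Urdu, Arabic, Greek, Cyrillic, and any other language that shares a character set with these
Support for IPv4 URLs, which are also used on the web and supported by all modern browsers; versions with and without IPv4 URL support are provided

Use this to include the protocol ("http/https") and IPv4 URLs: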

https?:\/\/(\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?):\d{1,5}\b|([-a-zA-Z0-9\u1F60-\uFFFF\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F@:%._\+~#=]{1,256})\.([a-zA-Z][a-zA-Z0-9\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03ff\u1f00-\u1fff\u0400-\u04ff()-]{1,62}))\b([\/#][-a-zA-Z0-9\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03ff\u1f00-\u1fff\u0400-\u04ff\u0900-\u097F\u0600-\u06FF\u0985-\u0994\u0995-\u09a7\u09a8-\u09ce\u0981\u0982\u0983\u09e6-\u09ef\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF\u4E00-\u9FFFẸɓɗẹỊỌịọṢỤṣụ()@:%_\+.~#?&//=\[\]!\$'*+,;]*)?

Use this to include the protocol but not IPv4 addresses:

https?:\/\/([-a-zA-Z0-9\u1F60-\uFFFF\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F@:%._\+~#=]{1,256})\.([a-zA-Z][a-zA-Z0-9\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03ff\u1f00-\u1fff\u0400-\u04ff()-]{1,62})\b([\/#][-a-zA-Z0-9\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03ff\u1f00-\u1fff\u0400-\u04ff\u0900-\u097F\u0600-\u06FF\u0985-\u0994\u0995-\u09a7\u09a8-\u09ce\u0981\u0982\u0983\u09e6-\u09ef\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF\u4E00-\u9FFFẸɓɗẹỊỌịọṢỤṣụ()@:%_\+.~#?&//=\[\]!\$'*+,;]*)?

To match the domain and path only (no IPv4):

([-a-zA-Z0-9\u1F60-\uFFFF\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F@:%._\+~#=]{1,256})\.([a-zA-Z][a-zA-Z0-9\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03ff\u1f00-\u1fff\u0400-\u04ff()-]{1,62})\b([\/#][-a-zA-Z0-9\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03ff\u1f00-\u1fff\u0400-\u04ff\u0900-\u097F\u0600-\u06FF\u0985-\u0994\u0995-\u09a7\u09a8-\u09ce\u0981\u0982\u0983\u09e6-\u09ef\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF\u4E00-\u9FFFẸɓɗẹỊỌịọṢỤṣụ()@:%_\+.~#?&//=\[\]!\$'*+,;]*)?

Battle-tested against all of these real-world and theoretical edge cases:

// Check with a simple copy/paste in console!

const regexURLsAndIPs =
  /^https?:\/\/(\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?):\d{1,5}\b|([-a-zA-Z0-9\u1F60-\uFFFF\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F@:%._\+~#=]{1,256})\.([a-zA-Z][a-zA-Z0-9\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03ff\u1f00-\u1fff\u0400-\u04ff()-]{1,62}))\b([\/#][-a-zA-Z0-9\u1F60-\uFFFF\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0370-\u03ff\u1f00-\u1fff\u0400-\u04ff\u0900-\u097F\u0600-\u06FF\u0985-\u0994\u0995-\u09a7\u09a8-\u09ce\u0981\u0982\u0983\u09e6-\u09ef\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF\u4E00-\u9FFFẸɓɗẹỊỌịọṢỤṣụ()@:%_\+.~#?&//=\[\]!\$'*+,;]*)?$/

const shouldMatch = [
  'https://base.com/',
  'http://t.co',
  'https://www.google.com.ua/',
  'https://subdomains.as.deep.as.you.want.example.com',
  'https://sub.second_leveldomain_underscore.verylongtoplevedomain/nice',
  'https://domain-name.com/path-common-characters/ABCxyz01789',
  'http://domaîn-with-àccents.ca',
  'http://path-with-accents.com/àèìòùçÇßØøÅåÆæœ',
  'https://en.wikipedia.org/wiki/Möbius_strip',
  'http://www..tld/emojis--in--domain/-and-path-/',
  'https://y.at/',
  'https://hashtag.forpath#lets-go',
  "http://special.com/all-special-characters-._~:/?#[]@!$&'()*+,;=",
  'https://greek_with_diacritics.co/ΑαΒβΣσ/ςΤτϋΰήώΊΪΌΆΈΎΫΉΏᾶἀ',
  'https://el.wikipedia.org/wiki/Ποσειδώνας_(πλανήτης)',
  'http://cyrillic-and-extras.ru/АаБбВвЪъыӸӹЫЯЯяѶѷ',
  'https://ru.wikipedia.org/wiki/Заглавная_страница',
  'https://most-arabic.co/گچپژیلفقهموء-يجريبتج/',
  'https://urdu.co/حروفِ/',
  'https://nigerian.ni/ƁƊƎẸɓɗǝẹỊƘỌịƙọṢỤṣụ',
  'https://bengali.sports.co/স্পর্শঅনুনাসিকলসওষ্ঠ্যপফবভম/',
  'https://devenagri.cc/कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषस',
  'https://h.org/wiki/Wikipedia:关于中文维基百科/en',
  'https://zh.wikipedia.org/wiki/Wikipedia:关于中文维基百科/en',
  'http://others.kr/korean-안녕ㆅㅇㄹㅿㆍㅡㅣㅗㅑㅠㅕ/japanese-一龠ぁゔａｚＡＺ０９々〆〤ヶ',
  'https://龠.subdomain.com',
  'http://127.0.0.1:22/valid-ip',
  'http://127.00.00.01:22/ugly-but-still-works-with-modern-browsers',
  'http://0.0.0.0:0/is-min',
  'https://255.255.255.255:0/is-max',
  'https://this.tld-is-63-characters-wich-is-the-theoretical-limit-000000000000',
]

const shouldNotMatch = [
  'noprotocol.com',
  ' https://space-in-front.com',
  'https://invalid.0om',
  'https://invalid.-om',
  'https://invalid-single-letter-tld.c',
  'https://invalid-domain&char.com',
  'https://invalid:com',
  'https://not valid.com',
  'https://not,valid.com',
  'https://龠.c龠',
  'https://invalidαΒβΣσ.com',
  'notvalid://www.google.com',
  'http://missing-tld',
  'https://0.0.0.0missing-port',
  '0.0.0.0:0/missing-protocol',
  'https://256.255.255.255:0/is-above-max',
  'https://this.tld-is-64-characters-which-is-too-looooooooooooooooooooooooooong',
]

function checkStringsMatchRegex(regex, array, shouldMatch = true) {
  for (let i = 0; i < array.length; i++) {
    if (regex.test(array[i]) !== shouldMatch) {
      const matchStr = shouldMatch ? 'match' : 'not match'
      console.error('regex.test(array[i])', regex.test(array[i]))
      throw new Error(`String "${array[i]}" should ${matchStr} regex "${regex}"`)
    }
  }
  const successMatchingStr = shouldMatch ? 'matching all strings' : 'not matching a single string'
  console.log(`Success with ${successMatchingStr} in the test array.`)
}

checkStringsMatchRegex(regexURLsAndIPs, shouldMatch, true)
checkStringsMatchRegex(regexURLsAndIPs, shouldNotMatch, false)

- Ashish · Answer 10

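I tried to formulate my own version of a URL regex. My requirement was to capture possible URL instances inside a string, such as cse.uom.ac.mu; note that it is not preceded by http or www.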

String regularExpression = "((((ht{2}ps?://)?)((w{3}\\.)?))?)[^.&&[a-zA-Z0-9]][a-zA-Z0-9.-]+[^.&&[a-zA-Z0-9]](\\.[a-zA-Z]{2,3})";

assertTrue("www.google.com".matches(regularExpression));
assertTrue("www.google.co.uk".matches(regularExpression));
assertTrue("http://www.google.com".matches(regularExpression));
assertTrue("http://www.google.co.uk".matches(regularExpression));
assertTrue("https://www.google.com".matches(regularExpression));
assertTrue("https://www.google.co.uk".matches(regularExpression));
assertTrue("google.com".matches(regularExpression));
assertTrue("google.co.uk".matches(regularExpression));
assertTrue("google.mu".matches(regularExpression));
assertTrue("mes.intnet.mu".matches(regularExpression));
assertTrue("cse.uom.ac.mu".matches(regularExpression));

//cannot contain 2 '.' after www
assertFalse("www..dr.google".matches(regularExpression));

//cannot contain 2 '.' just before com
assertFalse("www.dr.google..com".matches(regularExpression));

// to test case where url www must be followed with a '.'
assertFalse("www:google.com".matches(regularExpression));

// to test case where url www must be followed with a '.'
//assertFalse("http://wwwe.google.com".matches(regularExpression));

// to test case where www must be preceded with a '.'
assertFalse("https://www@.google.com".matches(regularExpression));