Regex to match alphanumeric characters, underscores, periods and dashes, allowing periods and dashes only in the middle

Question

5

Currently, I am using this:

if (preg_match('/^[a-zA-Z0-9_]+([a-zA-Z0-9_]*[.-]?[a-zA-Z0-9_]*)*[a-zA-Z0-9_]+$/', $product)) {
    return true;
} else {
    return false;
}

For example, I want to match the following:

pro.duct-name_
_pro.duct.name
p.r.o.d_u_c_t.n-a-m-e
product.-name
____pro.-_-.d___uct.nam._-e

But I do not want to match the following:

pro..ductname
.productname-
-productname.
-productname

- banskt

Edited the examples to make them easier to understand. Let me know if further explanation is needed; I am happy to clarify. - banskt

Why should pro..ductname not match? Aren't the dots in the middle? - Ja͢ck

Is it only the dot that may not appear twice, or does that hold for any character? - Cylian

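Because I do not want to match a dot or a dash twice in a row. Dots and dashes can appear many times in the middle, but not consecutively. And what if a dot and a dash appear right after each other? That is allowed: product.-name should match. - banskt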

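Q: Is it only the dot that may not appear twice, or any character? A: Dots and dashes may not appear twice in a row; any other alphanumeric character may, so ppppppppp should match. - banskt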

5 Answers

3

Try this:

(?im)^([a-z_][\w\.\-]+)(?![\.\-])\b

Update 1

(?im)^([a-z_](?:[\.\-]\w|\w)+(?![\.\-]))$

Update 2

(?im)^([a-z_](?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+)$

Explanation

<!--
(?im)^([a-z_](?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+)$

Match the remainder of the regex with the options: case insensitive (i); ^ and $ match at line breaks (m) «(?im)»
Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
Match the regular expression below and capture its match into backreference number 1 «([a-z_](?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+)»
   Match a single character present in the list below «[a-z_]»
      A character in the range between “a” and “z” «a-z»
      The character “_” «_»
   Match the regular expression below «(?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match either the regular expression below (attempting the next alternative only if this one fails) «\.\-\w»
         Match the character “.” literally «\.»
         Match the character “-” literally «\-»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
      Or match regular expression number 2 below (attempting the next alternative only if this one fails) «\-\.\w»
         Match the character “-” literally «\-»
         Match the character “.” literally «\.»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
      Or match regular expression number 3 below (attempting the next alternative only if this one fails) «\-\w»
         Match the character “-” literally «\-»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
      Or match regular expression number 4 below (attempting the next alternative only if this one fails) «\.\w»
         Match the character “.” literally «\.»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
      Or match regular expression number 5 below (the entire group fails if this one fails to match) «\w»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
-->

You can test it here.
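As a quick check, the last pattern above can be run against the sample names from the question. This is only a sketch: the constant and function names are made up for illustration, and the `/` delimiters are added because `preg_match` requires them.

```php
<?php
// Last pattern from this answer, wrapped in delimiters for preg_match.
const UPDATE_2_PATTERN = '/(?im)^([a-z_](?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+)$/';

// Hypothetical helper: true when $name matches the pattern.
function matches_update_2(string $name): bool
{
    return preg_match(UPDATE_2_PATTERN, $name) === 1;
}

var_dump(matches_update_2('product.-name'));  // bool(true)
var_dump(matches_update_2('pro..ductname'));  // bool(false)
var_dump(matches_update_2('123product'));     // bool(false): first character must be a letter or underscore
```

Note that the pattern needs at least two characters, so a one-letter name is rejected.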

- Cylian

1

\w is not equivalent to [a-zA-Z0-9_]. - Walter Tross

@WalterTross: Can you give an example where it does not work? The test passes for the OP's sample data. - Herbert

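product.-name needs to match (see the OP's comments), but it does not. According to the OP's first regex, 123product should match as well. The (?![.-]) part is unnecessary, because it is already implied by what precedes it. [.-] can be written more clearly as [-.]. - Walter Tross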

1

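@WalterTross: Thanks for pointing that out. Please see my Update 2. And indeed, the extra negative lookahead is not needed. If "123product" has to match, the pattern gets simpler: just replace the first character class with \w. The OP never commented on this. - Cylian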

@banskt: You're welcome. Did this solve your problem? Let me know if it needs an update. - Cylian

1

This should work:

/^[A-Za-z0-9_]([.-]?[A-Za-z0-9_]+)*[.-]?[A-Za-z0-9_]$/

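It ensures that the name starts and ends with an alphanumeric or underscore character. The parenthesized group in the middle ensures that at most one period or dash appears in a row, and that it is followed by at least one alphanumeric or underscore character.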
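A small sketch of how this behaves on the question's samples. It assumes `A-Za-z` in every character class (the range `A-z` would also admit characters such as `[` and `^`, and a class with only `A-Z` would reject lowercase letters without the `i` flag). The function name is hypothetical.

```php
<?php
// Hypothetical helper; the character classes are written as A-Za-z0-9_ (an assumption).
function matches_single_separator(string $name): bool
{
    $pattern = '/^[A-Za-z0-9_]([.-]?[A-Za-z0-9_]+)*[.-]?[A-Za-z0-9_]$/';
    return preg_match($pattern, $name) === 1;
}

var_dump(matches_single_separator('pro.duct-name_')); // bool(true)
var_dump(matches_single_separator('pro..ductname'));  // bool(false)
var_dump(matches_single_separator('product.-name'));  // bool(false): ".-" counts as two separators in a row
```

Because the pattern permits only one separator at a time, it rejects product.-name, which the asker wants to accept.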

- domvoyt

0

The following regex checks any string made up of letters, digits, dashes and so on, with exactly one dot in the middle.

/^[A-Za-z0-9_-]+(\.){1}[A-Za-z0-9_-]+$/i

Hope this helps.
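For comparison with the question's samples, here is a minimal sketch that runs the pattern as written. The function name is made up for illustration.

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function matches_exactly_one_dot(string $name): bool
{
    return preg_match('/^[A-Za-z0-9_-]+(\.){1}[A-Za-z0-9_-]+$/i', $name) === 1;
}

var_dump(matches_exactly_one_dot('pro.duct-name_')); // bool(true)
var_dump(matches_exactly_one_dot('_pro.duct.name')); // bool(false): two dots
var_dump(matches_exactly_one_dot('-pro.ductname-')); // bool(true): dashes at the ends are accepted
```

So the pattern requires exactly one dot and does not stop a dash from appearing at the start or end, which differs from what the question asks for.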

- Kasia Gogolek

0

/^[A-Z0-9_][A-Z0-9_.-]*[A-Z0-9_]$/i

This ensures that the first and last characters are not a dash or a period; the part in between can consist of any characters from your chosen set.
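A short sketch of what the pattern above accepts and rejects, using sample names from the question. The function name is hypothetical.

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function matches_edges_only(string $name): bool
{
    return preg_match('/^[A-Z0-9_][A-Z0-9_.-]*[A-Z0-9_]$/i', $name) === 1;
}

var_dump(matches_edges_only('product.-name'));  // bool(true)
var_dump(matches_edges_only('-productname'));   // bool(false)
var_dump(matches_edges_only('pro..ductname'));  // bool(true): repeated dots in the middle are not checked
```

Since only the first and last characters are constrained, pro..ductname matches here, although the asker lists it among the names to reject.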

- Ja͢ck

Content provided by Stack Overflow.

- Walter Tross · Accepted Answer

The answer would be

/^[a-zA-Z0-9_]+([-.][a-zA-Z0-9_]+)*$/

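That holds if you let strings containing .- and -. fail to match, which would be better. Why allow them to match anyway? But if you really do need those strings to match, a possible solution is: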

/^[a-zA-Z0-9_]+((\.(-\.)*-?|-(\.-)*\.?)[a-zA-Z0-9_]+)*$/

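Here the single '.' or '-' of the first regex is replaced by a run of alternating '.' and '-' characters: it starts with '.' or '-', is optionally followed by '-.' or '.-' pairs respectively, and is then optionally followed by one more '-' or '.' respectively, which allows an even number of alternating characters. This complexity may be overkill, but the current specification seems to require it. If at most 2 alternating '.' and '-' characters should be allowed, the regex becomes: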

/^[a-zA-Z0-9_]+((\.-?|-\.?)[a-zA-Z0-9_]+)*$/

Test it here or here.
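To compare the three patterns in this answer side by side, a sketch along these lines may help. The labels `strict`, `alternating` and `max_two`, as well as the constant and function names, are invented for this example and do not come from the answer.

```php
<?php
// The three patterns from this answer; the array keys are illustrative labels.
const NAME_PATTERNS = [
    'strict'      => '/^[a-zA-Z0-9_]+([-.][a-zA-Z0-9_]+)*$/',
    'alternating' => '/^[a-zA-Z0-9_]+((\.(-\.)*-?|-(\.-)*\.?)[a-zA-Z0-9_]+)*$/',
    'max_two'     => '/^[a-zA-Z0-9_]+((\.-?|-\.?)[a-zA-Z0-9_]+)*$/',
];

// Hypothetical helper: true when $name matches the pattern labelled $variant.
function matches_variant(string $variant, string $name): bool
{
    return preg_match(NAME_PATTERNS[$variant], $name) === 1;
}

var_dump(matches_variant('strict', 'product.-name'));      // bool(false)
var_dump(matches_variant('alternating', 'product.-name')); // bool(true)
var_dump(matches_variant('max_two', 'product.-name'));     // bool(true)
var_dump(matches_variant('alternating', 'pro.-.name'));    // bool(true)
var_dump(matches_variant('max_two', 'pro.-.name'));        // bool(false): three separators in a row
```

All three variants reject pro..ductname and names that start or end with a dot or dash; they differ only in how long a run of alternating separators they tolerate.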