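I have a simple commenting system where people can submit hyperlinks inside a plain-text field. When I display these records back from the database on the web page, what RegExp can I use in PHP to convert these links into HTML anchor links?
I don't want the algorithm to do this for any other kind of link, just http and https.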
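A minimal sketch of one way to approach this, assuming the comments are stored as plain text and should be HTML-escaped on output. The function name `linkify_http` and the `rel="nofollow"` attribute are illustrative choices, not something given in the question. It splits the text on http/https URLs, escapes every piece, and wraps only the URL pieces in an anchor, so `ftp://` and other schemes are left untouched.

```php
<?php
// Hypothetical helper: linkify only http:// and https:// URLs in plain text.
function linkify_http(string $text): string
{
    // The capturing group makes preg_split() return the URLs as well,
    // so the result alternates: text, URL, text, URL, ...
    $pattern = '~(\bhttps?://[^\s<>"\']+)~i';
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Odd indexes hold the captured URLs.
            // Keep trailing sentence punctuation outside the link.
            $url  = rtrim($part, '.,;:!?');
            $tail = (string) substr($part, strlen($url));
            $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
            $out .= '<a href="' . $safe . '" rel="nofollow">' . $safe . '</a>'
                  . htmlspecialchars($tail, ENT_QUOTES, 'UTF-8');
        } else {
            // Everything else is ordinary text and is only escaped.
            $out .= htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
        }
    }
    return $out;
}

echo linkify_http('See https://example.com/a?b=1&c=2, or ftp://example.com.');
```

Escaping each piece separately, rather than escaping the whole string first, keeps entities such as `&quot;` from being swallowed into a URL that was quoted in the original text.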
// $html holds the string
$htmlunlinkeds = array_reverse(preg_split('|<[Aa]\s+[^>]+>.*</[Aa]\s*>|', $html, -1, PREG_SPLIT_OFFSET_CAPTURE)); // start from end so we substitute correctly
foreach ($htmlunlinkeds as $htmlunlinked)
{ // and that we don't detect links inside HTML, e.g. <img src="http://...">
    $thishtmluntaggeds = array_reverse(preg_split('/<[^>]*>/', $htmlunlinked[0], -1, PREG_SPLIT_OFFSET_CAPTURE)); // again, start from end
    foreach ($thishtmluntaggeds as $thishtmluntagged)
    {
        $innerhtml = $thishtmluntagged[0];
        if(is_numeric(strpos($innerhtml, '://')))
        { // quick test first
            $newhtml = qa_html_convert_urls($innerhtml, qa_opt('links_in_new_window'));
            $html = substr_replace($html, $newhtml, $htmlunlinked[1]+$thishtmluntagged[1], strlen($innerhtml));
        }
    }
}
echo $html;
function qa_html_convert_urls($html, $newwindow = false)
/*
    Return $html with any URLs converted into links (with nofollow and in a new window if $newwindow).
    Closing parentheses/brackets are removed from the link if they don't have a matching opening one. This avoids creating
    incorrect URLs from (http://www.question2answer.org) but allow URLs such as http://www.wikipedia.org/Computers_(Software)
*/
{
    $uc = 'a-z\x{00a1}-\x{ffff}';
    $url_regex = '#\b((?:https?|ftp)://(?:[0-9'.$uc.'][0-9'.$uc.'-]*\.)+['.$uc.']{2,}(?::\d{2,5})?(?:/(?:[^\s<>]*[^\s<>\.])?)?)#iu';
    // get matches and their positions
    if (preg_match_all($url_regex, $html, $matches, PREG_OFFSET_CAPTURE)) {
        $brackets = array(
            ')' => '(',
            '}' => '{',
            ']' => '[',
        );
        // loop backwards so we substitute correctly
        for ($i = count($matches[1])-1; $i >= 0; $i--) {
            $match = $matches[1][$i];
            $text_url = $match[0];
            $removed = '';
            $lastch = substr($text_url, -1);
            // exclude bracket from link if no matching bracket
            while (array_key_exists($lastch, $brackets)) {
                $open_char = $brackets[$lastch];
                $num_open = substr_count($text_url, $open_char);
                $num_close = substr_count($text_url, $lastch);
                if ($num_close == $num_open + 1) {
                    $text_url = substr($text_url, 0, -1);
                    $removed = $lastch . $removed;
                    $lastch = substr($text_url, -1);
                }
                else
                    break;
            }
            $target = $newwindow ? ' target="_blank"' : '';
            $replace = '<a href="' . $text_url . '" rel="nofollow"' . $target . '>' . $text_url . '</a>' . $removed;
            $html = substr_replace($html, $replace, $match[1], strlen($match[0]));
        }
    }
    return $html;
}
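To see the bracket rule from the function's comment in isolation, here is a small sketch that pulls the check out into a standalone helper. The name `trim_unmatched_closers` is hypothetical and not part of the code above. It returns the trimmed URL together with the characters it stripped.

```php
<?php
// Hypothetical helper isolating the "unmatched closing bracket" check.
function trim_unmatched_closers(string $url): array
{
    $pairs = [')' => '(', '}' => '{', ']' => '['];
    $removed = '';

    // Keep looking at the last character while it is a closing bracket.
    while ($url !== '' && isset($pairs[substr($url, -1)])) {
        $close = substr($url, -1);
        // Strip it only when there is exactly one more closer than opener.
        if (substr_count($url, $close) !== substr_count($url, $pairs[$close]) + 1) {
            break;
        }
        $url = substr($url, 0, -1);
        $removed = $close . $removed;
    }
    return [$url, $removed];
}

// The trailing ")" has no "(" in the URL, so it is split off.
print_r(trim_unmatched_closers('http://www.question2answer.org)'));

// Here the parentheses balance, so the URL is returned unchanged.
print_r(trim_unmatched_closers('http://www.wikipedia.org/Computers_(Software)'));
```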
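I really like this answer, but I needed a solution for plain-text links that may sit inside very simple HTML text:
<p>I found a really cool site you might like:</p>
<p>www.stackoverflow.com</p>
That meant the patterns had to stop at the HTML characters < and >, so I changed part of the pattern to use [^\s\>\<] instead of \S:
\S - not whitespace; matches any character that is not whitespace (tab, space, newline)
[^] - negated set; matches any character that is not in the set
I needed another format besides HTML, so I separated the regexes from their replacements to accommodate this.
I also added a way to return just the links/emails found as an array, so that I could save them as a relation on my posts (great for building meta cards and for analytics later!).
I found that text like there...it was also being matched, so I wanted to make sure I did not get any matches that contained consecutive dots.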
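A quick sketch of the difference between the two character classes, using the sample paragraph from above. The `www\.` prefix is only there to keep the demo pattern short; it is not the pattern used in the function below.

```php
<?php
$html = '<p>www.stackoverflow.com</p>';

// \S+ runs on into the closing tag, because "<" and ">" are not whitespace.
preg_match('/www\.\S+/', $html, $greedy);

// [^\s<>]+ stops at the first "<", so the closing tag is left out.
preg_match('/www\.[^\s<>]+/', $html, $bounded);

echo $greedy[0], "\n";   // www.stackoverflow.com</p>
echo $bounded[0], "\n";  // www.stackoverflow.com
```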
/***
 * based on this answer: https://dev59.com/PnI-5IYBdhLWcg3wO1rl#49689245
 *
 * @param string $string        text to scan for links and emails
 * @param string $format        html (<a href=""...), short ([link:https://somewhere]), other (https://somewhere)
 * @param bool   $returnMatches return the links/emails found as an array instead of the formatted string
 */
public function formatLinksInString(
    $string,
    $format = 'html',
    $returnMatches = false
) {
    // Replacement templates, kept separate from the patterns so the output format can vary
    $formatProtocol = $format == 'html'
        ? '<a href="$0" target="_blank" title="$0">$0</a>'
        : (($format == 'short' || $returnMatches) ? '[link:$0]' : '$0');
    $formatSansProtocol = $format == 'html'
        ? '<a href="//$0" target="_blank" title="$0">$0</a>'
        : (($format == 'short' || $returnMatches) ? '[link://$0]' : '$0');
    $formatMailto = $format == 'html'
        ? '<a href="mailto:$1" target="_blank" title="$1">$1</a>'
        : (($format == 'short' || $returnMatches) ? '[mailto:$1]' : '$1');
    $regProtocol = '/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}(\/[^\<\>\s]*)?/';
    $regSansProtocol = '/(?<=\s|\A|\>)([0-9a-zA-Z\-\.]+\.[a-zA-Z0-9\/]{2,})(?=\s|$|\,|\<)/';
    $regEmail = '/([^\s\>\<]+\@[^\s\>\<]+\.[^\s\>\<]+)\b/';
    $consecutiveDotsRegex = $format == 'html'
        ? '/<a[^\>]+[\.]{2,}[^\>]*?>([^\<]*?)<\/a>/'
        : '/\[link:.*?\/\/([^\]]+[\.]{2,}[^\]]*?)\]/';
    // Protocol links
    $formatString = preg_replace($regProtocol, $formatProtocol, $string);
    // Sans Protocol Links
    $formatString = preg_replace($regSansProtocol, $formatSansProtocol, $formatString); // use formatString from above
    // Email - Mailto - Links
    $formatString = preg_replace($regEmail, $formatMailto, $formatString); // use formatString from above
    // Prevent consecutive periods from getting captured
    $formatString = preg_replace($consecutiveDotsRegex, '$1', $formatString);
    if ($returnMatches) {
        // Find all [x:link] patterns (these are produced only when $format is not 'html')
        preg_match_all('/\[.*?:(.*?)\]/', $formatString, $matches);
        return $matches[1]; // return the captured groups
    }
    return $formatString;
}
$string = 'example.com
www.example.com
http://example.com
https://example.com
http://www.example.com
https://www.example.com';
preg_match_all('#(\w*://|www\.)[a-z0-9]+(-+[a-z0-9]+)*(\.[a-z0-9]+(-+[a-z0-9]+)*)+(/([^\s()<>;]+\w)?/?)?#i', $string, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
// Replace from the last match to the first so earlier offsets stay valid
foreach (array_reverse($matches) as $match) {
    // Group 1 is either a scheme such as "http://" or the bare "www." prefix; add http:// only for the latter
    $a = '<a href="'.(strpos($match[1][0], '/') !== false ? '' : 'http://') . $match[0][0].'">' . $match[0][0] . '</a>';
    $string = substr_replace($string, $a, $match[0][1], strlen($match[0][0]));
}
echo $string;
Result:
example.com
<a href="http://www.example.com">www.example.com</a>
<a href="http://example.com">http://example.com</a>
<a href="https://example.com">https://example.com</a>
<a href="http://www.example.com">http://www.example.com</a>
<a href="https://www.example.com">https://www.example.com</a>
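www.example.com is converted to http://www.example.com because <a href="www.example.com"></a> does not work: without the http/https protocol, the link would point to yourdomain.com/www.example.com.
If I understand correctly, you want to turn plain text into an http link. Here is what I think may help you: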
<?php
$list = mysqli_query($con,"SELECT * FROM list WHERE name = 'table content'");
while($row2 = mysqli_fetch_array($list)) {
    echo "<a target='_blank' href='http://www." . $row2['content']. "'>" . $row2['content']. "</a>";
}
?>
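The loop above prints the stored value straight into the page and always prepends http://www. to it. As a hedged variation, the sketch below escapes the value and only adds a scheme when none is present. The helper name `render_link` is hypothetical, and `$content` stands in for `$row2['content']`; note that it prepends `http://` rather than `http://www.`, which is a deliberate change from the snippet above.

```php
<?php
// Hypothetical helper; $content stands in for $row2['content'].
function render_link(string $content): string
{
    // Keep an existing http/https scheme; otherwise assume http://.
    $href = preg_match('~^https?://~i', $content) ? $content : 'http://' . $content;

    // Escape both the attribute value and the visible text.
    $safeHref = htmlspecialchars($href, ENT_QUOTES, 'UTF-8');
    $safeText = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');

    return "<a target='_blank' href='" . $safeHref . "'>" . $safeText . "</a>";
}

echo render_link('example.com'), "\n";
echo render_link('https://example.com/?a=1&b=2'), "\n";
```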