Combining multiple regular expressions into one

7

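I am trying to write a piece of code that hyphenates a string into Latin verse. I have taken a few constraints into account, but I do not get the desired output. Here is my code: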

<?php

$string = "impulerittantaenanimis caelestibusirae";

$precedingC = precedingConsonant($string);
$xrule = xRule($precedingC);
$consonantc = consonantCT($xrule);
$consonantp = consonantPT($consonantc);
$cbv = CbetweenVowels($consonantp);
$tv = twoVowels($cbv);

echo $tv;

function twoVowels($string)
{
    return preg_replace('/([aeiou])([aeiou])/', '$1-$2', $string);
}
function CbetweenVowels($string)
{
    return preg_replace('/([aeiou])([^aeiou])([aeiou])/', '$1-$2$3', $string);
}
function consonantPT($string)
{
    return preg_replace('/([^aeiou]p)(t[aeiou])/', '$1-$2', $string);
}
function consonantCT($string)
{
    return preg_replace('/([^aeiou]c)(t[aeiou])/', '$1-$2', $string);
}
function precedingConsonant($string)
{
    $arr1 = str_split($string);
    $length = count($arr1);
    for($j=0;$j<$length;$j++)
    {
        if(isVowel($arr1[$j]) && !isVowel($arr1[$j+1]) && !isVowel($arr1[$j+2]) && isVowel($arr1[$j+3]))
        {
            $pc++;  
        }
    }

    function strAppend2($string)
    {
        $arr1 = str_split($string);
        $length = count($arr1);


        for($i=0;$i<$length;$i++)
        {
            $check = $arr1[$i+1].$arr1[$i+2];
            $check2 = $arr1[$i+1].$arr1[$i+2].$arr1[$i+3];
            if($check=='br' || $check=='cr' || $check=='dr' || $check=='fr' || $check=='gr' || $check=='pr' || $check=='tr' || $check=='bl' || $check=='cl' || $check=='fl' || $check=='gl' || $check=='pl' || $check=='ch' || $check=='ph' || $check=='th' || $check=='qu' || $check2=='phl' || $check2=='phr')
            {
                if(isVowel($arr1[$i]) && !isVowel($arr1[$i+1]) && !isVowel($arr1[$i+2]) && isVowel($arr1[$i+3]))
                {
                    $updatedString = substr_replace($string, "-", $i+1, 0);
                    return $updatedString;
                }
            }
            else
            {
                if(isVowel($arr1[$i]) && !isVowel($arr1[$i+1]) && !isVowel($arr1[$i+2]) && isVowel($arr1[$i+3]))
                {
                    $updatedString = substr_replace($string, "-", $i+2, 0);
                    return $updatedString;
                }
            }
        }
    }
    $st1 = $string;
    for($k=0;$k<$pc;$k++)
    {
        $st1 = strAppend2($st1);
    }

    return $st1;
}
function xRule($string)
{
    return preg_replace('/([aeiou]x)([aeiou])/', '$1-$2', $string);
}
function isVowel($ch)
{
    if($ch=='a' || $ch=='e' || $ch=='i' || $ch=='o' || $ch=='u')
    {
        return true;
    }
    else
    {
        return false;
    }
}
function isConsonant($ch)
{
    if($ch=='a' || $ch=='e' || $ch=='i' || $ch=='o' || $ch=='u')
    {
        return false;
    }
    else
    {
        return true;
    }
}

?>

I believe that combining all of these functions would give the desired output. The constraints I am working with are listed below:

Rule 1 : When two or more consonants are between vowels, the first consonant is joined to the preceding vowel; for example - rec-tor, trac-tor, ac-tor, delec-tus, dic-tator, defec-tus, vic-tima, Oc-tober, fac-tum, pac-tus.

Rule 2 : 'x' is joined to the preceding vowel; as, rex-i. 

However, we make a special exception for the following consonant clusters - br, cr, dr, fr, gr, pr, tr; bl, cl, fl, gl, pl, phl, phr, ch, ph, th, qu. These clusters are handled by joining them to the following vowel, for example - con-sola-trix.

Rule 3 : When 'ct' follows a consonant, that consonant and 'c' are both joined to the first vowel for example - sanc-tus and junc-tum

Similarly for 'pt' we apply the same rule for example - scalp-tum, serp-tum, Redemp-tor. 

Rule 4 : A single consonant between two vowels is joined to the following vowel for example - ma-ter, pa-ter AND Z is joined to the following vowel. 

Rule 5 : When two vowels come together they are divided, if they be not a diphthong; as au-re-us. Diphthongs are - "ae", "oe", "au".

1
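Are you sure Cicero would agree? - Casimir et Hippolyte
Before thinking about combining several regular expressions or functions, you must first make sure that each function does what it is supposed to do. For example, the function twoVowels does not take diphthongs into account. - Casimir et Hippolyte
What is strAppend2 supposed to do? You should comment your code and add a short description to each function (look up phpDoc). - Casimir et Hippolyte
1 Answer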

3

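If you look carefully at each rule, you can see that all of them involve a vowel at the beginning or a preceding vowel. Once you realize this, you can try to build a single pattern with [aeiou] factored out at the beginning: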

$pattern = '~
    (?<=[aeiou]) # each rule involves a vowel at the beginning (also called a
                 # "preceding vowel")
    (?:
        # Rule 2: capture particular cases
        ( (?:[bcdfgpt]r | [bcfgp] l | ph [lr] | [cpt] h | qu ) [aeiou] x )
      |
        [bcdfghlmnp-tx]
        (?:
            # Rule 3: When "ct" follows a consonant, that consonant and "c" are both
            # joined to the first vowel
            [cp] \K (?=t)
          |
            # Rule 1: When two or more consonants are between vowels, the first
            # consonant is joined to the preceding vowel
            \K (?= [bcdfghlmnp-tx]+ [aeiou] )
        )   
      |
        # Rule 4: a single consonant between two vowels is joined to the following
        # vowel
        (?:
            \K (?= [bcdfghlmnp-t] [aeiou] )
          | 
            # Rule 2: "x" is joined to the preceding vowel
            x \K (?= [a-z] | (*SKIP)(*F) ) 
        )
      |
        # Rule 5: When two vowels come together they are divided, if they not be a
        # diphthong ("ae", "oe", "au")
        \K (?= [aeiou] (?<! a[eu] | oe ) )
    )
~xi';

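The pattern is designed to match only the position where the hyphen has to be placed (except for the particular cases of Rule 2). That is why it uses \K so often, to make the reported match start at that position, and lookaheads, to test what follows without consuming characters.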
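To make the \K-plus-lookahead idea easier to see, here is a minimal sketch that isolates Rule 1 only. It is not the full pattern above; the constant `RULE1_PATTERN` and the helper `hyphenateRule1` are hypothetical names introduced just for this illustration.

```php
<?php
// Rule 1 in isolation: vowel, one consonant, then one or more consonants and a vowel.
// (?<=[aeiou])   lookbehind: a vowel must precede, but it is not consumed
// [b...x]        the first consonant IS consumed by the match...
// \K             ...and then dropped from the reported match
// (?=...)        lookahead: checks the rest of the cluster without consuming it
const RULE1_PATTERN = '~(?<=[aeiou])[bcdfghlmnp-tx]\K(?=[bcdfghlmnp-tx]+[aeiou])~i';

// Hypothetical helper name, used only for this illustration.
function hyphenateRule1(string $text): string
{
    // The reported match is empty, so "-" is inserted and nothing is removed.
    return preg_replace(RULE1_PATTERN, '-', $text) ?? $text;
}

echo hyphenateRule1('rector actor'), PHP_EOL; // rec-tor ac-tor
echo hyphenateRule1('dictator'), PHP_EOL;     // dic-tator
```

Without the \K, the consumed consonant would be part of the match and would be replaced by the hyphen, so "rector" would become "re-tor".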

$string = <<<EOD
Aeneadum genetrix, hominum diuomque uoluptas,
alma Uenus, caeli subter labentia signa
quae mare nauigerum, quae terras frugiferentis
concelebras, per te quoniam genus omne animantum
EOD;

$result = preg_replace($pattern, '-$1', $string);

Ae-ne-a-dum ge-ne-trix, ho-mi-num di-u-om-qu-e u-o-lup-tas,
al-ma U-e-nus, cae-li sub-ter la-ben-ti-a sig-na
qu-ae ma-re nau-i-ge-rum, qu-ae ter-ras fru-gi-fe-ren-tis
con-ce-leb-ras, per te qu-o-ni-am ge-nus om-ne a-ni-man-tum

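Note that I didn't include several letters such as k, y and z, which are rare in classical Latin and occur mostly in loanwords. Feel free to include them if you need to handle transliterated Greek words or other words.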
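As a rough sketch of what such an extension could look like, the snippet below adds k and z to the consonant class of the Rule 4 branch only. The constants and the helper `hyphenateSingleConsonant` are hypothetical names, and the choice of letters is an assumption: y in Greek loanwords usually stands for a vowel, so it would more plausibly belong in the vowel class and is left out here.

```php
<?php
// Assumption: k and z are simply added to the consonant class used by the
// Rule 4 branch of the pattern above; nothing else is changed.
const LATIN_CONSONANTS    = 'bcdfghlmnp-t';
const EXTENDED_CONSONANTS = 'bcdfghklmnp-tz';

// Hypothetical helper: applies only Rule 4 (a single consonant between two
// vowels is joined to the following vowel).
function hyphenateSingleConsonant(string $text, string $consonants): string
{
    $pattern = '~(?<=[aeiou])\K(?=[' . $consonants . '][aeiou])~i';
    return preg_replace($pattern, '-', $text) ?? $text;
}

echo hyphenateSingleConsonant('gaza', LATIN_CONSONANTS), PHP_EOL;    // gaza
echo hyphenateSingleConsonant('gaza', EXTENDED_CONSONANTS), PHP_EOL; // ga-za
```

In the full pattern the same class appears in several branches, so each occurrence would need the same extension to stay consistent.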

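The passage is the opening of a poem addressed to Venus, the Roman goddess of love and beauty. A plain translation:

Mother of the descendants of Aeneas, delight of men and gods,
nurturing Venus, who beneath the gliding signs of heaven
fill the ship-bearing sea and the fruit-bearing lands
with your presence, since through you every kind of living thing ...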

