Title-casing a string that contains one or more last names, while handling names with apostrophes

47

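I want to normalize a user-supplied string. For a name, I'd like the first letter capitalized, and if the user entered two last names, both the first and the second name capitalized. For example, if someone enters: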

marriedname maidenname

it should be converted to Marriedname Maidenname, and so on if there are more than two names.

The other case is when someone has an apostrophe in their name. If someone enters:

o'connell

this needs to be converted to O'Connell.

I was using:

ucfirst(strtolower($last_name));

However, as you can tell, this doesn't work in all cases.
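To make the failure concrete, here is a small sketch that runs the expression above on the two sample inputs from the question. The `$inputs` and `$results` arrays are names introduced only for this illustration.

```php
<?php
// Apply the question's expression to its two sample inputs.
$inputs = ["marriedname maidenname", "o'connell"];
$results = [];
foreach ($inputs as $last_name) {
    // ucfirst() only changes the very first character of the whole string.
    $results[] = ucfirst(strtolower($last_name));
}
print_r($results);
// [0] => Marriedname maidenname   (the second name stays lowercase)
// [1] => O'connell                (the letter after the apostrophe stays lowercase)
```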


1
@deceze Haha, good point. I guess I'll just have to hope they type it correctly :) - user1048676
Even if they do, you're converting it to lowercase. - Kai Qing
9
How about letting users enter their name the way they want it stored? - Jeff Roe
A simple $name == strtolower($name) check could be used to enable/disable the case-changing algorithm. - Abhi Beckert
Related - a more general question: https://dev59.com/U4_ea4cB1Zd3GeqPJhaa - mickmackusa
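The `$name == strtolower($name)` check suggested in the comments could look roughly like the sketch below. The function name `normalize_if_all_lowercase` is hypothetical, and the delimiter list passed to `ucwords()` is an assumption (the second argument needs PHP 5.4.32 / 5.5.16 or later). The idea is to touch the name only when the user clearly did not bother with casing.

```php
<?php
// Hypothetical helper: only re-case names that were typed entirely in lowercase.
function normalize_if_all_lowercase(string $name): string
{
    // If the user supplied any uppercase letter, trust their casing as-is.
    if ($name !== strtolower($name)) {
        return $name;
    }
    // Otherwise capitalize after spaces, hyphens and apostrophes.
    return ucwords($name, " -'");
}

echo normalize_if_all_lowercase("o'connell"), PHP_EOL;    // O'Connell
echo normalize_if_all_lowercase("van der Berg"), PHP_EOL; // van der Berg (left untouched)
```

This keeps deliberate spellings such as McDonald intact, at the cost of not fixing input like sMith-joNes.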
12 Answers

45

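This capitalizes the first letter of every word and the letter immediately after an apostrophe, and lowercases all other letters. It should work for you: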

str_replace('\' ', '\'', ucwords(str_replace('\'', '\' ', strtolower($last_name))));
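Unrolled into separate steps, the one-liner above works roughly as follows. The intermediate variable names and the sample input are introduced here only for illustration.

```php
<?php
$last_name = "o'CONNELL maidenname";

// 1. Lowercase everything.
$lower = strtolower($last_name);            // "o'connell maidenname"
// 2. Insert a space after each apostrophe so ucwords() sees a new word.
$spaced = str_replace("'", "' ", $lower);   // "o' connell maidenname"
// 3. Capitalize the first letter of every space-separated word.
$capped = ucwords($spaced);                 // "O' Connell Maidenname"
// 4. Remove the temporary space again.
$result = str_replace("' ", "'", $capped);  // "O'Connell Maidenname"

echo $result, PHP_EOL;
```

One caveat of this trick: if the input already contains an apostrophe followed by a space, step 4 removes that space as well.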

1
I don't think this supports UTF-8, so I'd suggest trying Antonio Max's answer. - Liam
Did ucwords() support multiple delimiters back in 2012? In case you also need dashes: "Smith-Jones" - Misunderstood

29

You can try this for words:

<?php echo ucwords(strtolower('Dhaka, JAMALPUR, sarishabari')) ?>

Result: Dhaka, Jamalpur, Sarishabari


The function you linked to isn't used in this answer. - mickmackusa
This doesn't answer the original question. You forgot the apostrophe. Use delimiters. - Misunderstood

24

None of these are UTF-8 friendly, so here's one that works flawlessly (so far):

function titleCase($string, $delimiters = array(" ", "-", ".", "'", "O'", "Mc"), $exceptions = array("and", "to", "of", "das", "dos", "I", "II", "III", "IV", "V", "VI"))
{
    /*
     * Exceptions in lower case are words you don't want converted
     * Exceptions all in upper case are any words you don't want converted to title case
     *   but should be converted to upper case, e.g.:
     *   king henry viii or king henry Viii should be King Henry VIII
     */
    $string = mb_convert_case($string, MB_CASE_TITLE, "UTF-8");
    foreach ($delimiters as $dlnr => $delimiter) {
        $words = explode($delimiter, $string);
        $newwords = array();
        foreach ($words as $wordnr => $word) {
            if (in_array(mb_strtoupper($word, "UTF-8"), $exceptions)) {
                // check exceptions list for any words that should be in upper case
                $word = mb_strtoupper($word, "UTF-8");
            } elseif (in_array(mb_strtolower($word, "UTF-8"), $exceptions)) {
                // check exceptions list for any words that should be in lower case
                $word = mb_strtolower($word, "UTF-8");
            } elseif (!in_array($word, $exceptions)) {
                // convert to uppercase (non-utf8 only)
                $word = ucfirst($word);
            }
            array_push($newwords, $word);
        }
        $string = join($delimiter, $newwords);
   }//foreach
   return $string;
}

Usage:

$s = 'SÃO JOÃO DOS SANTOS';
$v = titleCase($s); // 'São João dos Santos' 

4

Use this built-in function:

ucwords('string');

1
If the word is stRinG, it becomes StRinG. It doesn't lowercase the rest. - emotality
2
So ucwords(strtolower('stRing')) - Mateus Viccari
1
This answer doesn't handle edge cases like o'connor. - mickmackusa

2

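I don't believe there is one good answer that covers all scenarios. The PHP.net forum for ucwords has a fair amount of discussion, but none of it seems to have an answer for everything. I'd recommend that you either use all uppercase or leave the user's input alone.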


1

This works for most English names.
It does not handle Roman numeral suffixes.
It also doesn't work for sÃO JoÃO dos SaNTOS III

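You have names with spaces: big john
You have names with apostrophes: O'dell
You have names with hyphens: Smith-jones
You have names with the wrong capitalization: sMith-joNes
You have names in all caps: JOHN SMITH
You have all sorts of combinations.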

For example: big JohN o'dell-sMIth

One simple line of code handles them all.

$name = ucWords(strtolower($name)," -'");

Big John O'Dell-Smith
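The answer notes that the one-liner does not cope with input such as sÃO JoÃO dos SaNTOS, because strtolower() and ucwords() are not multibyte-aware. A minimal sketch of a UTF-8 variant, assuming the mbstring extension is loaded and the source file is saved as UTF-8, could look like this. The function name `mb_name_case` is hypothetical, and the sketch still does nothing special for Roman numerals or particles such as "dos".

```php
<?php
// Hypothetical multibyte-aware counterpart to ucwords(strtolower($name), " -'").
function mb_name_case(string $name): string
{
    // Lowercase the whole string using multibyte-aware rules.
    $lower = mb_strtolower($name, 'UTF-8');

    // Uppercase any letter at the start of the string or directly after
    // whitespace, an apostrophe or a hyphen.
    return preg_replace_callback(
        '/(^|[\s\'-])(\p{L})/u',
        function (array $m): string {
            return $m[1] . mb_strtoupper($m[2], 'UTF-8');
        },
        $lower
    );
}

echo mb_name_case("big JohN o'dell-sMIth"), PHP_EOL; // Big John O'Dell-Smith
echo mb_name_case("sÃO JoÃO dos SaNTOS"), PHP_EOL;   // São João Dos Santos
```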

.


This should be the top-rated answer. Brilliant! - PilotSnipes

1

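This is my overly complex, but fairly comprehensive, solution for capitalising Latin-script names in PHP. It will solve all your capitalisation problems. All of them.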

/**
 * Over-engineered solution to most capitalisation issues.
 * 
 * @author https://stackoverflow.com/users/429071/dearsina
 * @version 1.0
 */ 
class str {
    /**
     * Words or abbreviations that should always be all uppercase
     */
    const ALL_UPPERCASE = [
        "UK",
        "VAT",
    ];

    /**
     * Words or abbreviations that should always be all lowercase
     */
    const ALL_LOWERCASE = [
        "and",
        "as",
        "by",
        "in",
        "of",
        "or",
        "to",
    ];

    /**
     * Honorifics that only contain consonants.
     *
     */
    const CONSONANT_ONLY_HONORIFICS = [
        # English
        "Mr",
        "Mrs",
        "Ms",
        "Dr",
        "Br",
        "Sr",
        "Fr",
        "Pr",
        "St",

        # Afrikaans
        "Mnr",
    ];

    /**
     * Surname prefixes that should be lowercase,
     * unless not following another word (firstname).
     */
    const SURNAME_PREFIXES = [
        "de la",
        "de las",
        "van de",
        "van der",
        "vit de",
        "von",
        "van",
        "del",
        "der",
    ];

    /**
     * Capitalises every (appropriate) word in a given string.
     *
     * @param string|null $string
     *
     * @return string|null
     */
    public static function capitalise(?string $string): ?string
    {
        if(!$string){
            return $string;
        }

        # Strip away multi-spaces
        $string = preg_replace("/\s{2,}/", " ", $string);

        # Ensure there is always a space after a comma
        $string = preg_replace("/,([^\s])/", ", $1", $string);

        # A word is anything separated by spaces or a dash
        $string = preg_replace_callback("/([^\s\-\.]+)/", function($matches){
            # Make the word lowercase
            $word = mb_strtolower($matches[1]);

            # If the word needs to be all lowercase
            if(in_array($word, self::ALL_LOWERCASE)){
                return strtolower($word);
            }

            # If the word needs to be all uppercase
            if(in_array(mb_strtoupper($word), self::ALL_UPPERCASE)){
                return strtoupper($word);
            }

            # Create a version without diacritics
            $transliterator = \Transliterator::createFromRules(':: Any-Latin; :: Latin-ASCII; :: NFD; :: [:Nonspacing Mark:] Remove; :: Lower(); :: NFC;', \Transliterator::FORWARD);
            $ascii_word = $transliterator->transliterate($word);


            # If the word contains non-alpha characters (numbers, &, etc), with exceptions (comma, '), assume it's an abbreviation
            if(preg_match("/[^a-z,']/i", $ascii_word)){
                return strtoupper($word);
            }

            # If the word doesn't contain any vowels, assume it's an abbreviation
            if(!preg_match("/[aeiouy]/i", $ascii_word)){
                # Unless the word is an honorific
                if(!in_array(ucfirst($word), self::CONSONANT_ONLY_HONORIFICS)){
                    return strtoupper($word);
                }
            }

            # If the word contains two of the same vowel and is 3 characters or fewer, assume it's an abbreviation
            if(strlen($word) <= 3 && preg_match("/([aeiouy])\1/", $word)){
                return strtoupper($word);
            }

            # Ensure O'Connor, L'Oreal, etc, are double capitalised, with exceptions (d')
            if(preg_match("/\b([a-z]')(\w+)\b/i", $word, $match)){
                # Some prefixes (like d') are not capitalised
                if(in_array($match[1], ["d'"])){
                    return $match[1] . ucfirst($match[2]);
                }

                # Otherwise, everything is capitalised
                return strtoupper($match[1]) . ucfirst($match[2]);
            }

            # Otherwise, return the word with the first letter (only) capitalised
            return ucfirst($word);
            //The most common outcome
        }, $string);

        # Cater for the Mc prefix
        $pattern = "/(Mc)([b-df-hj-np-tv-z])/";
        //Mc followed by a consonant
        $string = preg_replace_callback($pattern, function($matches){
            return "Mc" . ucfirst($matches[2]);
        }, $string);

        # Cater for Roman numerals (need to be in all caps)
        $pattern = "/\b((?<![MDCLXVI])(?=[MDCLXVI])M{0,3}(?:C[MD]|D?C{0,3})(?:X[CL]|L?X{0,3})(?:I[XV]|V?I{0,3}))\b/i";
        $string = preg_replace_callback($pattern, function($matches){
            return strtoupper($matches[1]);
        }, $string);

        # Cater for surname prefixes (must be after the Roman numerals)
        $pattern = "/\b (".implode("|", self::SURNAME_PREFIXES).") \b/i";
        //A surname prefix, bookended by words
        $string = preg_replace_callback($pattern, function($matches){
            return strtolower(" {$matches[1]} ");
        }, $string);

        # Cater for ordinal numbers
        $pattern = "/\b(\d+(?:st|nd|rd|th))\b/i";
        //A number suffixed with an ordinal
        $string = preg_replace_callback($pattern, function($matches){
            return strtolower($matches[1]);
        }, $string);

        # And we're done done
        return $string;
    }
}

Have a play with it.


1

You can use preg_replace with the e flag (which executes a PHP function):

function processReplacement($one, $two)
{
  return $one . strtoupper($two);
}

$name = "bob o'conner";
$name = preg_replace("/(^|[^a-zA-Z])([a-z])/e","processReplacement('$1', '$2')", $name);

var_dump($name); // output "Bob O'Conner"

The regex pattern could probably be improved, but what I've done is:

  • $1 is either the beginning of the line or any non-alphabetic character.
  • $2 is any lowercase alphabetic character.

We then replace both of them with the result of the simple processReplacement() function.

If you're on PHP 5.3, it's probably worth making processReplacement() an anonymous function.


The e modifier no longer works. Please [edit] this answer. - mickmackusa
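The /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0. A sketch of the same idea using preg_replace_callback(), which keeps the answer's pattern and sample input and swaps only the replacement mechanism, might look like this:

```php
<?php
$name = "bob o'conner";

// Same pattern as in the answer, without the /e modifier.
// $m[1] is the start of the string or a non-letter; $m[2] is a lowercase letter.
$name = preg_replace_callback(
    '/(^|[^a-zA-Z])([a-z])/',
    function (array $m): string {
        return $m[1] . strtoupper($m[2]);
    },
    $name
);

var_dump($name); // string(12) "Bob O'Conner"
```

Like the original, this only uppercases letters and never lowercases anything, so mixed-case input would need a strtolower() pass first.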

1
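Here is a simpler, more direct answer to the main question. The function below mimics PHP's own approach. It checks whether the function already exists first, in case PHP ever adds it natively. I use it in my WordPress installs, and it works for any language.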
$str = mb_ucfirst($str, 'UTF-8', true);

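This uppercases the first letter and lowercases the rest, as the question asks. If the third parameter is false (the default), the rest of the string is left untouched.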
// Extends PHP
if (!function_exists('mb_ucfirst')) {

function mb_ucfirst($str, $encoding = "UTF-8", $lower_str_end = false) {
    $first_letter = mb_strtoupper(mb_substr($str, 0, 1, $encoding), $encoding);
    $str_end = "";
    if ($lower_str_end) {
        $str_end = mb_strtolower(mb_substr($str, 1, mb_strlen($str, $encoding), $encoding), $encoding);
    } else {
        $str_end = mb_substr($str, 1, mb_strlen($str, $encoding), $encoding);
    }
    $str = $first_letter . $str_end;
    return $str;
}

}

1
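I may be missing something, but this only seems to capitalize the first letter of the string? The original question is about capitalizing the first letter of every word in the string. - DB5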

0
I use this:
    <?php
// Let's create a function, so we can reuse the logic
    function sentence_case($str){
        // Let's split our string into an array of words
        $words = explode(' ', $str);
        foreach($words as &$word){
            // Let's check if the word is uppercase; if so, ignore it
            if($word == mb_convert_case($word, MB_CASE_UPPER, "UTF-8")){
                continue;
            }
            // Otherwise, let's make the first character uppercase
           $word = mb_convert_case($word, MB_CASE_TITLE , "UTF-8");
        }
        // Join the individual words back into a string
        return implode(' ', $words);
    }
    // Double quotes are needed here because the sample string contains an apostrophe
    // echo sentence_case("tuyển nhân o'canel XV-YZ xp-hg iphone-plus viên bán hàng trên sàn MTĐT");
// "Tuyển Nhân O'Canel XV-YZ Xp-Hg Iphone-Plus Viên Bán Hàng Trên Sàn MTĐT"
