查找字符串数组的公共前缀

Question

查找字符串数组的公共前缀

17

我有一个如下的数组：

$sports = array(
'Softball - Counties',
'Softball - Eastern',
'Softball - North Harbour',
'Softball - South',
'Softball - Western'
);

我希望能找到字符串的最长公共前缀。在这个例子中，答案应该是'Softball - '

我打算按照以下步骤进行：

$i = 1;

// loop to the length of the first string
while ($i < strlen($sports[0]) {

  // grab the left most part up to i in length
  $match = substr($sports[0], 0, $i);

  // loop through all the values in array, and compare if they match
  foreach ($sports as $sport) {

     if ($match != substr($sport, 0, $i) {
         // didn't match, return the part that did match
         return substr($sport, 0, $i-1);
     }

  } // foreach

   // increase string length
   $i++;
} // while

// if you got to here, then all of them must be identical

问题

有没有内置函数或更简单的方法来完成这个操作？
对于我的五行数组，这可能是可以接受的，但如果我要处理几千行的数组，那么就会有很多开销，所以我必须用我的起始值$i进行更精确的计算，例如$i=字符串的中间位置，如果失败，则使用$i/2，直到成功，然后将$i增加1，直到我们成功为止。这样我们才能用最少的比较次数得到结果。

是否已经存在针对这种问题的公式/算法？

- bumperbox

3

你是在寻找最长公共前缀还是子串呢？比如，如果你有a_abli和a_cable这两个字符串，答案应该是"a_"还是"abl"? - redtuna

如果你有这样的数组，为什么不只需将每个项目拆分并抓取第一个元素？ - SilentGhost

@bumperbox：如果有拼写错误，你打算怎么办？ - SilentGhost

@silentghost，我会容忍拼写错误，如果有问题，他们总是可以回来修正它们。 - bumperbox

@mickmackusa 这些都是真实的数值。我建议用户在想要拆分的位置放一个连字符，但并不是每个人都能做到这一点，因为那是一个自由文本字段。在用户体验方面可能有更好的解决方法，但那时候这些更改超出了项目的范围。 - bumperbox

显示剩余6条评论

18个回答

15

我会使用这个:

$prefix = array_shift($array);  // take the first item as initial prefix
$length = strlen($prefix);
// compare the current prefix with the prefix of the same length of the other items
foreach ($array as $item) {
    // check if there is a match; if not, decrease the prefix by one character at a time
    while ($length && substr($item, 0, $length) !== $prefix) {
        $length--;
        $prefix = substr($prefix, 0, -1);
    }
    if (!$length) {
        break;
    }
}

更新：以下是另一种解决方案，通过逐个比较字符串的每个第n个字符直到找到不匹配的位置：

$pl = 0; // common prefix length
$n = count($array);
$l = strlen($array[0]);
while ($pl < $l) {
    $c = $array[0][$pl];
    for ($i=1; $i<$n; $i++) {
        if ($array[$i][$pl] !== $c) break 2;
    }
    $pl++;
}
$prefix = substr($array[0], 0, $pl);

з”ұдәҺжңҖеӨҡеҸӘйңҖиҰҒиҝӣиЎҢnumberOfStringsвҖҚВ·вҖҚcommonPrefixLengthж¬ЎеҺҹеӯҗжҜ”иҫғпјҢеӣ жӯӨиҝҷж ·еҒҡжӣҙеҠ жңүж•ҲзҺҮгҖӮ

- Gumbo

我怀疑在OP的测试用例中，第二种解决方案效率较低，因为它一次只能构建一个字符的前缀，并在此期间测试所有字符串。因此，它将始终执行numberOfStrings x commonPrefixLength比较。而第一种解决方案可以快速消除大部分前缀，只需比较前两个字符串即可。 - ToolmakerSteve

更好的方法（适用于某些数据）是使用两个循环。第一个循环从零开始，但仅检查前两个字符串，每次检查一个字符。这给出了可能大小的“上限”。然后对剩余字符串执行第一种解决方案，减少前缀。对于所示的测试数据，成本为“N + L”，而不是“N * L”。 - ToolmakerSteve

11

我将@diogoriba算法实现到代码中，得到以下结果：

首先找到前两个字符串的公共前缀，然后将其与从第三个字符串开始的所有后续字符串进行比较，如果没有发现任何相同之处，则删除公共字符串，这种情况适用于前缀中有更多相同的情况。
但是，在字符串的前缀中有更少的相同字符而不是不同字符的情况下，bumperbox的原始算法（除了错误修复）胜出。 详见代码注释！

我还实现了另一个想法：

首先检查数组中最短的字符串，并将其用于比较，而不仅仅是第一个字符串。 在代码中，这是使用自定义编写的函数arrayStrLenMin()实现的。

可以大大减少迭代次数，但函数arrayStrLenMin()本身可能会导致（多或少）迭代。
仅从数组中第一个字符串的长度开始似乎相当笨拙，但如果arrayStrLenMin()需要许多迭代，则可能会变得有效。

以尽可能少的迭代次数获取数组中字符串的最大公共前缀（PHP）

代码+广泛测试+备注：

function arrayStrLenMin ($arr, $strictMode = false, $forLoop = false) {
    $errArrZeroLength = -1; // Return value for error: Array is empty
    $errOtherType = -2;     // Return value for error: Found other type (than string in array)
    $errStrNone = -3;       // Return value for error: No strings found (in array)

    $arrLength = count($arr);
    if ($arrLength <= 0 ) { return $errArrZeroLength; }
    $cur = 0;

    foreach ($arr as $key => $val) {
        if (is_string($val)) {
            $min = strlen($val);
            $strFirstFound = $key;
            // echo("Key\tLength / Notification / Error\n");
            // echo("$key\tFound first string member at key with length: $min!\n");
            break;
        }
        else if ($strictMode) { return $errOtherType; } // At least 1 type other than string was found.
    }
    if (! isset($min)) { return $errStrNone; } // No string was found in array.

    // SpeedRatio of foreach/for is approximately 2/1 as dicussed at:
    // http://juliusbeckmann.de/blog/php-foreach-vs-while-vs-for-the-loop-battle.html

    // If $strFirstFound is found within the first 1/SpeedRatio (=0.5) of the array, "foreach" is faster!

    if (! $forLoop) {
        foreach ($arr as $key => $val) {
            if (is_string($val)) {
                $cur = strlen($val);
                // echo("$key\t$cur\n");
                if ($cur == 0) { return $cur; } // 0 is the shortest possible string, so we can abort here.
                if ($cur < $min) { $min = $cur; }
            }
        // else { echo("$key\tNo string!\n"); }
        }
    }

    // If $strFirstFound is found after the first 1/SpeedRatio (=0.5) of the array, "for" is faster!

    else {
        for ($i = $strFirstFound + 1; $i < $arrLength; $i++) {
            if (is_string($arr[$i])) {
                $cur = strlen($arr[$i]);
                // echo("$i\t$cur\n");
                if ($cur == 0) { return $cur; } // 0 is the shortest possible string, so we can abort here.
                if ($cur < $min) { $min = $cur; }
            }
            // else { echo("$i\tNo string!\n"); }
        }
    }

    return $min;
}

function strCommonPrefixByStr($arr, $strFindShortestFirst = false) {
    $arrLength = count($arr);
    if ($arrLength < 2) { return false; }

    // Determine loop length
    /// Find shortest string in array: Can bring down iterations dramatically, but the function arrayStrLenMin() itself can cause ( more or less) iterations.
    if ($strFindShortestFirst) { $end = arrayStrLenMin($arr, true); }
    /// Simply start with length of first string in array: Seems quite clumsy, but may turn out effective, if arrayStrLenMin() needs many iterations.
    else { $end = strlen($arr[0]); }

    for ($i = 1; $i <= $end + 1; $i++) {
        // Grab the part from 0 up to $i
        $commonStrMax = substr($arr[0], 0, $i);
        echo("Match: $i\t$commonStrMax\n");
        // Loop through all the values in array, and compare if they match
        foreach ($arr as $key => $str) {
            echo("  Str: $key\t$str\n");
            // Didn't match, return the part that did match
            if ($commonStrMax != substr($str, 0, $i)) {
                    return substr($commonStrMax, 0, $i-1);
            }
        }
    }
    // Special case: No mismatch (hence no return) happened until loop end!
    return $commonStrMax; // Thus entire first common string is the common prefix!
}

function strCommonPrefixByChar($arr, $strFindShortestFirst = false) {
    $arrLength = count($arr);
    if ($arrLength < 2) { return false; }

    // Determine loop length
    /// Find shortest string in array: Can bring down iterations dramatically, but the function arrayStrLenMin() itself can cause ( more or less) iterations.
    if ($strFindShortestFirst) { $end = arrayStrLenMin($arr, true); }
    /// Simply start with length of first string in array: Seems quite clumsy, but may turn out effective, if arrayStrLenMin() needs many iterations.
    else { $end = strlen($arr[0]); }

    for ($i = 0 ; $i <= $end + 1; $i++) {
        // Grab char $i
        $char = substr($arr[0], $i, 1);
        echo("Match: $i\t"); echo(str_pad($char, $i+1, " ", STR_PAD_LEFT)); echo("\n");
        // Loop through all the values in array, and compare if they match
        foreach ($arr as $key => $str) {
            echo("  Str: $key\t$str\n");
            // Didn't match, return the part that did match
            if ($char != $str[$i]) { // Same functionality as ($char != substr($str, $i, 1)). Same efficiency?
                    return substr($arr[0], 0, $i);
            }
        }
    }
    // Special case: No mismatch (hence no return) happened until loop end!
    return substr($arr[0], 0, $end); // Thus entire first common string is the common prefix!
}


function strCommonPrefixByNeighbour($arr) {
    $arrLength = count($arr);
    if ($arrLength < 2) { return false; }

    /// Get the common string prefix of the first 2 strings
    $strCommonMax = strCommonPrefixByChar(array($arr[0], $arr[1]));
    if ($strCommonMax === false) { return false; }
    if ($strCommonMax == "") { return ""; }
    $strCommonMaxLength = strlen($strCommonMax);

    /// Now start looping from the 3rd string
    echo("-----\n");
    for ($i = 2; ($i < $arrLength) && ($strCommonMaxLength >= 1); $i++ ) {
        echo("  STR: $i\t{$arr[$i]}\n");

        /// Compare the maximum common string with the next neighbour

        /*
        //// Compare by char: Method unsuitable!

        // Iterate from string end to string beginning
        for ($ii = $strCommonMaxLength - 1; $ii >= 0; $ii--) {
            echo("Match: $ii\t"); echo(str_pad($arr[$i][$ii], $ii+1, " ", STR_PAD_LEFT)); echo("\n");
            // If you find the first mismatch from the end, break.
            if ($arr[$i][$ii] != $strCommonMax[$ii]) {
                $strCommonMaxLength = $ii - 1; break;
                // BUT!!! We may falsely assume that the string from the first mismatch until the begining match! This new string neighbour string is completely "unexplored land", there might be differing chars closer to the beginning. This method is not suitable. Better use string comparison than char comparison.
            }
        }
        */

        //// Compare by string

        for ($ii = $strCommonMaxLength; $ii > 0; $ii--) {
            echo("MATCH: $ii\t$strCommonMax\n");
            if (substr($arr[$i],0,$ii) == $strCommonMax) {
                break;
            }
            else {
                $strCommonMax = substr($strCommonMax,0,$ii - 1);
                $strCommonMaxLength--;
            }
        }
    }
    return substr($arr[0], 0, $strCommonMaxLength);
}





// Tests for finding the common prefix

/// Scenarios

$filesLeastInCommon = array (
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/1",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/2",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/1",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/2",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/c/1",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/1",
);

$filesLessInCommon = array (
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/1",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/2",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/1",
"/Vol/1/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/2",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/b/c/1",
"/Vol/2/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/a/1",
);

$filesMoreInCommon = array (
"/Voluuuuuuuuuuuuuumes/1/a/a/1",
"/Voluuuuuuuuuuuuuumes/1/a/a/2",
"/Voluuuuuuuuuuuuuumes/1/a/b/1",
"/Voluuuuuuuuuuuuuumes/1/a/b/2",
"/Voluuuuuuuuuuuuuumes/2/a/b/c/1",
"/Voluuuuuuuuuuuuuumes/2/a/a/1",
);

$sameDir = array (
"/Volumes/1/a/a/",
"/Volumes/1/a/a/aaaaa/2",
);

$sameFile = array (
"/Volumes/1/a/a/1",
"/Volumes/1/a/a/1",
);

$noCommonPrefix = array (
"/Volumes/1/a/a/",
"/Volumes/1/a/a/aaaaa/2",
"Net/1/a/a/aaaaa/2",
);

$longestLast = array (
"/Volumes/1/a/a/1",
"/Volumes/1/a/a/aaaaa/2",
);

$longestFirst = array (
"/Volumes/1/a/a/aaaaa/1",
"/Volumes/1/a/a/2",
);

$one = array ("/Volumes/1/a/a/aaaaa/1");

$empty = array ( );


// Test Results for finding  the common prefix

/*

I tested my functions in many possible scenarios.
The results, the common prefixes, were always correct in all scenarios!
Just try a function call with your individual array!

Considering iteration efficiency, I also performed tests:

I put echo functions into the functions where iterations occur, and measured the number of CLI line output via:
php <script with strCommonPrefixByStr or strCommonPrefixByChar> | egrep "^  Str:" | wc -l   GIVES TOTAL ITERATION SUM.
php <Script with strCommonPrefixByNeighbour> | egrep "^  Str:" | wc -l   PLUS   | egrep "^MATCH:" | wc -l   GIVES TOTAL ITERATION SUM.

My hypothesis was proven:
strCommonPrefixByChar wins in situations where the strings have less in common in their beginning (=prefix).
strCommonPrefixByNeighbour wins where there is more in common in the prefixes.

*/

// Test Results Table
// Used Functions | Iteration amount | Remarks

// $result = (strCommonPrefixByStr($filesLessInCommon)); // 35
// $result = (strCommonPrefixByChar($filesLessInCommon)); // 35 // Same amount of iterations, but much fewer characters compared because ByChar instead of ByString!
// $result = (strCommonPrefixByNeighbour($filesLessInCommon)); // 88 + 42 = 130 // Loses in this category!

// $result = (strCommonPrefixByStr($filesMoreInCommon)); // 137
// $result = (strCommonPrefixByChar($filesMoreInCommon)); // 137 // Same amount of iterations, but much fewer characters compared because ByChar instead of ByString!
// $result = (strCommonPrefixByNeighbour($filesLeastInCommon)); // 12 + 4 = 16 // Far the winner in this category!

echo("Common prefix of all members:\n");
var_dump($result);





// Tests for finding the shortest string in array

/// Arrays

// $empty = array ();
// $noStrings = array (0,1,2,3.0001,4,false,true,77);
// $stringsOnly = array ("one","two","three","four");
// $mixed = array (0,1,2,3.0001,"four",false,true,"seven", 8888);

/// Scenarios

// I list them from fewest to most iterations, which is not necessarily equivalent to slowest to fastest!
// For speed consider the remarks in the code considering the Speed ratio of foreach/for!

//// Fewest iterations (immediate abort on "Found other type", use "for" loop)

// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
//  echo("NEW ANALYSIS:\n");
//  echo("Result: " . arrayStrLenMin($arr, true, true) . "\n\n");
// }

/* Results:

    NEW ANALYSIS:
    Result: Array is empty!

    NEW ANALYSIS:
    Result: Found other type!

    NEW ANALYSIS:
    Key Length / Notification / Error
    0   Found first string member at key with length: 3!
    1   3
    2   5
    3   4
    Result: 3

    NEW ANALYSIS:
    Result: Found other type!

*/

//// Fewer iterations (immediate abort on "Found other type", use "foreach" loop)

// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
//  echo("NEW ANALYSIS:\n");
//  echo("Result: " . arrayStrLenMin($arr, true, false) . "\n\n");
// }

/* Results:

    NEW ANALYSIS:
    Result: Array is empty!

    NEW ANALYSIS:
    Result: Found other type!

    NEW ANALYSIS:
    Key Length / Notification / Error
    0   Found first string member at key with length: 3!
    0   3
    1   3
    2   5
    3   4
    Result: 3

    NEW ANALYSIS:
    Result: Found other type!

*/

//// More iterations (No immediate abort on "Found other type", use "for" loop)

// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
//  echo("NEW ANALYSIS:\n");
//  echo("Result: " . arrayStrLenMin($arr, false, true) . "\n\n");
// }

/* Results:

    NEW ANALYSIS:
    Result: Array is empty!

    NEW ANALYSIS:
    Result: No strings found!

    NEW ANALYSIS:
    Key Length / Notification / Error
    0   Found first string member at key with length: 3!
    1   3
    2   5
    3   4
    Result: 3

    NEW ANALYSIS:
    Key Length / Notification / Error
    4   Found first string member at key with length: 4!
    5   No string!
    6   No string!
    7   5
    8   No string!
    Result: 4

*/


//// Most iterations (No immediate abort on "Found other type", use "foreach" loop)

// foreach( array($empty, $noStrings, $stringsOnly, $mixed) as $arr) {
//  echo("NEW ANALYSIS:\n");
//  echo("Result: " . arrayStrLenMin($arr, false, false) . "\n\n");
// }

/* Results:

    NEW ANALYSIS:
    Result: Array is empty!

    NEW ANALYSIS:
    Result: No strings found!

    NEW ANALYSIS:
    Key Length / Notification / Error
    0   Found first string member at key with length: 3!
    0   3
    1   3
    2   5
    3   4
    Result: 3

    NEW ANALYSIS:
    Key Length / Notification / Error
    4   Found first string member at key with length: 4!
    0   No string!
    1   No string!
    2   No string!
    3   No string!
    4   4
    5   No string!
    6   No string!
    7   5
    8   No string!
    Result: 4

*/

- porg

Gumbo的答案更简单。你能证明这个答案显著更快吗？ - ToolmakerSteve

2

请参考Gustav Bertram的最新答案（目前在下方），该答案通过首先排序字符串列表来避免所有这些问题。这样做的好处是，您可以利用已经存在的优化排序方法。因此，您可以轻松实现一个简单的解决方案，而无需花费精力进行优化！ - ToolmakerSteve

7

也许有一些非常受欢迎的算法可以解决这个问题，但是就我个人而言，如果您知道您的共同点将在左侧（与您的示例相似），您可以通过首先找到前两个字符串的共同点，然后向下迭代其余列表，根据需要修剪共同字符串以实现共同性或者如果修剪到没有内容则终止失败，从而比您发布的方法更好。

- chaos

7

我认为你走在正确的道路上。但是，在所有字符串都通过时递增i，你可以这样做：

1）比较数组中的前两个字符串，并找出它们有多少个共同字符。将共同字符保存在一个名为maxCommon的单独字符串中。

2）将第三个字符串与maxCommon进行比较。如果共同字符的数量较小，则将maxCommon修剪为匹配的字符。

3）重复并清洗数组的其余部分。在此过程结束时，maxCommon将具有所有数组元素共同的字符串。

这将增加一些开销，因为您需要将每个字符串与maxCommon进行比较，但会大大减少您需要获得结果的迭代次数。

- diogoriba

+1 这正是我会做的方式。另一个优点是：一旦你遇到一个以不同首字符开头的字符串，你就可以停止迭代遍历剩余的字符串，因为没有公共前缀。 - Andreas F

3

我假设你所说的“common part”是指“最长公共前缀”。这比任何公共子串都要简单得多。

在最坏情况下，无法完成此操作而不读取(n+1)*m个字符，在最好的情况下，需要读取n*m+1个字符，其中n是最长公共前缀的长度，m是字符串数量。

逐个字母比较可以实现这种效率（Big Theta(n*m)）。

您提出的算法运行时间为Big Theta(n^2 * m)，对于大输入来说要慢得多。

第三种提议的算法是找到前两个字符串的最长前缀，然后将其与第三个、第四个等进行比较，也具有Big Theta(n*m)的运行时间，但常数因子更高。在实践中，它可能只会稍微慢一些。

总的来说，我建议自己编写函数，因为第一个算法太慢了，而另外两个算法的复杂程度大致相同。

请查看WikiPedia了解Big Theta符号的说明。

- Jørgen Fogh

2

以下是JavaScript中优雅的递归实现：

function prefix(strings) {
    switch (strings.length) {

      case 0:
        return "";

      case 1:
        return strings[0];

      case 2:
        // compute the prefix between the two strings
        var a = strings[0],
            b = strings[1],
            n = Math.min(a.length, b.length),
            i = 0;
        while (i < n && a.charAt(i) === b.charAt(i))
            ++i;
        return a.substring(0, i);

      default:
        // return the common prefix of the first string,
        // and the common prefix of the rest of the strings
        return prefix([ strings[0], prefix(strings.slice(1)) ]);
    }
}

- mishoo

1

简短而简洁的版本，也许不是最有效的：

/// Return length of longest common prefix in an array of strings.
function _commonPrefix($array) {
    if(count($array) < 2) {
        if(count($array) == 0)
            return false; // empty array: undefined prefix
        else
            return strlen($array[0]); // 1 element: trivial case
    }
    $len = max(array_map('strlen',$array)); // initial upper limit: max length of all strings.
    $prevval = reset($array);
    while(($newval = next($array)) !== FALSE) {
        for($j = 0 ; $j < $len ; $j += 1)
            if($newval[$j] != $prevval[$j])
                $len = $j;
        $prevval = $newval;
    }
    return $len;
}

// TEST CASE:
$arr = array('/var/yam/yamyam/','/var/yam/bloorg','/var/yar/sdoo');
print_r($arr);
$plen = _commonprefix($arr);
$pstr = substr($arr[0],0,$plen);
echo "Res: $plen\n";
echo "==> ".$pstr."\n";
echo "dir: ".dirname($pstr.'aaaa')."\n";

测试用例的输出：

Array
(
    [0] => /var/yam/yamyam/
    [1] => /var/yam/bloorg
    [2] => /var/yar/sdoo
)
Res: 7
==> /var/ya
dir: /var

- nanite

1

我不知道有没有这样的事情
是的：与其比较从0到长度i的子字符串，你可以直接检查第i个字符（因为你已经知道字符0到i-1是匹配的）。

- redtuna

0


    // Common prefix
    $common = '';

    $sports = array(
    'Softball T - Counties',
    'Softball T - Eastern',
    'Softball T - North Harbour',
    'Softball T - South',
    'Softball T - Western'
    );

    // find mini string
    $minLen = strlen($sports[0]);
    foreach ($sports as $s){
        if($minLen > strlen($s))
            $minLen = strlen($s);
    }


    // flag to break out of inner loop
    $flag = false;

    // The possible common string length does not exceed the minimum string length.
    // The following solution is O(n^2), this can be improve.
    for ($i = 0 ; $i < $minLen; $i++){
        $tmp = $sports[0][$i];

        foreach ($sports as $s){
            if($s[$i] != $tmp)
                $flag = true;
        }
        if($flag)
            break;
        else
            $common .= $sports[0][$i];
    }

    print $common;

- Ibrahim

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Gustav Bertram · Accepted Answer

如果您可以对数组进行排序，则有一个简单而非常快速的解决方案。

只需比较第一项和最后一项。

如果字符串已排序，则所有字符串共同前缀将与排序后的第一个和最后一个字符串相同。

sort($sport);

$s1 = $sport[0];               // First string
$s2 = $sport[count($sport)-1]; // Last string
$len = min(strlen($s1), strlen($s2));

// While we still have string to compare,
// if the indexed character is the same in both strings,
// increment the index. 
for ($i=0; $i<$len && $s1[$i]==$s2[$i]; $i++); 

$prefix = substr($s1, 0, $i);