Split/parse a PHP string on specific words

4
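I've gone through the PHP manual, Stack Overflow, and several forums, but some of the PHP logic has me stumped. Maybe I'm just tired, but I'd really appreciate any help or direction on this.
I have a PHP string, say: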
 $string = 'cats cat1 cat2 cat3 dogs dog1 dog2 monkey creatures monkey_creature1 monkey_creature2 monkey_creature3';

Ultimately I'd like my final output to look like this, but for now just getting the array would be enough.
 <h2>cats</h2>
 <ul>
     <li>cat1</li>
     <li>cat2</li>
     <li>cat3</li>
 </ul>

 <h2>dogs</h2>
 <ul>
     <li>dog1</li>
     <li>dog2</li>
 </ul>

 <h2>monkey creatures</h2>
 <ul>
     <li>monkey_creature1</li>
     <li>monkey_creature2</li>
     <li>monkey_creature3</li>
 </ul>

There is a catch, though: sometimes the string will be slightly different:

 $string = 'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3';

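Anyway, this is my first question on Stack Overflow. Thanks in advance for all the help!
Edit: I'm working under some constraints and can't change any of the code that runs before the string is produced. I know all the parent elements in advance ('cats', 'dogs', 'lemurs', 'monkey creatures' (with a space)).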

2
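Ooooh. This is a really good question. I think it would start with explode(" ", $string), but that's only the first of several steps. Is it safe to make a superficial assumption, such as the first three letters always being unique to a group? - Ben Roux
'monkey creatures' doesn't relate to its children the way it should. Also, why is your data source formatted like this? - jprofitt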
2
Is it monkey creatures or monkey_creatures in the example string? It matters. - galymzhan
@galymzhan The parent uses 'monkey creatures', not 'monkey_creatures'. - envysea
@envysea So whenever we see the word "creatures", the word before it is part of it? - Sampson
5 Answers

4
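I've designed an answer that works whether or not there are spaces between the "keywords", as long as the first word of a multi-word keyword is not plural :)
Here's the code. Feel free to check it out; you can do beautiful things with text :)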
<?php
$string = 'cats cat1 cat2 cat3 dogs dog1 dog2 monkey creatures monkey_creature1 monkey_creature2 monkey_creature3';

$current_prefix = '';
$potential_prefix_elements = array();

$word_mapping = array();

// split() was removed in PHP 7; explode() handles a plain-string delimiter
foreach(explode(" ", $string) as $substring) {
    if(strlen($current_prefix)) {
        // Check whether the current substring starts with the prefix
        // (strpos, not strrpos: only a match at position 0 counts)
        if(strpos($substring, $current_prefix) === 0)
            $word_mapping[$current_prefix . 's'][] = $substring;
        else
            $current_prefix = '';
    }

    if(!strlen($current_prefix)) {
        if(preg_match("/(?P<new_prefix>.+)s$/", $substring, $matches)) {
            $potential_prefix_elements[] = $matches['new_prefix'];

            // Join the collected words with underscores to form the singular prefix
            $current_prefix = implode("_", $potential_prefix_elements);

            // Initialize an array for the current word mapping
            $word_mapping[$current_prefix . 's'] = array();

            // Clear the potential prefix elements
            $potential_prefix_elements = array();
        } else {
            $potential_prefix_elements[] = $substring;
        }
    }
}

print_r($word_mapping);

Here is the output. I've given it to you as an array so you can easily build the ul / li hierarchy :)
Array
(
    [cats] => Array
        (
            [0] => cat1
            [1] => cat2
            [2] => cat3
        )

    [dogs] => Array
        (
            [0] => dog1
            [1] => dog2
        )

    [monkey_creatures] => Array
        (
            [0] => monkey_creature1
            [1] => monkey_creature2
            [2] => monkey_creature3
        )

)
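
To get from this array to the markup shown in the question, a small rendering loop should be enough. The following is a minimal sketch and not part of the original answer: the function name render_groups is made up for illustration, and it assumes the array has the shape printed above. Underscores in the keys are turned back into spaces so that monkey_creatures is displayed as "monkey creatures".

```php
<?php
// Hypothetical helper (not from the original answer): turns the grouped
// array into the <h2>/<ul>/<li> markup shown in the question.
function render_groups(array $groups)
{
    $html = '';
    foreach ($groups as $parent => $children) {
        // Keys such as "monkey_creatures" are displayed with a space again
        $title = str_replace('_', ' ', $parent);
        $html .= '<h2>' . htmlspecialchars($title) . "</h2>\n<ul>\n";
        foreach ($children as $child) {
            $html .= '    <li>' . htmlspecialchars($child) . "</li>\n";
        }
        $html .= "</ul>\n\n";
    }
    return $html;
}

// Same shape as the print_r() output above
$word_mapping = array(
    'cats' => array('cat1', 'cat2', 'cat3'),
    'dogs' => array('dog1', 'dog2'),
    'monkey_creatures' => array('monkey_creature1', 'monkey_creature2', 'monkey_creature3'),
);

echo render_groups($word_mapping);
```

htmlspecialchars() is not strictly needed for these sample words, but it keeps the output safe if the input string ever contains characters such as < or &.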

1
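By the way, this code also works for your lemur example. It is fully dynamic: it builds the array from the plural leading keyword, then checks each subsequent word for that prefix. As soon as a word fails the prefix comparison, the script starts building a new key. - Bryan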
1
Thanks. Very timely and exactly what I needed! - envysea

2

You may want to use preg_match_all with a regular expression. That way you don't have to write any loops:

$matches = array();
$string = 'cats cat1 cat2 cat3 dogs dog1 dog2 monkey creatures monkey_creature1 monkey_creature2 monkey_creature3';
preg_match_all('/((?:[a-z]+ )*?[a-z]+s) ((?:[a-z_]+[0-9] ?)+)*/i', $string, $matches);

// $matches now contains a multidimensional array with 3 elements; indices
// 1 and 2 contain the animal name and the list of those animals, respectively
$animals = array_combine($matches[1], $matches[2]);
$animals = array_map(function($value) {
    return explode(' ', trim($value));
}, $animals);
print_r($animals);

Output:

Array
(
    [cats] => Array
        (
            [0] => cat1
            [1] => cat2
            [2] => cat3
        )

    [dogs] => Array
        (
            [0] => dog1
            [1] => dog2
        )

    [monkey creatures] => Array
        (
            [0] => monkey_creature1
            [1] => monkey_creature2
            [2] => monkey_creature3
        )

)
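
The same pattern can be wrapped in a function and tried against the second example string from the question. This is only a sketch: parse_groups is a hypothetical name, and it inherits the pattern's assumptions that every parent ends in "s" and every child ends in a single digit.

```php
<?php
// Hypothetical wrapper around the pattern from this answer.
// Assumes parents end in "s" and children end in a single digit.
function parse_groups($string)
{
    preg_match_all('/((?:[a-z]+ )*?[a-z]+s) ((?:[a-z_]+[0-9] ?)+)*/i', $string, $matches);

    $groups = array();
    foreach ($matches[1] as $i => $parent) {
        // Group 2 holds the space-separated children; treat a missing or
        // empty capture as "no children" instead of array('')
        $children = isset($matches[2][$i]) ? trim($matches[2][$i]) : '';
        $groups[$parent] = ($children === '') ? array() : explode(' ', $children);
    }
    return $groups;
}

$string = 'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3';
print_r(parse_groups($string));
```

Children with more than one trailing digit (for example cat10) would not be captured whole by this pattern; changing [0-9] to [0-9]+ is one way to relax that, if the data calls for it.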

1

Using your second example as the string:

<?php

$parents = array('cats', 'dogs', 'monkey creatures', 'lemurs');
$result = array();

$dataString = 'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3';
foreach ($parents as $parent) {
  // Consider group only if it is present in the data string
  if (strpos($dataString, $parent) !== false) {
    $result[$parent] = array();
  }
}
$parts = explode(' ', $dataString);
foreach (array_keys($result) as $group) {
  $normalizedGroup = str_replace(' ', '_', $group);
  foreach ($parts as $part) {
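    // The "?" makes the last character of the group name (the plural "s") optional,
    // so both "cat1" and "cats6" match; \d+ requires a numeric suffix
    if (preg_match("/^$normalizedGroup?\d+$/", $part)) {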
      $result[$group][] = $part;
    }
  }
}
print_r($result);

Output:

Array
(
    [cats] => Array
        (
            [0] => cat1
            [1] => cat2
            [2] => cat3
            [3] => cat4
            [4] => cat5
            [5] => cats6
        )

    [dogs] => Array
        (
            [0] => dogs1
            [1] => dogs2
        )

    [monkey creatures] => Array
        (
            [0] => monkey_creature1
        )

    [lemurs] => Array
        (
            [0] => lemur1
            [1] => lemur2
            [2] => lemur3
        )

)
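
Because the parent name is interpolated straight into the pattern, a parent containing regex metacharacters would change its meaning. A hedged variant of the matching step is sketched below; children_of is a hypothetical helper name, and the naming rule (spaces become underscores, last character optional, numeric suffix) is simply the one the code above already relies on.

```php
<?php
// Hypothetical helper: collects the children of one known parent by name.
// Same rule as above: spaces become underscores, the last character of the
// parent (the plural "s") is optional, and one or more digits must follow.
function children_of($parent, array $parts)
{
    $normalized = str_replace(' ', '_', $parent);
    // preg_quote() keeps characters such as "." or "+" in a parent name literal
    $pattern = '/^' . preg_quote($normalized, '/') . '?\d+$/';
    // preg_grep() preserves the original keys, so renumber from 0
    return array_values(preg_grep($pattern, $parts));
}

$dataString = 'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3';
$parts = explode(' ', $dataString);

print_r(children_of('monkey creatures', $parts));
```

Like the original, this matches children by name rather than by position, so it does not depend on the order of the groups in the string.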

Thanks for taking the time to write this. - envysea

1

Here's my 50 cents

<?php
$parents = array('cats', 'dogs', 'lemurs', 'monkey creatures');

// Convert all spaces to underscores in parents
$cleaned_parents = array();
foreach ($parents as $parent)
{
    $cleaned_parents[] = str_replace(' ', '_', $parent);
}

$input = 'cats cat1 cat2 cat3 dogs dog1 dog2 monkey creatures monkey_creature1 monkey_creature2 monkey_creature3';

// Change all parents to the "cleaned" versions with underscores
$input = str_replace($parents, $cleaned_parents, $input);

// Make an array of all tokens in the input string
$tokens = explode(' ', $input);
$result = array();

// Loop through all the tokens
$currentParent = null; // Keep track of current parent
foreach ($tokens as $token)
{
    // Is this a parent?
    if (in_array($token, $cleaned_parents))
    {
        // Create the parent in the $result array
        $currentParent = $token;
        $result[$currentParent] = array();
    }
    elseif ($currentParent !== null)
    {
        // Add as child to the current parent
        $result[$currentParent][] = $token;
    }
}

print_r($result);

Output:

Array
(
    [cats] => Array
        (
            [0] => cat1
            [1] => cat2
            [2] => cat3
        )

    [dogs] => Array
        (
            [0] => dog1
            [1] => dog2
        )

    [monkey_creatures] => Array
        (
            [0] => monkey_creature1
            [1] => monkey_creature2
            [2] => monkey_creature3
        )

)
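
Note that the keys come out in their "cleaned" form (monkey_creatures). If the original parent names are wanted as keys, one possible follow-up step is to map them back with a lookup table. This is a sketch under that assumption; the variable names $original_name and $restored are introduced here for illustration only.

```php
<?php
$parents = array('cats', 'dogs', 'lemurs', 'monkey creatures');
// str_replace() also accepts an array as its subject and returns an array
$cleaned_parents = str_replace(' ', '_', $parents);

// Lookup table: cleaned name => original name
$original_name = array_combine($cleaned_parents, $parents);

// $result as printed above
$result = array(
    'cats' => array('cat1', 'cat2', 'cat3'),
    'dogs' => array('dog1', 'dog2'),
    'monkey_creatures' => array('monkey_creature1', 'monkey_creature2', 'monkey_creature3'),
);

// Rebuild the array with the original parent names as keys
$restored = array();
foreach ($result as $cleaned => $children) {
    $restored[$original_name[$cleaned]] = $children;
}

print_r($restored);
```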

1

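I figured I couldn't submit the best answer, so I decided to go for the fewest lines of code. (Just kidding, and sorry that the code is very messy.)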

$string = 'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3';
$categories = array( 'cats', 'dogs', 'monkey creatures', 'lemurs' );

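$parts = $groups = array();

// Cut the string at each category; this relies on $categories being listed in the same order as in $string
for( $i=0; $i<count( $categories ); $i++ ) $parts[] = explode( ' ', (string) strstr( $string, $categories[$i] ) );
// Subtracting the next category's tail leaves only the words that belong to the current category
for( $i=0; $i<count( $parts ); $i++ ) $groups[] = ($i<count($parts)-1) ? array_diff( $parts[$i], $parts[$i+1] ) : $parts[$i];
// Keep only words whose last character is numeric; array_filter() preserves the original keys
for( $i=0; $i<count( $groups ); $i++ ) $groups[$i] = array_filter( $groups[$i], function( $word ) { return is_numeric( substr( $word, -1 ) ); } );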

print_r( $groups );

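You may notice that my approach depends on the elements having a numeric suffix. That is a rather arbitrary rule, but it holds for the input we have to work with.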

My output is:

Array
(
    [0] => Array
        (
            [1] => cat1
            [2] => cat2
            [3] => cat3
            [4] => cat4
            [5] => cat5
            [6] => cats6
        )

    [1] => Array
        (
            [1] => dogs1
            [2] => dogs2
        )

    [2] => Array
        (
            [2] => monkey_creature1
        )

    [3] => Array
        (
            [1] => lemur1
            [2] => lemur2
            [3] => lemur3
        )

)
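
The groups above are numbered rather than named, and the inner keys have gaps. If named groups are wanted, one way to tidy this up is sketched below. It assumes $groups is in the same order as $categories, which the code above already depends on; the variable name $named is introduced here for illustration.

```php
<?php
$categories = array('cats', 'dogs', 'monkey creatures', 'lemurs');

// $groups as printed above (note the gaps in the inner keys)
$groups = array(
    array(1 => 'cat1', 2 => 'cat2', 3 => 'cat3', 4 => 'cat4', 5 => 'cat5', 6 => 'cats6'),
    array(1 => 'dogs1', 2 => 'dogs2'),
    array(2 => 'monkey_creature1'),
    array(1 => 'lemur1', 2 => 'lemur2', 3 => 'lemur3'),
);

// array_values() renumbers each inner array from 0;
// array_combine() then uses the category names as the outer keys
$named = array_combine($categories, array_map('array_values', $groups));

print_r($named);
```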
