Python: split a list into sublists based on start and end keyword patterns

Question

10

If I have a list, say:

lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']

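How would I return a list in which the strings marked with the character ! (a leading ! opens the group, a trailing ! closes it) are collected into a sublist?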

lst = ['foo', 'bar', ['test', 'hello', 'world'], 'word']

I've had some difficulty finding a solution. Here is one approach I tried:

def define(lst):
    for index, item in enumerate(lst):
        if item[0] == '!' and lst[index+2][-1] == '!':
            temp = lst[index:index+3]
            del lst[index+1:index+2]
            lst[index] = temp
    return lst

Any help would be greatly appreciated.

- Leo Whitehead

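Do you want your sublist to contain all the elements between the two !s? - Ollie

Can an element both start and end with !, e.g. '!element!'? - Azat Ibrakov

What if there are more opening elements than closing ones? Do we need to check for that? - Azat Ibrakov

There is no need to check that opening and closing markers match in number, and no need for nested sublists. An element cannot both start and end with !, as in '!element!'. - Leo Whitehead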

6 Answers

4

Here is an iterative solution that can handle arbitrarily nested lists:

def nest(lst, sep):
    current_list = []
    nested_lists = [current_list]  # stack of nested lists
    for item in lst:
        if item.startswith(sep):
            if item.endswith(sep):
                item = item[len(sep):-len(sep)]  # strip both separators
                current_list.append([item])
            else:
                # start a new nested list and push it onto the stack
                new_list = []
                current_list.append(new_list)
                current_list = new_list
                nested_lists.append(current_list)
                current_list.append(item[len(sep):])  # strip the separator
        elif item.endswith(sep):
            # finalize the deepest list and go up by one level
            current_list.append(item[:-len(sep)])  # strip the separator
            nested_lists.pop()
            current_list = nested_lists[-1]
        else:
            current_list.append(item)

    return current_list

Test run:

>>> nest(['foo', 'bar', '!test', '!baz!', 'hello', 'world!', 'word'], '!')
['foo', 'bar', ['test', ['baz'], 'hello', 'world'], 'word']

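It works by maintaining a stack of nested lists. Each time a new nested list is created, it is pushed onto the stack. Elements are always appended to the last list on the stack. When an element ending with "!" is found, the topmost list is popped off the stack.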
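To make the stack bookkeeping easier to follow, here is a minimal sketch that records only the nesting depth each item lands at, rather than building the lists. The helper name trace_depth is hypothetical and not part of the answer; it assumes well-formed input, just as nest does.

```python
def trace_depth(lst, sep):
    """Return (item, depth) pairs showing the nesting level of each item.

    Hypothetical illustration helper: it mirrors the push/pop logic of a
    stack-based approach but tracks a depth counter instead of real lists.
    """
    depth = 0
    trace = []
    for item in lst:
        starts = item.startswith(sep)
        ends = item.endswith(sep)
        if starts and ends:
            # Wrapped in its own one-element list: one level deeper, no push.
            trace.append((item, depth + 1))
        elif starts:
            depth += 1  # corresponds to pushing a new list
            trace.append((item, depth))
        elif ends:
            trace.append((item, depth))
            depth -= 1  # corresponds to popping the topmost list
        else:
            trace.append((item, depth))
    return trace


for item, depth in trace_depth(
        ['foo', 'bar', '!test', '!baz!', 'hello', 'world!', 'word'], '!'):
    print(depth, item)
```

The closing item is recorded before the depth drops, which matches the answer's order of appending to the current list first and popping afterwards.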

- Aran-Fey

2

First find the start and end points of the sublist, then slice the list accordingly, and finally remove the ! characters.

def define(lst):
    # First find the start and end indexes
    for index, item in enumerate(lst):
        if item[0] == '!':
            start_index = index
        if item[-1] == "!":
            end_index = index+1

    # Now create the new list
    new_list = lst[:start_index] + [lst[start_index:end_index]] + lst[end_index:]

    # And remove the !s
    new_list[start_index][0] = new_list[start_index][0][1:]
    new_list[start_index][-1] = new_list[start_index][-1][:-1]

    return new_list

- Ollie

1

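This only works for a single nested list. Input such as ['!foo', 'bar!', '!x', 'y!'] or ['!foo', '!x', 'y!', 'bar!'] produces incorrect output. - Aran-Fey

Yes, I assumed there would be only one pair of '!' markers. - Ollie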

2

Here is a fairly simple implementation:

lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']

lst_tmp = [(tuple(el.split()) if ' ' in (el[0], el[-1]) else el.split()) for el in ' '.join(lst).split('!')]
lst = []
for el in lst_tmp:
    if isinstance(el, tuple):
        for word in el:
            lst.append(word)
    else:
        lst.append(el)

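First, we join lst into a single str and then split it on '!'. This yields ['foo bar ', 'test hello world', ' word']. A space at the beginning or end of a chunk now tells us where the nested list belongs: chunks with such a space hold top-level words, while a chunk without one came from between the two '!' marks. Words that should appear on their own are packed into a tuple to distinguish them from a list. All of this produces lst_tmp. The last step is to unpack the tuples into their individual elements, which is what the loop does.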
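The intermediate values of the join-and-split step can be inspected with a short sketch like the one below. The names joined, chunks and kinds are illustrative only and do not appear in the answer; the sketch assumes no chunk is empty, which holds for this input but would not if the list began or ended with a marked word.

```python
lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']

joined = ' '.join(lst)
chunks = joined.split('!')

# A chunk that starts or ends with a space holds top-level words;
# a chunk with no surrounding space came from between the two '!' marks.
kinds = ['top-level' if ' ' in (chunk[0], chunk[-1]) else 'sublist'
         for chunk in chunks]

print(joined)  # foo bar !test hello world! word
print(chunks)  # ['foo bar ', 'test hello world', ' word']
print(kinds)   # ['top-level', 'sublist', 'top-level']
```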

- jmd_dk

This code will not work correctly if any word in the list contains an exclamation mark, such as ['f!o!o']. It also cannot handle arbitrary nesting, e.g. ['!foo', '!x', 'y!', 'bar!']. - Aran-Fey

0

I think you should insert it into the list rather than assign it. You also need to delete everything up to index+3.

def define(lst):
    for index, item in enumerate(lst):
        if item[0] == '!' and lst[index+2][-1] == '!':
            temp = lst[index:index+3]
            del lst[index:index+3]
            lst.insert(index, temp)
    return lst

- atiq1589

Your indentation is broken. When I fix it (well, I'm guessing; I don't know what the intended indentation is), it throws an exception for the input ['!foo', 'bar!']. - Aran-Fey

1

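If you're not answering the question, I don't think you should post it as an answer... Pointing out problems can be done in a comment. - Aran-Fey

Yes, I am answering the question, just not in a dynamic way, because the OP didn't ask for that. I only corrected the OP's algorithm and posted the fixed version. - atiq1589

Then help me improve it. Please give me some suggestions. - atiq1589

Let us continue this discussion in chat. - atiq1589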

-1

Please try the following:

lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']
temp =[]
isFound=False
for str in lst:
    if str.startswith("!"):
        temp.append(str,replace("!",""))
        isFound=True
    elif len(temp) and isFound and not str.endswith("!"):
        temp.append(str)
    elif str.endswith("!"):
        temp.append(str,replace("!",""))
        isFound=False
for item in temp:
    lst.remove(item)
lst.append(temp)

- Rehan Azher

1

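This puts the sublist at the end of the list and doesn't remove the !. - Ollie

Yes, agreed, but it can be refined further to achieve that. The solution provided by @Aran-Fey is better, though. - Rehan Azher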

1

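Your code has a lot of typos (str,replace), and if I pass in ['!foo', 'bar!'], it throws a ValueError. str.replace is also not the right tool, because it replaces every occurrence of the exclamation mark. If the list contains a word like "!fo!o", the result will contain "foo" instead of "fo!o". - Aran-Fey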
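The difference between str.replace and stripping only the leading marker can be checked directly. This is a small illustration of the comment's point, not code from any of the answers; the variable names are made up for the example.

```python
word = '!fo!o'

# str.replace removes every '!' in the string, including the inner one.
stripped_with_replace = word.replace('!', '')

# Slicing after a startswith check removes only the leading marker.
stripped_with_slice = word[1:] if word.startswith('!') else word

print(stripped_with_replace)  # foo
print(stripped_with_slice)    # fo!o
```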

Content provided by Stack Overflow.

- Azat Ibrakov · Accepted Answer

Assuming there are no elements that both start and end with !, such as '!foo!'.

First, we can write helper predicates such as:

def is_starting_element(element):
    return element.startswith('!')


def is_ending_element(element):
    return element.endswith('!')

Then we can write a generator function (because generators are awesome):

def walk(elements):
    elements = iter(elements)  # making iterator from passed iterable
    for element in elements:
        if is_starting_element(element):
            yield [element[1:], *walk(elements)]
        elif is_ending_element(element):
            yield element[:-1]
            return
        else:
            yield element

Tests:

>>> lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']
>>> list(walk(lst))
['foo', 'bar', ['test', 'hello', 'world'], 'word']
>>> lst = ['foo', 'bar', '!test', '!hello', 'world!', 'word!']
>>> list(walk(lst))
['foo', 'bar', ['test', ['hello', 'world'], 'word']]
>>> lst = ['hello!', 'world!']
>>> list(walk(lst))
['hello']

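As the last example shows, if there are more closing elements than opening ones, the surplus closing elements are ignored (this is because we return from the generator). So if lst has an invalid signature (the difference between the number of opening and closing elements is not zero), we may get unpredictable behavior. To avoid this, we can validate the given data before processing it and raise an error if it is invalid.

We can write a validator like this: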

def validate_elements(elements):
    def get_sign(element):
        if is_starting_element(element):
            return 1
        elif is_ending_element(element):
            return -1
        else:
            return 0

    signature = sum(map(get_sign, elements))
    are_elements_valid = signature == 0
    if not are_elements_valid:
        error_message = 'Data is invalid: '
        if signature > 0:
            error_message += ('there are more opening elements '
                              'than closing ones.')
        else:
            error_message += ('there are more closing elements '
                              'than opening ones.')
        raise ValueError(error_message)

Tests:

>>> lst = ['!hello', 'world!']
>>> validate_elements(lst)  # no exception raised, data is valid
>>> lst = ['!hello', '!world']
>>> validate_elements(lst)
...
ValueError: Data is invalid: there are more opening elements than closing ones.
>>> lst = ['hello!', 'world!']
>>> validate_elements(lst)
...
ValueError: Data is invalid: there are more closing elements than opening ones.
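One case a count-based check cannot catch is a closing element that appears before its opening element: ['hello!', '!world'] has a signature of zero, yet walk would stop at 'hello!' and drop the rest. The sketch below is a hypothetical stricter variant (validate_order is not part of the answer) that tracks the running nesting depth instead of a single sum. It treats an element that both starts and ends with ! as neutral, which is an assumption on top of the answer's original premise.

```python
def validate_order(elements):
    """Hypothetical order-aware check: the running depth must never go negative.

    Raises ValueError if a closing element has no earlier opening element,
    or if some opening elements are never closed.
    """
    depth = 0
    for position, element in enumerate(elements):
        starts = element.startswith('!')
        ends = element.endswith('!')
        if starts and not ends:
            depth += 1
        elif ends and not starts:
            depth -= 1
            if depth < 0:
                raise ValueError(
                    'Data is invalid: closing element {!r} at position {} '
                    'has no matching opening element.'.format(element, position))
    if depth > 0:
        raise ValueError(
            'Data is invalid: {} opening element(s) were never closed.'
            .format(depth))


validate_order(['!hello', 'world!'])  # no exception raised, data is valid
try:
    validate_order(['hello!', '!world'])
except ValueError as error:
    print(error)
```

Failing at the first unmatched closing element also lets the error message name the position, which a plain sum cannot do.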

Finally, we can write a function with validation:

def to_sublists(elements):
    validate_elements(elements)
    return list(walk(elements))

Tests:

>>> lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']
>>> to_sublists(lst)
['foo', 'bar', ['test', 'hello', 'world'], 'word']
>>> lst = ['foo', 'bar', '!test', '!hello', 'world!', 'word!']
>>> to_sublists(lst)
['foo', 'bar', ['test', ['hello', 'world'], 'word']]
>>> lst = ['hello!', 'world!']
>>> to_sublists(lst)
...
ValueError: Data is invalid: there are more closing elements than opening ones.

EDIT

If we want to handle elements that both start and end with !, such as '!bar!', we can modify the walk function using itertools.chain, as follows:

from itertools import chain


def walk(elements):
    elements = iter(elements)
    for element in elements:
        if is_starting_element(element):
            yield list(walk(chain([element[1:]], elements)))
        elif is_ending_element(element):
            element = element[:-1]
            yield element
            return
        else:
            yield element

In addition, we need to complete the validation by modifying the get_sign function (the one nested inside validate_elements):

def get_sign(element):
    if is_starting_element(element):
        if is_ending_element(element):
            return 0
        return 1
    if is_ending_element(element):
        return -1
    return 0

Tests:

>>> lst = ['foo', 'bar', '!test', '!baz!', 'hello', 'world!', 'word']
>>> to_sublists(lst)
['foo', 'bar', ['test', ['baz'], 'hello', 'world'], 'word']