Finding a bigram in a list of words

Question

3

How can I find a bigram in a list? For example, suppose I want to find where this bigram occurs:

bigram = list(nltk.bigrams("New York"))

in a list of words:

words = nltk.corpus.brown.words(fileids=["ca44"])

I have tried this:

for t in bigram:
    if t in words:
        pass  # do something

and also this:

if bigram in words:
    pass  # do something

- seus

2 Answers

1

You can write a generator that yields the bigrams of your word list:

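from itertools import tee

def pairwise(iterable):
    """Iterate over consecutive pairs of an iterable."""
    # tee() gives two independent iterators, so one-shot iterables work too.
    i, j = tee(iterable)
    next(j, None)  # advance the second iterator by one; safe on empty input
    yield from zip(i, j)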

(For example, list(pairwise(["this", "is", "a", "test"])) returns [('this', 'is'), ('is', 'a'), ('a', 'test')].)

Then compare each pair against the result of .bigrams():

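for position, pair in enumerate(pairwise(words)):
    # Split first: nltk.bigrams("New York") would yield character pairs.
    for bigram in nltk.bigrams("New York".split()):
        if bigram == pair:
            print(position, pair)  # found; position is the index of the first word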
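
One thing worth checking when building the target bigram: a pairing function that receives a plain string iterates over its characters, not its words. Here is a minimal sketch using only the standard library. The helper name `pairs` is an illustrative assumption and is not part of NLTK.

```python
def pairs(seq):
    """Return consecutive pairs of a sequence as a list of tuples."""
    seq = list(seq)
    return list(zip(seq, seq[1:]))

char_pairs = pairs("New York")          # pairs of characters
word_pairs = pairs("New York".split())  # pairs of words

print(char_pairs[:2])  # [('N', 'e'), ('e', 'w')]
print(word_pairs)      # [('New', 'York')]
```

Splitting the phrase into tokens first gives a single word pair that can be compared against pairs drawn from a word list.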

- L3viathan


- Selcuk · Accepted Answer

.bigrams() returns a generator of tuples. You should convert the tuples to strings first. For example:

bigram_strings = [''.join(t) for t in bigram]

Then you can do this:

for t in bigram_strings:
    if t in words:
        pass  # do something
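
If the goal is the position of the bigram, as the question asks, one option is to compare adjacent words directly instead of joining them into strings. This is a minimal sketch, assuming the words are available as an ordinary iterable of strings. `find_bigram` and the sample list are hypothetical; with NLTK, the `words` corpus view from the question could presumably be passed in the same way.

```python
def find_bigram(words, first, second):
    """Return the indices i where words[i] == first and words[i + 1] == second."""
    words = list(words)
    return [
        i
        for i, (a, b) in enumerate(zip(words, words[1:]))
        if (a, b) == (first, second)
    ]

# Hypothetical sample data standing in for a tokenized corpus.
sample = ["I", "love", "New", "York", "and", "New", "Jersey", "not", "New", "York"]
positions = find_bigram(sample, "New", "York")
print(positions)  # [2, 8]
```

Each returned index is the position of the first word of a match, so an empty list means the bigram does not occur.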