Replacing repeated words in Python 3

3
I want to transform a piece of text that looks like this:

Engineering will save the world from inefficiency. Inefficiency is a blight on the world and its humanity.

and return:
Engineering will save the world from inefficiency..is a blight on the . and its humanity.

That is, I want to remove the repeated words and replace them with ".". This is how I started writing my code:
lines= ["Engineering will save the world from inefficiency.",
        "Inefficiency is a blight on the world and its humanity."]

def solve(lines):    
    clean_paragraph = []    
    for line in lines:    
        if line not in str(lines):
            clean_paragraph.append(line)
        print (clean_paragraph)    
        if word == word in line in clean_paragraph:
            word = "."              
     return clean_paragraph

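My logic was to create a list of all the words in the strings and add each word to a new list; if the word is already in the list, replace it with ".". My code returns []. Any suggestions would be much appreciated!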

3 Answers

0

PROBLEM:

if word == word in line in clean_paragraph:

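I'm not sure what you expected from this, but it does not test what you want. Python chains comparison operators (== and in have the same precedence), so the line is evaluated as:

if (word == word) and (word in line) and (line in clean_paragraph):

The first test is trivially true, and the other two only check whether word is a substring of line and whether line is already an element of clean_paragraph. Nothing here looks up an individual word among the words seen so far. As posted, word has also not been assigned a value when this line runs, so Python raises an error there.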
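
To see how Python reads an expression like this, here is a small sketch with made-up values (word, line and clean_paragraph below are illustrative, not the question's data). It compares the chained form with its expanded equivalent and with a fully parenthesized variant:

```python
# Illustrative values only; chosen so that every link in the chain is true.
word = "world"
line = "save the world"
clean_paragraph = ["save the world"]

# Comparison operators chain: a == b in c in d means
# (a == b) and (b in c) and (c in d).
chained = word == word in line in clean_paragraph
expanded = (word == word) and (word in line) and (line in clean_paragraph)

# Forcing a grouping with parentheses gives a different expression:
# it asks whether the boolean (word in line) is an element of the list.
parenthesized = word == ((word in line) in clean_paragraph)

print(chained, expanded, parenthesized)
```

Either way, none of these forms checks a single word against the words already collected.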

FIX

Write the loops you intend to code:

for clean_line in clean_paragraph:
    for word in clean_line:

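At this point, you need to decide what each variable name is meant to represent. You are trying to make a couple of variables stand for two different things at once (line and word; I've fixed the first).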
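
One thing to watch when filling in these loops: iterating over a string directly yields single characters, so the inner loop needs something like split() to get words. A minimal sketch, assuming clean_paragraph holds whole lines (the sample values and the words and chars names are made up for illustration):

```python
# Hypothetical sample data: a list of lines, as in the loop skeleton above.
clean_paragraph = ["save the world", "the world"]

words = []
for clean_line in clean_paragraph:
    # split() with no arguments breaks the line on whitespace.
    for word in clean_line.split():
        words.append(word)

# Without split(), a for loop over a string walks its characters.
chars = list("the")

print(words)
print(chars)
```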

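You also need to learn how to work with a loop and its index properly. Part of the problem is that you wrote more code at once than you can handle (for now). Back up, write one loop at a time, and print the results so that you know what the code is doing. For instance, start with this: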

for line in lines:

    if line not in str(lines):
        print("line", line, "is new: append")
        clean_paragraph.append(line)
    else:
        print("line", line, "is already in *lines*")

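I think you'll find another problem, one that occurs even earlier than the one I pointed out. Fix that first, then add only a line or two of code at a time, building up your program (and your programming knowledge) gradually. When something goes wrong, you'll know the fault is almost certainly in the lines you just added.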
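
The earlier problem shows up if you check the first test on its own. A minimal sketch using the question's data (the names joined and checks are only illustrative): every line is a substring of str(lines), so `line not in str(lines)` is never true and nothing is ever appended.

```python
lines = ["Engineering will save the world from inefficiency.",
         "Inefficiency is a blight on the world and its humanity."]

# str(lines) is the printed form of the whole list, which contains
# the text of every line in it.
joined = str(lines)

# For each line, is it found inside that string?
checks = [line in joined for line in lines]
print(checks)
```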


0
Here is one way to do it. It replaces all repeated words with a dot.
lines_test = ["Engineering will save the world from inefficiency.",
              "Inefficiency is a blight on the world and its humanity."]


def solve(lines):
    clean_paragraph = ""
    # Join the lines and detach each period so it becomes its own token.
    str_lines = " ".join(lines)
    words_lines = str_lines.replace('.', ' .').split()
    for word in words_lines:
        # Note: `in` on strings is a substring test, so "a" also matches
        # inside "save". Both sides are lowercased to ignore case.
        if word != "." and word.lower() in clean_paragraph.lower():
            word = " ."
        elif word != ".":
            word = " " + word
        clean_paragraph += word
    return clean_paragraph


print(solve(lines_test))

Output:

Engineering will save the world from inefficiency. . is . blight on . . and its humanity.

It is important to convert the words or strings to a consistent form (all lowercase or all uppercase) before comparing them.
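
If matching on whole words is preferred over matching substrings, one possible variation keeps the words seen so far in a set. This is only a sketch under that assumption: replace_repeats and its marker parameter are hypothetical names, and it treats leading and trailing punctuation as not being part of the word.

```python
import string


def replace_repeats(lines, marker="."):
    """Replace every whole word that has already appeared with `marker`."""
    seen = set()   # lowercase words encountered so far
    result = []
    for raw in " ".join(lines).split():
        # Strip surrounding punctuation so "inefficiency." and
        # "Inefficiency" share the same key.
        key = raw.strip(string.punctuation).lower()
        if key and key in seen:
            result.append(marker)
        else:
            seen.add(key)
            result.append(raw)
    return " ".join(result)


lines_test = ["Engineering will save the world from inefficiency.",
              "Inefficiency is a blight on the world and its humanity."]
print(replace_repeats(lines_test))
```

With the sample input this prints "Engineering will save the world from inefficiency. . is a blight on . . and its humanity." The word "a" is kept because it is compared as a whole word rather than searched for inside "save".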


0

Another approach could be:

lines_test = 'Engineering will save the world from inefficiency. Inefficiency is a blight on the world and its humanity.'

text_array = lines_test.split(" ")
formatted_text = ''
for word in text_array:
    # Lowercase both sides so the check is case-insensitive.
    if word.lower() not in formatted_text.lower():
        formatted_text = formatted_text + ' ' + word
    else:
        formatted_text = formatted_text + ' ' + '.'

print(formatted_text)  

Output:

Engineering will save the world from inefficiency. . is . blight on . . and its humanity.
