Read a line from a file without getting the "\n" at the end.

20

My file is named "xml.txt" and contains the following:

books.xml 
news.xml
mix.xml

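If I use the readline() function, every file name comes back with "\n" at the end. That causes an error, because I want to open the files listed in xml.txt. I wrote this: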

fo = open("xml.txt","r")
for i in range(count.__len__()): # here count is one of many arrays that I'm using
    file = fo.readline()
    find_root(file) # here find_root is my own function, not shown here

Running this code produces the error:

IOError: [Errno 2] No such file or directory: 'books.xml\n'

4
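Don't use count.__len__(); use len(count) instead. - Ferdinand Beyer
Although the question specifically mentions the '\n' character, the more general problem is reading a line from a file without its line ending. Almost none of the answers address that (Daniel F.'s answer seems to touch on it). - brianmearns
8 Answers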

42

To remove just the newline at the end:

line = line.rstrip('\n')

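readline keeps the newline character so that you can tell an empty line (which still has the newline) apart from the end of the file (an empty string).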
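
A small sketch of that distinction, using an in-memory io.StringIO as a stand-in for a real file (the sample text is made up for illustration):

```python
import io

# In-memory stand-in for a file: a name, a blank line,
# and a last line that has no trailing newline.
fo = io.StringIO("books.xml\n\nnews.xml")

results = []
while True:
    line = fo.readline()
    if line == "":  # empty string means end of file
        break
    # A blank line arrives as "\n" and becomes "" only after stripping.
    results.append(line.rstrip("\n"))

print(results)
```

Because the end-of-file check happens before stripping, the blank line in the middle does not end the loop early.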


19

7
You can use the string object's .rstrip() method to get a copy with the trailing whitespace (including the newline) removed.
For example:
find_root(file.rstrip())

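Can you tell me the syntax? I mean, where and how do I add it? - POOJA GUPTA
This solution removes all trailing whitespace, not just the newline. If the line read is 'foo \n', .rstrip() returns 'foo', whereas what the question asks for is 'foo ' - Susam Pal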
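
To make the difference in that comment concrete, here is a minimal comparison on the same sample string (the variable names are only for illustration):

```python
line = "foo \n"

# No argument: strips every trailing whitespace character, space included.
stripped_all = line.rstrip()

# Explicit "\n": strips only trailing newline characters, the space survives.
stripped_newline = line.rstrip("\n")

print(repr(stripped_all))
print(repr(stripped_newline))
```

Which one is right depends on whether trailing spaces are meaningful in the file names being read.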

3

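Out of curiosity, I timed the different approaches. Below are the results for a very large file.

tl;dr: for large files, reading the whole file and then splitting appears to be the fastest approach.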

with open(FILENAME, "r") as file:
    lines = file.read().split("\n")

However, if you need to loop over the lines anyway, you probably want:

with open(FILENAME, "r") as file:
    for line in file:
        line = line.rstrip("\n")

Python 3.4.2

import timeit


FILENAME = "mylargefile.csv"
DELIMITER = "\n"


def splitlines_read():
    """Read the file then split the lines from the splitlines builtin method.

    Returns:
        lines (list): List of file lines.
    """
    with open(FILENAME, "r") as file:
        lines = file.read().splitlines()
    return lines
# end splitlines_read

def split_read():
    """Read the file then split the lines.

    This method will return empty strings for blank lines (Same as the other methods).
    This method may also have an extra additional element as an empty string (compared to
    splitlines_read).

    Returns:
        lines (list): List of file lines.
    """
    with open(FILENAME, "r") as file:
        lines = file.read().split(DELIMITER)
    return lines
# end split_read

def strip_read():
    """Loop through the file and create a new list of lines and removes any "\n" by rstrip

    Returns:
        lines (list): List of file lines.
    """
    with open(FILENAME, "r") as file:
        lines = [line.rstrip(DELIMITER) for line in file]
    return lines
# end strip_read

def strip_readlines():
    """Loop through the file's read lines and create a new list of lines and removes any "\n" by
    rstrip. ... will probably be slower than the strip_read, but might as well test everything.

    Returns:
        lines (list): List of file lines.
    """
    with open(FILENAME, "r") as file:
        lines = [line.rstrip(DELIMITER) for line in file.readlines()]
    return lines
# end strip_readlines

def compare_times():
    run = 100
    splitlines_t = timeit.timeit(splitlines_read, number=run)
    print("Splitlines Read:", splitlines_t)

    split_t = timeit.timeit(split_read, number=run)
    print("Split Read:", split_t)

    strip_t = timeit.timeit(strip_read, number=run)
    print("Strip Read:", strip_t)

    striplines_t = timeit.timeit(strip_readlines, number=run)
    print("Strip Readlines:", striplines_t)
# end compare_times

def compare_values():
    """Compare the values of the file.

    Note: split_read fails because it has an extra empty string at the end of the list of
    lines. That's the only reason why it fails.
    """
    splr = splitlines_read()
    sprl = split_read()
    strr = strip_read()
    strl = strip_readlines()

    print("splitlines_read")
    print(repr(splr[:10]))

    print("split_read", splr == sprl)
    print(repr(sprl[:10]))

    print("strip_read", splr == strr)
    print(repr(strr[:10]))

    print("strip_readlines", splr == strl)
    print(repr(strl[:10]))
# end compare_values

if __name__ == "__main__":
    compare_values()
    compare_times()

Results:

run = 1000
Splitlines Read: 201.02846901328783
Split Read: 137.51448011841822
Strip Read: 156.18040391519133
Strip Readline: 172.12281272950372

run = 100
Splitlines Read: 19.956802833188124
Split Read: 13.657361738959867
Strip Read: 15.731161020969516
Strip Readlines: 17.434831199281092

run = 100
Splitlines Read: 20.01516321280158
Split Read: 13.786344555543899
Strip Read: 16.02410587620824
Strip Readlines: 17.09326775703279

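In these runs, reading the file and then splitting was the fastest (about 13.7 s per 100 runs, against roughly 15.7 to 20.0 s for the other methods).

Note: read followed by split("\n") leaves an extra empty string at the end of the list when the file ends with a newline.

Note: read followed by splitlines() checks for more line endings than just "\n", for example "\r\n".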
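
The two notes can be checked on a short string. This is only a sketch; the sample text is invented, and it is a plain string rather than file contents because a file opened in text mode with the default newline setting already translates "\r\n" to "\n" on read:

```python
text = "a\r\nb\n"

# split("\n") cuts only on "\n": the "\r" stays attached to the first item,
# and the final newline produces a trailing empty string.
by_split = text.split("\n")

# splitlines() recognises "\r\n" as one line boundary and adds no
# trailing empty element.
by_splitlines = text.splitlines()

print(by_split)
print(by_splitlines)
```

So the faster split("\n") result may need its last element dropped before it matches what splitlines() returns.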


1
A use case based on @Lars Wirzenius's answer:
with open("list.txt", "r") as myfile:
    for line in myfile:
        line = line.rstrip('\n')    # the trick
        try:
            with open(line) as listed_file:
                print("ok")
        except IOError:
            print("file does not exist")

1
To drop the trailing newline, you can also use slicing:
for line in file:
   print(line[:-1])
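
One caveat with slicing: it removes the last character whatever it is, so a final line without a newline loses a real character. A minimal sketch of the pitfall and a guarded variant, using io.StringIO in place of a file (the sample content is made up):

```python
import io

# The last line has no trailing newline.
fo = io.StringIO("books.xml\nnews.xml")

# Unconditional slice: the final "l" of news.xml is cut off.
sliced = [line[:-1] for line in fo]

fo.seek(0)  # rewind to read the same content again

# Guarded slice: only drop the last character when it really is "\n".
safe = [line[:-1] if line.endswith("\n") else line for line in fo]

print(sliced)
print(safe)
```

line.rstrip("\n") avoids the same problem without the explicit check.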

1

It's better practice to use a context manager for the file, and to use len() instead of calling .__len__()

with open("xml.txt","r") as fo:
    for i in range(len(count)): # here count is one of many arrays that I'm using
        file = next(fo).rstrip("\n")
        find_root(file) # here find_root is my own function, not shown here

1
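You forgot to mention that good Python style also includes not shadowing built-ins with your own variable names, like file... - martineau
@martineau, yes, I let that one slide since it's deprecated. - John La Rooy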

0
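# mode: 'r', 'w', 'a'
f = open("ur_filename", "r")
fn = open("out_filename", "w")  # assumed output file; fn was not defined in the original snippet
for t in f:
    if t:
        fn.write(t.rstrip("\n"))
fn.close()
f.close()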

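The "if" condition checks whether the line contains a string; if it does, the next line strips the trailing "\n" and writes the result to the file. Code tested. ;)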

