How do I read the last n lines of a file?

Question

How do I read the last n lines of a file?

3

I need to read the last 4 lines of a file.

I tried the following:

top_tb_comp_file = open('../../ver/sim/top_tb_compile.tcl', 'r+')
top_tb_comp_end = top_tb_comp_file.readlines()[:-4]
top_tb_comp_file.close()

It did not work (top_tb_comp_end ends up holding the first lines of the file).

- sarad

2 Answers

1

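The following example opens a file named names.txt and prints its last 4 lines. To apply it to your example, you only need to take the pattern shown on lines 2, 5, and 7. The rest is simple.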

#! /usr/bin/env python3
import collections


def main():
    with open('names.txt') as file:
        lines = collections.deque(file, 4)
    print(*lines, sep='')


if __name__ == '__main__':
    main()

- Noctis Skytower
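
The same pattern can be wrapped in a small reusable helper. The sketch below is only an illustration of the answer above: the function name tail and its parameters are hypothetical and do not appear in the original answer.

```python
import collections


def tail(path, n=4):
    """Return the last n lines of the text file at path as a list."""
    with open(path) as file:
        # A deque with maxlen=n drops its oldest item whenever a new one
        # is appended, so at most n lines are kept while iterating.
        return list(collections.deque(file, maxlen=n))
```

For the file in the question, this would presumably be called as tail('../../ver/sim/top_tb_compile.tcl', 4). If the file has fewer than n lines, all of its lines are returned.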

Thanks for the tip, but I am not sure it is really worth it. Better indexing seems simpler, and is actually faster:

In [6]: import collections

In [7]: liste = list(range(1000000))

In [8]: %timeit lines = collections.deque(liste, 4)
100 loops, best of 3: 8.95 ms per loop

In [9]: %timeit lines = liste[-4:]
10000000 loops, best of 3: 115 ns per loop

- LucG

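What happens if you run a memory profile on both methods? - Noctis Skytower

I don't know how to do that. Could you do it yourself and share your results? - LucG

That makes two of us who lack the knowledge needed for such a task. I could do some approximation on Windows, but most people would probably be more interested in seeing results on Linux. - Noctis Skytower

I am leaving work now. I will finish it later and share the results as soon as possible. - LucG

Thanks! You may want to test both methods with a large file. On a 1 GB text file with a million lines of text, the memory usage will differ significantly. - Noctis Skytower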

- LucG · Accepted Answer

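Your indexing is wrong. With [:-4], you are asking for the exact opposite of what you want: everything except the last 4 lines.

Try the following: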

top_tb_comp_file = open('../../ver/sim/top_tb_compile.tcl', 'r+')
top_tb_comp_end = top_tb_comp_file.readlines()[-4:]
# note that the '-4' is now before the ':'
top_tb_comp_file.close()
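
To make the difference between the two slices concrete, here is a minimal sketch on a small list. The six sample strings are made up for illustration and stand in for the result of readlines().

```python
# Stand-in for what readlines() would return on a six-line file.
lines = ['line 1\n', 'line 2\n', 'line 3\n',
         'line 4\n', 'line 5\n', 'line 6\n']

# [:-4] means "from the start up to, but not including, the 4th item
# from the end", so it drops the last 4 lines.
everything_but_last_4 = lines[:-4]
print(everything_but_last_4)  # ['line 1\n', 'line 2\n']

# [-4:] means "from the 4th item from the end up to the end",
# which is the last 4 lines.
last_4 = lines[-4:]
print(last_4)  # ['line 3\n', 'line 4\n', 'line 5\n', 'line 6\n']
```

If the list has fewer than 4 items, lines[-4:] simply returns the whole list.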

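Edit

Thanks to @Noctis's help, I ran some benchmarks on the problem, comparing the collections.deque option with the file.readlines option in terms of speed and memory usage.

The collections option suggested by @Noctis appears to be better in both memory usage and speed. In my results, I observed a slight increase in memory usage at the critical line file.readlines()[-4:], which did not happen at the line collections.deque(file, 4). I also repeated the speed test with the file-reading phase included, and the collections option was faster in that case as well.

I had some trouble getting the output of this code to display in SO's rendering, but if you install the memory_profiler and psutil packages, you should be able to see the output for yourself (using a large file).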

import sys
import collections
import time

from memory_profiler import profile


@profile
def coll_func(filename):
    with open(filename) as file:
        lines = collections.deque(file, 4)
    return 0


@profile
def indexing_func(filename):
    with open(filename) as file:
        lines = file.readlines()[-4:]
    return 0


@profile
def witness_func(filename):
    with open(filename) as file:
        pass
    return 0


def square_star(s_toprint, ext="-"):
    def surround(s, ext="+"):
        return ext + s + ext

    hbar = "-" * (len(s_toprint) + 1)
    return (surround(hbar) + "\n"
            + surround(s_toprint, ext='|') + "\n"
            + surround(hbar))

if __name__ == '__main__':

    s_fname = sys.argv[1]
    s_func = sys.argv[2]

    d_func = {
        "1": coll_func,
        "2": indexing_func,
        "3": witness_func
    }

    func = d_func[s_func]

    start = time.time()
    func(s_fname)
    elapsed_time = time.time() - start

    s_toprint = square_star("Elapsed time:\t{}".format(elapsed_time))

    print(s_toprint)

Then just type the following:

python3 -m memory_profiler profile.py "my_file.txt" n

where n is 1, 2, or 3.
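
For readers who would rather not install memory_profiler, a rough comparison can also be sketched with the standard-library tracemalloc module. This is a different technique from the one used above, not the author's benchmark: tracemalloc only tracks memory allocated by Python itself, and the helper names peak_bytes, tail_deque and tail_readlines are hypothetical.

```python
import collections
import tracemalloc


def peak_bytes(func, path):
    """Return the peak bytes allocated by Python while func(path) runs."""
    tracemalloc.start()
    try:
        func(path)
        # get_traced_memory() returns a (current, peak) tuple.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def tail_deque(path, n=4):
    """Last n lines, keeping at most n lines in memory at a time."""
    with open(path) as file:
        return list(collections.deque(file, n))


def tail_readlines(path, n=4):
    """Last n lines, after loading every line of the file into a list."""
    with open(path) as file:
        return file.readlines()[-n:]
```

A possible usage, assuming a large file named my_file.txt as in the command above, is print(peak_bytes(tail_deque, 'my_file.txt'), peak_bytes(tail_readlines, 'my_file.txt')). On a large file the second number should be much higher, since readlines() holds every line at once.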