Counting lines, words, and characters in a text file with Python

Question

12

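I'm having some trouble working with a text file in Python and counting certain elements in it. I've been learning Python for a few months and am familiar with the following functions: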

raw_input
open
split
len
print
rsplit()

Here is my code so far:

fname = "feed.txt"
fname = open('feed.txt', 'r')

num_lines = 0
num_words = 0
num_chars = 0

for line in feed:
    lines = line.split('\n')

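At this point I'm not sure what to do next. The most logical approach seems to be to count the lines first, then the words in each line, and finally the characters in each word. One of the problems I've run into is doing all of these counts in a single pass, without reopening the file for each one.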

- Alex Karpowitsch

I think you mean feed = open(...). Also, is there a reason not to use wc? - Brian Donovan

You're right. I'll also look further into how to use "wc"; thanks for the link. - Alex Karpowitsch

5 Answers

3

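Functions that might be useful:

open("file").read(), which reads the entire contents of the file at once
'string'.splitlines(), which splits the string into lines (the line-break characters are dropped; empty lines are kept as empty strings)

With len() and these functions you can accomplish what you're doing.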
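
A minimal sketch of how these pieces might fit together. The helper name count_text and the sample string are made up for illustration; in practice the text would come from something like open("feed.txt").read().

```python
def count_text(text):
    """Return (lines, words, characters) for text already read into memory."""
    lines = text.splitlines()   # line breaks are stripped, empty lines are kept
    words = text.split()        # split on any run of whitespace
    return len(lines), len(words), len(text)


# Hypothetical sample standing in for open("feed.txt").read()
sample = "first line\n\nthird line here\n"
print(count_text(sample))  # (3, 5, 28)
```

Note that len(text) counts every character, including spaces and newlines. As the next answer-style caveat applies here too: read() loads the whole file, so this is probably best kept for files that fit comfortably in memory.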

- kynnysmatto

2

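fname = "feed.txt"
with open(fname, 'r') as feed:
    # read the whole file, then split it into lines
    lines = feed.read().splitlines()

num_lines = len(lines)
num_words = 0
num_chars = 0

for line in lines:
    num_words += len(line.split())
    num_chars += len(line)  # line-break characters are not counted here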

- Stephen Paulger

2

file__IO = input('\nEnter the path of the file to analyze: ')
with open(file__IO, 'r') as f:
    data = f.read()

lines = data.splitlines()
words = data.split()
spaces = data.count(" ")      # number of space characters
chars = len(data) - spaces    # characters excluding spaces (newlines still counted)

print('\n Line number ::', len(lines), '\n Words number ::', len(words), '\n Spaces ::', spaces, '\n Characters ::', chars)

I tried this code and it works as expected.

- Ozzius

1

Here is one approach I like, though it is probably only suitable for small files.

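import re

fileName = "feed.txt"  # assumed: the file name from the question
with open(fileName, 'r') as content_file:
    content = content_file.read()
lineCount = len(content.splitlines())        # no extra count after a trailing newline
words = re.findall(r"\w+", content.lower())  # runs of word characters, lowercased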

There are two ways to count the words. If you don't care about repeated words and just want the total, simply do this:

words_count = len(words)

If you want a count for each individual word, do this:

import collections
words_count = collections.Counter(words) #Count the occurrence of each word
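
To make the difference between the two counts concrete, here is a small sketch. The function name word_frequencies and the sample sentence are hypothetical, and re.findall with r"\w+" is my assumption for extracting words (it avoids the empty strings that splitting on non-word characters can leave at the ends).

```python
import collections
import re


def word_frequencies(text):
    """Count how often each lowercased word occurs in text."""
    words = re.findall(r"\w+", text.lower())
    return collections.Counter(words)


# Hypothetical sample sentence instead of a file's contents
counts = word_frequencies("The cat saw the other cat.")
print(sum(counts.values()))  # total number of words, repeats included
print(len(counts))           # number of distinct words
print(counts["cat"])         # occurrences of one particular word
```

A Counter behaves like a dictionary, so a word that never occurs simply reports a count of 0 rather than raising an error.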

- sirus

- eumiro · Accepted Answer

Try this:

fname = "feed.txt"

num_lines = 0
num_words = 0
num_chars = 0

with open(fname, 'r') as f:
    for line in f:
        words = line.split()

        num_lines += 1
        num_words += len(words)
        num_chars += len(line)

Back to your code:

fname = "feed.txt"
fname = open('feed.txt', 'r')

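What is this supposed to do? fname is first a string and then a file object. You never actually use the string defined on the first line, and each variable should serve a single purpose: either the string or the file object, not both.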

for line in feed:
    lines = line.split('\n')

line is already a single line of the file. Splitting it with split('\n') makes no sense.
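
A quick way to see this for yourself, as a rough sketch: io.StringIO stands in for an open text file here so the example needs no file on disk, and the sample text is made up.

```python
import io

# io.StringIO behaves like an open text file for iteration purposes
fake_file = io.StringIO("one two\nthree\n")

pieces = []
for line in fake_file:
    # each line is already one line of text, with its trailing newline attached
    pieces.append(line.split('\n'))

print(pieces)  # [['one two', ''], ['three', '']]
```

Splitting on '\n' only separates the line's text from an empty string after the newline, whereas line.split() with no argument yields the words, which is what the counting loop needs.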