When I call readlines() on an .srt file, I get a list of strings with a lot of leading and trailing whitespace, as shown below.
    with open(infile) as f:
        r = f.readlines()
    return r
This is the list I get:
['\xef\xbb\xbf1\r\n', '00:00:00,000 --> 00:00:03,000\r\n', "[D. Evans] Now that you've written your first Python program,\r\n",'\r\n', '2\r\n', '00:00:03,000 --> 00:00:06,000\r\n', 'you might be wondering why we need to invent new languages like Python\r\n', '\r\n']
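I've included only a few elements for brevity. How do I clean up this list so that all the whitespace characters are removed and only the relevant elements remain?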
['1','00:00:00,000 --> 00:00:03,000',"[D. Evans] Now that you've written your first Python program"...]
\xef\xbb\xbf looks like a UTF-8 encoded BOM. - Mark Byers
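Building on the BOM remark, here is a minimal sketch of one way to get the cleaned list. It assumes Python 3 and that the file really is UTF-8; the function name read_clean_lines is made up for illustration. Opening the file with the "utf-8-sig" codec drops a leading BOM if there is one, strip() removes the surrounding whitespace (including the \r\n), and the final filter discards lines that end up empty.

```python
def read_clean_lines(path):
    """Return the non-blank lines of a text file, stripped of surrounding whitespace.

    Hypothetical helper: assumes the file is UTF-8, with or without a BOM.
    """
    # "utf-8-sig" decodes UTF-8 and skips a leading BOM when present.
    with open(path, encoding="utf-8-sig") as f:
        stripped = (line.strip() for line in f)
        # Keep only lines that still have content after stripping.
        return [line for line in stripped if line]
```

Calling read_clean_lines(infile) on the file from the question should give a list that starts with '1' and '00:00:00,000 --> 00:00:03,000', with no BOM and no blank entries. Note that strip() only touches whitespace, so punctuation at the end of a subtitle line is kept.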