Including whole words in a string slice

Question

4

Here is some code I have:

def breakUp(x,chunk_size):
    return [ x[i:i+chunk_size] for i in range(0, len(x), chunk_size) ]

Here is how it behaves:

In [8]: breakUp('This is a cool sentence... How about eating it??? Whats more?? pepper is available all for free!!!',10)

Out[8]: 
['This is a ',
 'cool sente',
 'nce... How',
 ' about eat',
 'ing it??? ',
 'Whats more',
 '?? pepper ',
 'is availab',
 'le all for',
 ' free!!!']

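But as you can see, the word "sentence" is not kept whole in the second element; only "sente" appears...

I know this happens because I told Python to split after every 10 characters... Is there a way to specify that when the 10th character falls inside a word, the whole word is taken before the split is made?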

- tenstar

3

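Sounds like homework, but I would split the line into words (defined as separated by spaces), then keep adding words and spaces until the total length is greater than or equal to chunk_size. - gaqzi

No, I want each chunk to be x characters long. I know that limit is what splits words apart. What I want now is for each chunk to be roughly x characters long; I can shift a few characters to include words that would otherwise not fit. - tenstar

Then decide how much slack you can allow, as a number, add words to the string, and check len to see whether the result is within that slack of chunk_size. - gaqzi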
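The approach gaqzi describes in the comments can be sketched roughly as follows. This is only an illustration of that idea, not code from the thread: the name `break_up_words` is hypothetical, and it assumes that words are separated by whitespace and that a chunk is closed as soon as its length reaches `chunk_size`, so chunks may run over the limit by part of a word.

```python
def break_up_words(text, chunk_size):
    """Greedily group whole words into chunks of roughly chunk_size characters.

    Hypothetical helper: a chunk is closed as soon as its length is
    greater than or equal to chunk_size, so words are never cut.
    """
    chunks = []
    current = ""
    for word in text.split():
        # Join words with a single space; the first word starts the chunk.
        current = word if not current else current + " " + word
        if len(current) >= chunk_size:
            chunks.append(current)
            current = ""
    if current:
        # Whatever is left over forms a final, shorter chunk.
        chunks.append(current)
    return chunks


print(break_up_words('This is a cool sentence', 10))
# ['This is a cool', 'sentence']
```

To bound the overshoot, as the last comment suggests, one could compare `len(current)` against `chunk_size` plus an allowed slack before appending each word.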

1 Answer

- Roman Bodnarchuk · Accepted Answer

Batteries included:

>>> import textwrap
>>> print(textwrap.fill('This is a cool sentence... How about eating it??? Whats more?? pepper is available all for free!!!', 15))
This is a cool
sentence... How
about eating
it??? Whats
more?? pepper
is available
all for free!!!

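This does almost everything you want. The exception is that if you pass 10 as the second argument, it will still split sentence..., because that 11-character word cannot fit into 10 characters. If you want to avoid that, you can customize textwrap with break_long_words=False:

>>> print(textwrap.fill('This is a cool sentence... How about eating it??? Whats more?? pepper is available all for free!!!', 10, break_long_words=False))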
This is a
cool
sentence...
How about
eating
it???
Whats
more??
pepper is
available
all for
free!!!
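Since the question's breakUp returns a list rather than a printed string, a minimal sketch of the same idea using textwrap.wrap (which returns the wrapped lines as a list) might look like this. The name `break_up` is only an illustrative stand-in for the asker's function; note that, unlike plain slicing, textwrap drops the whitespace at the edges of each chunk.

```python
import textwrap


def break_up(text, chunk_size):
    """Split text into chunks of at most chunk_size characters without cutting words.

    Words longer than chunk_size are kept whole on their own line
    because break_long_words is disabled.
    """
    return textwrap.wrap(text, chunk_size, break_long_words=False)


sentence = ('This is a cool sentence... How about eating it??? '
            'Whats more?? pepper is available all for free!!!')
print(break_up(sentence, 10))
# ['This is a', 'cool', 'sentence...', 'How about', 'eating', 'it???',
#  'Whats', 'more??', 'pepper is', 'available', 'all for', 'free!!!']
```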