How do I split the string 'aaaaaaaaaaaaaaaaaaaaaaa' into a tuple of length-4 strings, for example (aaaa, aaaa, aaaa)?
Use the textwrap.wrap function:
>>> import textwrap
>>> s = 'aaaaaaaaaaaaaaaaaaaaaaa'
>>> textwrap.wrap(s, 4)
['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaa']
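textwrap is very powerful, but for a precise task like this it offers too many options, such as replacing tabs with spaces and fixing sentence endings. I would be more comfortable with something simpler. - Jonathan Hartley
Using a list comprehension or a generator expression: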
>>> s = 'aaaaaaaaaaaaaaaaaaaaaaa'
>>> [s[i:i+4] for i in range(0, len(s), 4)]
['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaa']
>>> tuple(s[i:i+4] for i in range(0, len(s), 4))
('aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaa')
>>> s = 'a bcdefghi j'
>>> tuple(s[i:i+4] for i in range(0, len(s), 4))
('a bc', 'defg', 'hi j')
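The slicing expression above can be wrapped in a small reusable function. This is only a sketch: the name split_fixed and the check for a non-positive size are assumptions added for illustration, not part of the original answer.

```python
def split_fixed(s, n):
    """Return a tuple of consecutive length-n slices of s.

    The last slice is shorter when len(s) is not a multiple of n.
    """
    if n <= 0:
        # range() would fail on a step of 0 and yield nothing for a
        # negative step, so reject both explicitly.
        raise ValueError("n must be a positive integer")
    return tuple(s[i:i + n] for i in range(0, len(s), n))


print(split_fixed('a bcdefghi j', 4))  # ('a bc', 'defg', 'hi j')
```

Slicing never raises on an out-of-range end index, which is why the short final piece needs no special handling.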
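You could use the grouper recipe, zip(*[iter(s)]*4); the original answer links to an explanation of how it works. Note that zip drops an incomplete final group, so the trailing 'aaa' is missing from the output:
In [113]: s = 'aaaaaaaaaaaaaaaaaaaaaaa'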
In [114]: [''.join(item) for item in zip(*[iter(s)]*4)]
Out[114]: ['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa']
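The zip(*[iter(s)]*4) idiom silently drops the trailing 'aaa', because zip stops at the shortest input. If the short tail should be kept, one possible variant swaps in itertools.zip_longest with an empty-string fill value. The helper name grouper_keep_tail is hypothetical.

```python
from itertools import zip_longest


def grouper_keep_tail(s, n):
    """Group the characters of s into strings of length n, keeping a short tail."""
    # n references to the SAME iterator: each output tuple consumes
    # n consecutive characters. zip_longest pads the final tuple with ''.
    groups = zip_longest(*[iter(s)] * n, fillvalue='')
    return [''.join(group) for group in groups]


print(grouper_keep_tail('a' * 23, 4))
# ['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaa']
```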
Note that textwrap.wrap may not split s into strings of length 4, because it breaks lines at whitespace:
In [43]: textwrap.wrap('I am a hat', 4)
Out[43]: ['I am', 'a', 'hat']
The grouper recipe is also faster than textwrap:
In [115]: import textwrap
In [116]: %timeit [''.join(item) for item in zip(*[iter(s)]*4)]
100000 loops, best of 3: 2.41 µs per loop
In [117]: %timeit textwrap.wrap(s, 4)
10000 loops, best of 3: 32.5 µs per loop
The grouper recipe works with any iterable, whereas textwrap only works with strings.
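To illustrate chunking an arbitrary iterable rather than a string, here is a minimal sketch built on itertools.islice. It is a different technique from the zip idiom, and the function name chunked is an assumption made for this example. Unlike the zip version, it keeps a short final chunk.

```python
from itertools import islice


def chunked(iterable, n):
    """Yield tuples of up to n consecutive items from any iterable."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, n))  # take the next n items (or fewer)
        if not chunk:
            return  # iterator exhausted
        yield chunk


print(list(chunked(range(7), 3)))            # [(0, 1, 2), (3, 4, 5), (6,)]
print([''.join(c) for c in chunked('abcde', 2)])  # ['ab', 'cd', 'e']
```

Because it only calls iter() and islice(), this works on generators and files as well as on lists and strings.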
Another solution, using a regular expression:
>>> s = 'aaaaaaaaaaaaaaaaaaaaaaa'
>>> import re
>>> re.findall('[a-z]{4}', s)
['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa']
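The pattern '[a-z]{4}' only matches lowercase letters and drops a final piece shorter than four characters. A hedged sketch of a more general variant follows; the function name split_regex is hypothetical, while re.findall and re.DOTALL are standard library features.

```python
import re


def split_regex(s, n):
    """Split s into pieces of up to n characters using a regular expression."""
    # '.' with re.DOTALL matches any character, including newlines.
    # The greedy quantifier {1,n} takes n characters whenever possible,
    # so only the last piece can be shorter.
    return re.findall('.{1,%d}' % n, s, re.DOTALL)


print(split_regex('ab\ncdefgh', 4))  # ['ab\nc', 'defg', 'h']
```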
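This is easier to understand at a glance than the zip() solution. It can even be changed easily to accept arbitrary characters, including newlines: re.findall('.{4}', s, re.DOTALL), or to accept an incomplete tail: re.findall('.{1,4}', s, re.DOTALL). - blubberdiblub
s = 'abcdefghi'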
# k - length of each part of the string
k = 3
# parts - list that stores the parts of the string
parts = [s[i:i+k] for i in range(0, len(s), k)]
# parts --> ['abc', 'def', 'ghi']
s = 'abcdef'
[s[pos:pos+2] for pos,i in enumerate(list(s)) if pos%2 == 0]
Output:
['ab', 'cd', 'ef']
A simple, easy-to-understand way:
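def wrap(string, max_width):
    i = 0
    strings = []
    s = ""
    for x in string:
        i += 1
        s = s + x
        if i == max_width:
            # The chunk is full: store it and start a new one.
            strings.append(s)
            s = ""
            i = 0
    # Keep a shorter final chunk, but do not append an empty string
    # when len(string) is an exact multiple of max_width.
    if s:
        strings.append(s)
    return strings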
wrap('ABCDEFGHIJKLIMNOQRSTUVWXYZ',4)
# output: ['ABCD', 'EFGH', 'IJKL', 'IMNO', 'QRST', 'UVWX', 'YZ']
Here is another possible solution to the problem:
def split_by_length(text, width):
    width = max(1, width)  # treat any width below 1 as 1
    chunk = ""
    for v in text:
        chunk += v
        if len(chunk) == width:
            yield chunk
            chunk = ""
    if chunk:
        yield chunk  # shorter final chunk

if __name__ == '__main__':
    x = "123456789"
    for i in range(20):
        print(i, list(split_by_length(x, i)))
Output:
0 ['1', '2', '3', '4', '5', '6', '7', '8', '9']
1 ['1', '2', '3', '4', '5', '6', '7', '8', '9']
2 ['12', '34', '56', '78', '9']
3 ['123', '456', '789']
4 ['1234', '5678', '9']
5 ['12345', '6789']
6 ['123456', '789']
7 ['1234567', '89']
8 ['12345678', '9']
9 ['123456789']
10 ['123456789']
11 ['123456789']
12 ['123456789']
13 ['123456789']
14 ['123456789']
15 ['123456789']
16 ['123456789']
17 ['123456789']
18 ['123456789']
19 ['123456789']
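I think this approach is simpler. However, the message length must be a multiple of split_size; otherwise the leftover characters are dropped. Alternatively, you can pad the message with extra characters, for example message = "lorem ipsum_", and remove the padding afterwards.
message = "lorem ipsum"
array = []
temp = ""
split_size = 3
for i in range(1, len(message) + 1):
    temp += message[i - 1]
    if i % split_size == 0:
        array.append(temp)
        temp = ""
print(array)
Output: ['lor', 'em ', 'ips']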
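The padding idea mentioned above can be sketched as follows. The function name split_padded and the pad_char parameter are assumptions made for illustration. The loop follows the same modulo logic as the answer, applied to a padded copy of the message.

```python
def split_padded(message, split_size, pad_char="_"):
    """Split message into split_size pieces by padding, then strip the padding."""
    # Number of characters needed to reach the next multiple of split_size.
    padding = -len(message) % split_size
    padded = message + pad_char * padding

    parts = []
    temp = ""
    for i, ch in enumerate(padded, start=1):
        temp += ch
        if i % split_size == 0:
            parts.append(temp)
            temp = ""

    # Only the last piece can contain padding; cut it off again.
    if padding:
        parts[-1] = parts[-1][:-padding]
    return parts


print(split_padded("lorem ipsum", 3))  # ['lor', 'em ', 'ips', 'um']
```

Removing the padding by count rather than by stripping pad_char keeps a message that legitimately ends in "_" intact.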
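s = 'dasffvvcsadcadscsdsdcsadssdfsdfsdfdfs'
delimiter = 5  # length of each chunk
def reccursive_split(data, delimiter, current_list=None):
    # Create a fresh list for each top-level call. A mutable default ([])
    # would be shared between calls and keep accumulating chunks.
    if current_list is None:
        current_list = []
    if len(data) > delimiter:
        current_list.append(data[:delimiter])
        return reccursive_split(data[delimiter:], delimiter, current_list)
    else:
        current_list.append(data)
        return current_list
print(reccursive_split(s, delimiter))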
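Because the recursive version makes one call per chunk, a long enough input can exceed Python's recursion limit (1000 frames by default in CPython). A minimal iterative sketch of the same logic is shown below; the name iterative_split and the parameter name size are hypothetical.

```python
def iterative_split(data, size):
    """Same splitting logic as the recursive version, written as a loop."""
    parts = []
    while len(data) > size:
        parts.append(data[:size])  # take the next chunk
        data = data[size:]         # continue with the remainder
    parts.append(data)             # final chunk (may be shorter)
    return parts


print(iterative_split('abcdefghijk', 5))       # ['abcde', 'fghij', 'k']
print(len(iterative_split('a' * 10000, 5)))    # 2000
```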