Use mask and str.contains() to act only on the rows that meet the condition, replacing them with the result of .str.split(', ').str[0:2].agg(', '.join):
df['Col'] = df['Col'].mask(df['Col'].str.contains('County, Texas'),
                           df['Col'].str.split(', ').str[0:2].agg(', '.join))
Full code:
import pandas as pd
df = pd.DataFrame({'Col': {0: 'Jack Smith, Bank, Wilber, Lincoln County, Texas',
                           1: 'Jack Smith, Union, Credit, Bank, Wilber, Lincoln County, Texas',
                           2: 'Jack Smith, Union, Credit, Bank, Wilber, Lincoln County, Texas, Branch, Landing, Services',
                           3: 'Jack Smith, Union, Credit, Bank, Wilber, Branch, Landing, Services'}})
df['Col'] = df['Col'].mask(df['Col'].str.contains('County, Texas'),
                           df['Col'].str.split(', ').str[0:2].agg(', '.join))
df
Out[1]:
Col
0 Jack Smith, Bank
1 Jack Smith, Union
2 Jack Smith, Union
3 Jack Smith, Union, Credit, Bank, Wilber, Branc...
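As a variant of the above, the split pieces can also be rejoined with .str.join(', '), which avoids relying on Series.agg applying the callable to each element. This is only a minimal sketch on a subset of the sample data; the names has_county and shortened are introduced here for illustration, and regex=False is an added assumption that the search text should be matched literally.

```python
import pandas as pd

df = pd.DataFrame({'Col': [
    'Jack Smith, Bank, Wilber, Lincoln County, Texas',
    'Jack Smith, Union, Credit, Bank, Wilber, Lincoln County, Texas',
    'Jack Smith, Union, Credit, Bank, Wilber, Branch, Landing, Services',
]})

# Boolean mask: True for rows that mention 'County, Texas' (literal match).
has_county = df['Col'].str.contains('County, Texas', regex=False)

# First two comma-separated fields of every row, rejoined into one string.
shortened = df['Col'].str.split(', ').str[:2].str.join(', ')

# mask() replaces values where the condition is True and keeps the rest.
df['Col'] = df['Col'].mask(has_county, shortened)
print(df['Col'].tolist())
```

Rows without 'County, Texas' pass through unchanged, because mask only substitutes where the condition is True.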
Per the updated question, you can use np.select:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Col': {0: 'Jack Smith, Bank, Wilber, Lincoln County, Texas',
                           1: 'Jack Smith, Bank, Credit, Bank, Wilber, Lincoln County, Texas',
                           2: 'Jack Smith, Bank, Union, Credit, Bank, Wilber, Lincoln County, Texas, Branch, Landing, Services',
                           3: 'Jack Smith, Bank, Credit, Bank, Wilber, Branch, Landing, Services'}})
df['Col'] = np.select([df['Col'].str.contains('County, Texas') & ~df['Col'].str.contains('Union'),
                       df['Col'].str.contains('County, Texas') & df['Col'].str.contains('Union')],
                      [df['Col'].str.split(', ').str[0:2].agg(', '.join),
                       df['Col'].str.split(', ').str[0:3].agg(', '.join)],
                      df['Col'])
df
Out[2]:
Col
0 Jack Smith, Bank
1 Jack Smith, Bank
2 Jack Smith, Bank, Union
3 Jack Smith, Bank, Credit, Bank, Wilber, Branch...
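Since np.select evaluates its conditions in order and takes the first one that is True, the more specific condition can be listed first, which makes the ~ negation unnecessary. The following is a small sketch of that idea on made-up strings, not the data from the question; the names s, parts, has_county, and has_union are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up sample values, shortened for readability.
s = pd.Series([
    'a, b, c, County, Texas',
    'a, Union, c, County, Texas',
    'a, b, c',
])

has_county = s.str.contains('County, Texas', regex=False)
has_union = s.str.contains('Union', regex=False)
parts = s.str.split(', ')

result = np.select(
    # Most specific condition first: the first True condition wins.
    [has_county & has_union, has_county],
    # Keep three fields when 'Union' is present, otherwise two.
    [parts.str[:3].str.join(', '), parts.str[:2].str.join(', ')],
    # Rows matching neither condition keep their original value.
    default=s,
)
print(list(result))
```

np.select returns a NumPy array, so assigning it back to a DataFrame column works as long as the row order is unchanged.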
A regex idea: match ^([^,]*,[^,]*),.*County, Texas.*
and replace it with \1, that is, keep only what group(1) captured. - bobble bubble
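The regex suggestion from the comment could be applied in pandas roughly as follows. This is a sketch that assumes Series.str.replace with regex=True is an acceptable way to run the substitution; it only covers the two-field case, not the 'Union' variant from the updated question.

```python
import pandas as pd

df = pd.DataFrame({'Col': [
    'Jack Smith, Bank, Wilber, Lincoln County, Texas',
    'Jack Smith, Union, Credit, Bank, Wilber, Lincoln County, Texas, Branch, Landing, Services',
    'Jack Smith, Union, Credit, Bank, Wilber, Branch, Landing, Services',
]})

# Group 1 captures the first two comma-separated fields; the rest of the
# pattern only matches when 'County, Texas' appears later in the string.
pattern = r'^([^,]*,[^,]*),.*County, Texas.*'

# Rows that do not match the pattern are returned unchanged.
df['Col'] = df['Col'].str.replace(pattern, r'\1', regex=True)
print(df['Col'].tolist())
```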