You can use a set comprehension:
df["elements"] = df["pairs"].apply(
    lambda x: {ww for w in x for ww in w.split("|")}
)
print(df)
Output:
                  pairs         elements
0  [A|B, B|C, C|D, D|F]  {B, C, D, A, F}
1       [A|D, D|F, F|G]     {G, D, F, A}
2            [C|D, D|X]        {X, C, D}
If you want lists instead:
df["elements"] = df["pairs"].apply(
    lambda x: list({ww for w in x for ww in w.split("|")})
)
print(df)
Output:
                  pairs         elements
0  [A|B, B|C, C|D, D|F]  [D, F, A, C, B]
1       [A|D, D|F, F|G]     [G, D, A, F]
2            [C|D, D|X]        [X, D, C]
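Because a set has no defined order, the list order shown above is not guaranteed. If any deterministic order is acceptable and first-appearance order does not matter, one option is to sort the unique elements. This is a minimal sketch on plain lists; `unique_sorted` is a hypothetical helper name, not part of the original answer:

```python
def unique_sorted(pairs):
    """Return the unique elements of "X|Y" strings in sorted order."""
    # Split every pair on "|" and collect the parts in a set to drop duplicates,
    # then sort so the result does not depend on set iteration order.
    return sorted({part for pair in pairs for part in pair.split("|")})


print(unique_sorted(["A|B", "B|C", "C|D", "D|F"]))  # ['A', 'B', 'C', 'D', 'F']
```

It should work with the same pattern as above, e.g. `df["elements"] = df["pairs"].apply(unique_sorted)`.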
EDIT: To preserve order:
def fn(x):
    seen = set()  # elements already emitted
    out = []      # unique elements in first-appearance order
    for v in x:
        for w in v.split("|"):
            if w not in seen:
                seen.add(w)
                out.append(w)
    return out
df["elements"] = df["pairs"].apply(fn)
print(df)
Output:
                  pairs         elements
0  [A|B, B|C, C|D, D|F]  [A, B, C, D, F]
1  [A|D, D|F, F|G, G|D]     [A, D, F, G]
2            [C|D, D|X]        [C, D, X]
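As an alternative to tracking a `seen` set by hand, the same order-preserving deduplication can be sketched with `dict.fromkeys`, assuming Python 3.7+ where dicts keep insertion order. `unique_in_order` is a hypothetical name used only for illustration:

```python
def unique_in_order(pairs):
    """Return the unique elements of "X|Y" strings in first-appearance order."""
    # dict keys are unique and remember insertion order (Python 3.7+),
    # so building a dict from the flattened parts drops later duplicates.
    parts = (part for pair in pairs for part in pair.split("|"))
    return list(dict.fromkeys(parts))


print(unique_in_order(["A|D", "D|F", "F|G", "G|D"]))  # ['A', 'D', 'F', 'G']
```

Like `fn`, it takes one list of pair strings, so it can be passed to `df["pairs"].apply(...)` in the same way.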
EDIT: To keep repeated elements while preserving order (only consecutive duplicates are collapsed):
from itertools import groupby, chain

def fn(x):
    return [v for v, _ in groupby(chain.from_iterable(v.split("|") for v in x))]
df["elements"] = df["pairs"].apply(fn)
print(df)
Output:
                  pairs         elements
0  [A|B, B|C, C|D, D|F]  [A, B, C, D, F]
1  [A|D, D|F, F|G, G|D]  [A, D, F, G, D]
2            [C|D, D|X]        [C, D, X]
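To make the `groupby` behaviour concrete: without a prior sort, `itertools.groupby` only merges runs of equal adjacent items, so an element that reappears later is kept. The sketch below runs the same logic on plain lists; `collapse_adjacent` is a hypothetical name, and the second input is made up for illustration:

```python
from itertools import chain, groupby


def collapse_adjacent(pairs):
    """Flatten "X|Y" strings and collapse runs of equal adjacent elements."""
    # Flatten: ["A|D", "D|F"] -> A, D, D, F
    parts = chain.from_iterable(pair.split("|") for pair in pairs)
    # groupby yields one key per run of consecutive equal items.
    return [key for key, _ in groupby(parts)]


print(collapse_adjacent(["A|D", "D|F", "F|G", "G|D"]))  # ['A', 'D', 'F', 'G', 'D']
print(collapse_adjacent(["A|B", "C|A"]))                # ['A', 'B', 'C', 'A']
```

In the second call the two `A` values are not adjacent after flattening, so both survive.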