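There is a way to find the duplicates in this case, but removing them becomes a problem if the string for a single ID contains more than one duplicated name. Here is code that handles one duplicate per ID.
Sample data: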
WITH
tbl AS
(
Select 1 "ID", 'a' "NAME_1", 'b' "NAME_2", 'c' "NAME_3" From Dual Union All
Select 1 "ID", 'c' "NAME_1", 'd' "NAME_2", 'a' "NAME_3" From Dual Union All
Select 2 "ID", 'd' "NAME_1", 'e' "NAME_2", 'a' "NAME_3" From Dual Union All
Select 2 "ID", 'c' "NAME_1", 'd' "NAME_2", 'b' "NAME_3" From Dual
),
lists AS
(
Select 1 "ID", 'a,c,b,d,c' "NAME" From Dual Union All
Select 2 "ID", 'd,c,e,a,b' "NAME" From Dual
),
Create a CTE that compares your LISTAGG string with the original data to find the duplicated values:
grid AS
(
Select DISTINCT l.ID, l.NAME,
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_1 || ',', '')) ) / Length(t.NAME_1 || ',') > 1 THEN NAME_1 END "NAME_1",
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_2 || ',', '')) ) / Length(t.NAME_2 || ',') > 1 THEN NAME_2 END "NAME_2",
CASE WHEN ( Length(l.NAME || ',') - Length(Replace(l.NAME || ',', t.NAME_3 || ',', '')) ) / Length(t.NAME_3 || ',') > 1 THEN NAME_3 END "NAME_3"
From
lists l
Inner Join
tbl t ON(t.ID = l.ID)
)
ID  NAME       NAME_1  NAME_2  NAME_3
--  ---------  ------  ------  ------
 2  d,c,e,a,b
 1  a,c,b,d,c                  c
 1  a,c,b,d,c  c
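The CASE expressions in grid rely on a common counting idiom: append a comma to both the list and the name, remove every occurrence of the name from the list, and divide the drop in length by the length of the name. A minimal standalone sketch on literal values (the column aliases here are made up for illustration):

```sql
-- Count how often 'c,' and 'a,' occur in 'a,c,b,d,c,'
Select 'c' "NAME_TO_COUNT",
       ( Length('a,c,b,d,c' || ',') - Length(Replace('a,c,b,d,c' || ',', 'c' || ',', '')) ) / Length('c' || ',') "OCCURRENCES"
From Dual
Union All
Select 'a' "NAME_TO_COUNT",
       ( Length('a,c,b,d,c' || ',') - Length(Replace('a,c,b,d,c' || ',', 'a' || ',', '')) ) / Length('a' || ',') "OCCURRENCES"
From Dual
-- 'c' gives (10 - 6) / 2 = 2, 'a' gives (10 - 8) / 2 = 1
```

Note that Oracle treats an empty string as NULL, so if removing the name empties the whole list (for example a list that is just 'c,c'), the count comes out as NULL rather than a number.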
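The main SQL combines three statements with UNION ALL. Each one builds a new string with the second occurrence removed, and after comparing it with the old string, the new one is used in its place.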
SELECT DISTINCT l.ID, Nvl(g.NAME, l.NAME) NAME
FROM
lists l
LEFT JOIN
(
-- each branch cuts out the 2nd occurrence of one NAME_n, collapses the doubled comma
-- and trims the trailing comma that is left when the duplicate was the last element
SELECT ID, CASE WHEN NAME_1 Is Not Null
THEN RTRIM( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_1, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_1, 1, 2) + Length(NAME_1)), ',,', ','), ',' )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
UNION ALL
SELECT ID, CASE WHEN NAME_2 Is Not Null
THEN RTRIM( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_2, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_2, 1, 2) + Length(NAME_2)), ',,', ','), ',' )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
UNION ALL
SELECT ID, CASE WHEN NAME_3 Is Not Null
THEN RTRIM( REPLACE( SubStr(NAME, 1, InStr(NAME, NAME_3, 1, 2) - 1) || SubStr(NAME, InStr(NAME, NAME_3, 1, 2) + Length(NAME_3)), ',,', ','), ',' )
END "NAME"
FROM grid
WHERE COALESCE(NAME_1, NAME_2, NAME_3) IS NOT NULL
) g ON(g.ID = l.ID And Length(g.NAME) < Length(l.NAME))
Result:
ID  NAME
--  ---------
 2  d,c,e,a,b
 1  a,c,b,d
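To see what the removal step does on its own, here is a small sketch on two literal lists, one with the duplicate in the middle and one with it at the end. The aliases POS, HEAD, TAIL and CLEANED are only illustrative. InStr(..., 1, 2) returns the position of the second occurrence; the text before and after it is glued back together, the doubled comma is collapsed, and an RTRIM is one way to drop the comma left over when the duplicate was the last element.

```sql
Select s.NAME,
       InStr(s.NAME, 'c', 1, 2) "POS",                                   -- position of the 2nd 'c'
       SubStr(s.NAME, 1, InStr(s.NAME, 'c', 1, 2) - 1) "HEAD",           -- everything before it
       SubStr(s.NAME, InStr(s.NAME, 'c', 1, 2) + Length('c')) "TAIL",    -- everything after it
       RTrim( Replace( SubStr(s.NAME, 1, InStr(s.NAME, 'c', 1, 2) - 1) ||
                       SubStr(s.NAME, InStr(s.NAME, 'c', 1, 2) + Length('c')), ',,', ','), ',' ) "CLEANED"
From ( Select 'a,c,b,c,d' "NAME" From Dual Union All
       Select 'a,c,b,d,c' "NAME" From Dual ) s
-- 'a,c,b,c,d': POS 7, HEAD 'a,c,b,',   TAIL ',d',  CLEANED 'a,c,b,d'
-- 'a,c,b,d,c': POS 9, HEAD 'a,c,b,d,', TAIL NULL,  CLEANED 'a,c,b,d'
```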
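For more than two occurrences of a name in the string, or for several different duplicated names, some kind of recursion or nesting would be needed to get it done...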
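One possible alternative to recursion, offered only as a sketch and not as part of the solution above: split each list into one row per element, keep the first position of every distinct element, and aggregate again with LISTAGG. This assumes Oracle 12c or later (for CROSS APPLY) and a list where every element is non-empty; the CTE name items and the aliases POS, ITEM and FIRST_POS are made up for illustration, and the sample list for ID 1 is deliberately extended to contain several repeats.

```sql
WITH
lists AS
(
Select 1 "ID", 'a,c,b,d,c,a,c' "NAME" From Dual Union All
Select 2 "ID", 'd,c,e,a,b' "NAME" From Dual
),
items AS
(   -- one row per list element, together with its position in the string
Select l.ID, x.POS, REGEXP_SUBSTR(l.NAME, '[^,]+', 1, x.POS) "ITEM"
From lists l
Cross Apply ( Select LEVEL "POS" From Dual Connect By LEVEL <= REGEXP_COUNT(l.NAME, '[^,]+') ) x
)
-- keep each element once per ID, ordered by where it first appeared
Select ID, LISTAGG(ITEM, ',') WITHIN GROUP (Order By FIRST_POS) "NAME"
From ( Select ID, ITEM, Min(POS) "FIRST_POS" From items Group By ID, ITEM )
Group By ID
Order By ID
```

With the sample rows above this should give 'a,c,b,d' for ID 1 and 'd,c,e,a,b' for ID 2. Because whole elements are compared rather than substrings, a name such as 'a' is not confused with a longer name that merely ends in 'a'.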