忽略层次查询中的单个子项（竹子部件）

Question

忽略层次查询中的单个子项（竹子部件）

3

我有一个带有层次数据的表格，如下所示。

create table tst as 
select 1 id, null parent_id from dual union all
select 2 id, 1 parent_id from dual union all
select 3 id, 1 parent_id from dual union all
select 4 id, 2 parent_id from dual union all
select 5 id, 3 parent_id from dual union all
select 6 id, 5 parent_id from dual union all
select 7 id, 6 parent_id from dual union all
select 8 id, 6 parent_id from dual;

使用CONNECT BY语句遍历层次结构非常简单。

我有一个提取的需求，即忽略树形结构中简单（竹子般的）部分。也就是说，如果父节点只有一个子节点，则将两个节点连接起来，并将它们的ID串联起来（该规则递归应用）。

因此，预期的结果是：

       ID  PARENT_ID
---------- ----------
         1            
         2,4        1 
         3,5,6      1 
         7          3,5,6 
         8          3,5,6

更新另一种正确的方法是将连接的节点列表添加到原始ID中并重复使用。

        ID  PARENT_ID NODE_LST 
---------- ---------- ---------
         1            1       
         4          1 2,4     
         6          1 3,5,6   
         7          6 7       
         8          6 8

到目前为止，我已经成功计算了子节点，并构建了完整的路径到子节点计数和ID的根节点...

with child_cnt as (
-- child count per parent
select parent_id, count(*) cnt 
from tst
where parent_id is not NULL
group by parent_id),
tst2 as (
select 
  ID,  child_cnt.cnt,
  tst.parent_id
from tst left outer join child_cnt on tst.parent_id = child_cnt.parent_id),
tst3 as (
SELECT id, parent_id,
  sys_connect_by_path(cnt,',') child_cnt_path,
  sys_connect_by_path(id,',') path
FROM tst2
  START WITH parent_id IS NULL
  CONNECT BY  parent_id  = PRIOR id
)
select * from tst3
;


        ID  PARENT_ID CHILD_CNT_PATH PATH       
---------- ---------- -------------- ------------
         1            ,              ,1           
         2          1 ,,2            ,1,2         
         4          2 ,,2,1          ,1,2,4       
         3          1 ,,2            ,1,3         
         5          3 ,,2,1          ,1,3,5       
         6          5 ,,2,1,1        ,1,3,5,6     
         7          6 ,,2,1,1,2      ,1,3,5,6,7   
         8          6 ,,2,1,1,2      ,1,3,5,6,8

这表明在ID 4和5上需要跳过一级（一个尾随子级数为1），在ID 6上需要跳过两级（路径计数中有两个尾随的一级）。

但我认为应该有一种更简单的方法来解决这个问题。

- Marmite Bomber

2个回答

1

这不是很优雅，但应该可以工作。如果我能找到更好的方法来完成最后一部分，我会进行编辑。祝你好运！

with
     d ( id, parent_id, degree ) as (
       select id, parent_id, count(parent_id) over (partition by parent_id)
       from   tst
     ),
     x ( old_id, new_id ) as (
       select id, ltrim(sys_connect_by_path(id, ','), ',')
       from   d
       where connect_by_isleaf = 1
       start with degree != 1
       connect by parent_id = prior id
       and        degree = 1
     )
select x1.new_id as id, x2.new_id as parent_id
from   x x1 
            inner join tst 
                 on tst.id        = regexp_substr(x1.new_id, '^[^,]+')
            left outer join x x2
                 on tst.parent_id = x2.old_id
;

- user5683823

将度数为1的节点连接起来是个好主意。我会重新考虑后明天接受 :) - Marmite Bomber

我想不出如何避免最后的两个连接。如果我能在竹节中跟踪第一个节点的父节点而不需要连接（在原始表中查找），那就太好了；但是，我所考虑的所有方法都会使首先识别竹节变得更加麻烦。祝你好运！ - user5683823

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Peter M. · Accepted Answer

这个查询将带您到替代解决方案。

虽然可能还有一些进一步的优化或需要修复的错误，但它适用于您的测试案例。

WITH nodes_to_dispose as (
    SELECT min(id) as id,
           parent_id
    FROM tst
    WHERE parent_id is not null
    GROUP BY parent_id
    HAVING count(*) = 1 )
-- This part returns merged bamboo nodes
SELECT nodes_to_dispose.id,
       connect_by_root tst.parent_id as parent_id,
       connect_by_root nodes_to_dispose.parent_id ||
               sys_connect_by_path(nodes_to_dispose.id, ',') as node_lst
FROM nodes_to_dispose, tst
WHERE nodes_to_dispose.parent_id = tst.id (+)
AND connect_by_isleaf = 1
START WITH nodes_to_dispose.parent_id not in (
    SELECT id
    FROM nodes_to_dispose )
CONNECT BY prior nodes_to_dispose.id = nodes_to_dispose.parent_id
UNION
-- This part returns all other nodes in their original form
SELECT id, parent_id, to_char(id) as node_lst
FROM tst
WHERE id not in (
    SELECT parent_id
    FROM nodes_to_dispose
    UNION
    SELECT id
    FROM nodes_to_dispose);