SQL using GROUP BY and INTERSECT

Question

sql sql-server t-sql group-by

3

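Given these two different sets of item groups, how can I find the set1 groups that span more than one group in set2? In other words, how can I find the groups in set1 that cannot be covered by a single group in set2?

For example, in the tables below, A (1, 2, 5) is the only group that spans both s1 (1, 2, 3) and s2 (2, 3, 4, 5). B and C are not answers because each of them is covered by the single group s2.

I'd prefer to use SQL (SQL Server 2008 R2 is available).

Thanks.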

set1                            set2
 +---------+----------+          +---------+----------+
 | group   |  item    |          | group   |  item    |
 +---------+----------+          +---------+----------+
 |   A     |    1     |          |   s1    |    1     |
 |   A     |    2     |          |   s1    |    2     |
 |   A     |    5     |          |   s1    |    3     |
 |   B     |    4     |          |   s2    |    2     |
 |   B     |    5     |          |   s2    |    3     |
 |   C     |    3     |          |   s2    |    4     |
 |   C     |    5     |          |   s2    |    5     |
 +---------+----------+          +---------+----------+

Please try it with this sqlfiddle: http://sqlfiddle.com/#!6/fac8a/3 or use the script below, which creates temp tables for trying out answers:

create table #set1 (grp varchar(5),item int)
create table #set2 (grp varchar(5),item int)

insert into #set1 select 'a',1 union select 'a',2 union select 'a',5 union select 'b',4 union select 'b',5 union select 'c',3 union select 'c',5
insert into #set2 select 's1',1 union select 's1',2 union select 's1',3 union select 's2',2 union select 's2',3 union select 's2',4 union select 's2',5

select * from #set1
select * from #set2

--drop table #set1
--drop table #set2

- S2L

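So what result do you expect? A? Or... - MrSimpleMind

That's right. B and C can each be satisfied by s2 alone. - S2L

Doesn't C span s1 and s2? After all, 3 is also in s1. - Giorgi Nakeuri

I see what you mean. I should have phrased it as "find the set1 groups that cannot be satisfied by a single group in set2". Thanks. - S2L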

7 Answers

2

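OK. It's not pretty, but it should work. I tried it on the fiddle. I think it could be done with windowing, but I need to think about that some more.

For now, here is the not-so-pretty solution.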

WITH d1 AS (
SELECT set1.grp
     , COUNT(*) cnt
  FROM set1
 GROUP BY set1.grp
), d2 AS (  
SELECT set1.grp grp1
     , set2.grp grp2
     , COUNT(set1.item) cnt
  FROM set1
 INNER JOIN set2
    ON set1.item = set2.item
 GROUP BY set1.grp
     , set2.grp
 )
SELECT grp
  FROM d1
EXCEPT  
SELECT d1.grp 
  FROM d1
 INNER JOIN d2
    ON d2.grp1 = d1.grp
   AND d2.cnt = d1.cnt
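
The windowing idea mentioned above might look something like the following. This is only a sketch, not the query from this answer: it assumes the #set1 and #set2 temp tables from the question's script, assumes each (grp, item) pair appears at most once in each table, and the CTE names set1_sized and covered are made up for illustration.

```sql
-- Sketch only: windowed variant of the count-comparison approach.
-- Assumes #set1/#set2 from the question and no duplicate (grp, item) rows.
WITH set1_sized AS (
    -- Attach the size of each set1 group to every one of its rows.
    SELECT grp
         , item
         , COUNT(*) OVER (PARTITION BY grp) AS total_items
      FROM #set1
), covered AS (
    -- A set1 group is covered when some single set2 group matches all of its items.
    SELECT s.grp
      FROM set1_sized s
     INNER JOIN #set2 s2
        ON s2.item = s.item
     GROUP BY s.grp
         , s2.grp
         , s.total_items
    HAVING COUNT(*) = s.total_items
)
SELECT grp
  FROM #set1
EXCEPT
SELECT grp
  FROM covered;
```

The window function replaces the separate d1 aggregate; whether that reads any better than the two-CTE version is a matter of taste.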

- Robert Co

I don't think that's ugly at all. I like it better than the accepted answer. - paparazzo

0

You can find the solution with the following query:

SELECT A.GROUP AS G1, A.ITEM AS T1, B.GROUP, B.ITEM
FROM SET1 A RIGHT JOIN SET2 B ON A.ITEM=B.ITEM
WHERE A.GROUP IS NULL

- ganesh

That didn't work. Please try it with the sample tables I added to the question. - S2L

0

Can you check this?

SELECT DISTINCT a.Group1, a.Item, b.CNT
FROM SET1 a
INNER JOIN
(SELECT GroupA, COUNT(*) CNT
 FROM
 (
    SELECT DISTINCT a.Group1 GroupA, b.Group1 GroupB
    FROM SET1 a
        INNER JOIN SET2 b ON a.Item = b.Item
 ) a GROUP BY GroupA
) b ON a.Group1 = b.GroupA
 WHERE b.CNT > 1

- ps_prakash02

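This returns A and C, which is incorrect. I've put the sample data into SQL to make it easier to try. - S2L

C has 3 and 5. Of those, 3 is contained in S1 and 5 is contained in S2. - ps_prakash02

Hopefully my edit makes clear what I mean (find the groups in set1 that cannot be covered by a single group in set2). - S2L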

0

Thanks for the comments. I believe the edited query below will work:

Select distinct grp1, initialRows, max(MatchedRows) from 
  (
  select a.grp as grp1, b.grp as grp2
  , count(distinct case when b.item is not null then a.item end) as MatchedRows
  , d.InitialRows
  from set1 a
  left join set2 b
  on a.item = b.item
  left join 
    (select grp, count(distinct Item) as InitialRows from set1
     group by grp) d
  on a.grp = d.grp
  group by a.grp, b.grp, InitialRows
) c
group by grp1, InitialRows
having max(MatchedRows) < InitialRows

- APH

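I tried to fix up your query, but it doesn't give the right answer. Here's the SQLFiddle link: http://sqlfiddle.com/#!6/90511/2 - S2L

Thanks @S2L! With your edit I got the SqlFiddle working, and I think I've made the necessary corrections! This is always harder to do on a weekend without access to a SQL Server instance... - APH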

0

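I think this will do it. The subquery returns, for each set1 group, the set2 groups that match every item in that set1 group, by counting the matching items and comparing that count with the number of items in the set1 group.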

select s.grp from #set1 s
group by s.grp
having not exists ( 
    select s2.grp from #set2 s2 inner join #set1 s1 on s2.item = s1.item
    where s1.grp = s.grp
    group by s2.grp
    having count(s.item) = count(s2.item)   
    )
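
To see the counts being compared, a diagnostic query along these lines may help. It is a sketch that assumes the #set1 and #set2 temp tables from the question's script and no duplicate (grp, item) rows; the column aliases are made up for illustration.

```sql
-- Sketch only: list, for each (set1 group, set2 group) pair sharing an item,
-- how many items match and how many items the set1 group has in total.
SELECT s1.grp AS set1_grp
     , s2.grp AS set2_grp
     , COUNT(*) AS matched_items
     , (SELECT COUNT(*) FROM #set1 t WHERE t.grp = s1.grp) AS set1_items
  FROM #set1 s1
 INNER JOIN #set2 s2
    ON s2.item = s1.item
 GROUP BY s1.grp
     , s2.grp
 ORDER BY s1.grp
     , s2.grp;
```

A set1 group is covered by a single set2 group exactly when one of its rows has matched_items equal to set1_items. With the sample data, group a has 3 items but matches only 2 in each of s1 and s2, so it is the one group left uncovered.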

- dePatinkin

0

Basically the same as Robert Co's.
I did not get this from his answer; I came up with it independently.

    select set1.[group]
      from set1
except
    select set1count.[group]
      from ( select set1.[group], count(*) as [count]
               from set1
              group by set1.[group] -- total items per set1 group
           ) as set1count
      join ( select set1.[group] as [group1], count(*) as [count]
               from set1
               join set2
                 on set2.item = set1.item
              group by set1.[group], set2.[group] -- this is the magic
           ) as set12count
        on set1count.[group] = set12count.[group1] -- note no set2.group match
       and set1count.[count] = set12count.[count] -- all items of the set1 group are in at least one set2 group

- paparazzo

- Giorgi Nakeuri · Accepted Answer

Select the groups from set1 for which no group in set2 contains all of that set1 group's items:

select s1.grp from set1 s1
where not exists(
  select * from set2 s2 where not exists(
    select item from set1 s11 
    where s11.grp = s1.grp 
    except 
    select item from set2 s22
    where s22.grp = s2.grp))
group by s1.grp
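
The same test can probably also be written with nested NOT EXISTS in place of EXCEPT, which is the classic relational-division form. The following is a sketch, not the answer's query: it assumes the #set1 and #set2 temp tables from the question's script, and the derived-table alias g is made up for illustration.

```sql
-- Sketch only: keep a set1 group when there is no set2 group g
-- for which every item of the set1 group is present in g.
SELECT DISTINCT s1.grp
  FROM #set1 s1
 WHERE NOT EXISTS (
         SELECT 1
           FROM (SELECT DISTINCT grp FROM #set2) g
          WHERE NOT EXISTS (
                  -- an item of the set1 group that g is missing
                  SELECT 1
                    FROM #set1 s11
                   WHERE s11.grp = s1.grp
                     AND NOT EXISTS (
                           SELECT 1
                             FROM #set2 s22
                            WHERE s22.grp = g.grp
                              AND s22.item = s11.item)));
```

This variant avoids EXCEPT entirely, so it may be easier to port to engines that lack it.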