Horizontally joining more than two tables in SQL Server

4
Here is the schema.
+---------+---------+
| Employee Table    |
+---------+---------+
| EmpId   | Name    |
+---------+---------+
| 1       | John    |
| 2       | Lisa    |
| 3       | Mike    |
+---------+---------+

+---------+-----------------+
| Family   Table            |
+---------+-----------------+
| EmpId   | Relationship    | 
+---------+-----------------+
| 1       | Father          |
| 1       | Mother          |
| 1       | Wife            |
| 2       | Husband         |
| 2       | Child           |
+---------+-----------------+

+---------+--------+
| Loan Table       |
+---------+--------+
| LoanId  | EmpId  | 
+---------+--------+
| L1      | 1      |
| L2      | 1      |
| L3      | 2      |
| L4      | 2      |
| L5      | 3      |
+---------+--------+
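  • There is a one-to-many relationship between the Employee table and the Family table
  • There is a one-to-many relationship between the Employee table and the Loan table

I tried using a join query, but it returns redundant rows.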
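The query that was tried is not shown. Presumably it was a plain join of both child tables on EmpId, along the lines of the sketch below. The table variables @Employee, @Family and @Loan are stand-ins for the real tables so that the batch runs on its own.

```sql
-- Sample data from the schema above, held in table variables (stand-ins for the real tables).
DECLARE @Employee TABLE (EmpId int, Name varchar(50));
DECLARE @Family   TABLE (EmpId int, Relationship varchar(50));
DECLARE @Loan     TABLE (LoanId varchar(50), EmpId int);

INSERT INTO @Employee VALUES (1, 'John'), (2, 'Lisa'), (3, 'Mike');
INSERT INTO @Family   VALUES (1, 'Father'), (1, 'Mother'), (1, 'Wife'),
                             (2, 'Husband'), (2, 'Child');
INSERT INTO @Loan     VALUES ('L1', 1), ('L2', 1), ('L3', 2), ('L4', 2), ('L5', 3);

-- Joining both child tables on EmpId alone pairs every family row
-- with every loan row of the same employee.
SELECT e.EmpId, e.Name, f.Relationship, l.LoanId
FROM @Employee AS e
LEFT JOIN @Family AS f ON f.EmpId = e.EmpId
LEFT JOIN @Loan   AS l ON l.EmpId = e.EmpId
ORDER BY e.EmpId, f.Relationship, l.LoanId;
```

With this data the query returns 11 rows instead of 6: John gets 3 x 2 = 6 rows, Lisa 2 x 2 = 4, and Mike 1.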

The expected output is:

+---------+---------+--------------+---------+
| EmpId   | Name    | RelationShip | Loan    | 
+---------+---------+--------------+---------+
| 1       | John    | Father       | L1      |
| -       | -       | Mother       | L2      |
| -       | -       | Wife         | -       |
| 2       | Lisa    | Husband      | L3      |
| -       | -       | Child        | L4      |
| 3       | Mike    | -            | L5      |
+---------+---------+--------------+---------+

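Loans belong to employees (1:N), and family relationships belong to employees as well (1:N). How would you attribute loans L2 and L4 to non-employees? That information isn't in the data. - StuartLC
A select has no fixed order, and there is no sort that would produce this. It cannot be done with the existing columns. - paparazzo
3 Answers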

5

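It looks like you are trying to assign loans to rows of the family table sequentially. The way to solve this is to get the right rows first and then assign the loans to those rows.

The right rows (and the first three columns) are: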

select f.EmpId, e.Name, f.Relationship
from family f join
     Employee e
     on f.empid = e.empid;

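Note that this does not put hyphens in the columns for repeated values; it puts the actual values in. Although you can arrange for the hyphens in SQL, it is a bad idea. SQL results are tables, which are unordered sets with a value for every column of every row. Once you start putting in hyphens, you depend on the ordering.
Now the problem is joining in the loans. That is actually quite easy if you use row_number() to add a join key: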
select f.EmpId, e.Name, f.Relationship, l.LoanId
from Employee e left join
     (select f.*, row_number() over (partition by f.EmpId order by (select NULL)) as seqnum
      from family f
     ) f 
     on f.empid = e.empid left join
     (select l.*, row_number() over (partition by l.EmpId order by (select NULL)) as seqnum
      from Loan l
     ) l
     on f.EmpId = l.EmpId and f.seqnum = l.seqnum;

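Note that this does not guarantee the order in which loans are assigned for a particular employee. Your data does not seem to have enough information to make the assignment more consistent.

@VladimirBaranov... It should work now. I started the chain of left joins with the wrong table. - Gordon Linoff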

3
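The following approach makes it easy to "join" more tables into the result set, not just two.
I'll use table variables to illustrate the solution. In real life these would of course be real tables rather than variables, but to keep this sample script easy to run and try, I'll stick with variables.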
declare @TEmployee table (EmpId int, Name varchar(50));
declare @TFamily table (EmpId int, Relationship varchar(50));
declare @TLoan table (EmpId int, LoanId varchar(50));

insert into @TEmployee values (1, 'John');
insert into @TEmployee values (2, 'Lisa');
insert into @TEmployee values (3, 'Mike');

insert into @TFamily values (1, 'Father');
insert into @TFamily values (1, 'Mother');
insert into @TFamily values (1, 'Wife');
insert into @TFamily values (2, 'Husband');
insert into @TFamily values (2, 'Child');

insert into @TLoan values (1, 'L1');
insert into @TLoan values (1, 'L2');
insert into @TLoan values (2, 'L3');
insert into @TLoan values (2, 'L4');
insert into @TLoan values (3, 'L5');

We'll need a table of numbers. An auxiliary table of numbers is a common SQL technique.
In real life you would have a permanent numbers table, but for this example I'll use the following:
declare @TNumbers table (Number int);
insert into @TNumbers values (1);
insert into @TNumbers values (2);
insert into @TNumbers values (3);
insert into @TNumbers values (4);
insert into @TNumbers values (5);
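As an aside, the numbers do not have to be typed in one by one. A minimal sketch, assuming any sufficiently large row source is available: here the catalog view sys.all_objects serves only as something to count over, and @TNumbersSketch is a hypothetical name chosen so that it does not clash with @TNumbers above.

```sql
-- Generate 1..100 with ROW_NUMBER() instead of inserting literal rows.
-- sys.all_objects is used purely as a row source; its contents are irrelevant.
DECLARE @TNumbersSketch TABLE (Number int PRIMARY KEY);

INSERT INTO @TNumbersSketch (Number)
SELECT TOP (100) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;

-- Quick check of what was generated.
SELECT MIN(Number) AS MinNumber, MAX(Number) AS MaxNumber, COUNT(*) AS NumberCount
FROM @TNumbersSketch;
```

The rest of this answer keeps using the five-row @TNumbers table, which is enough for the sample data.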

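The main idea of my approach is to first build a helper table that contains the correct number of rows for each EmpId, and then use this table to fetch the results efficiently.
We start by counting the number of relationships and loans for each EmpId: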
WITH
CTE_Rows
AS
(
    SELECT Relationships.EmpId, COUNT(*) AS EmpRows
    FROM @TFamily AS Relationships
    GROUP BY Relationships.EmpId

    UNION ALL

    SELECT Loans.EmpId, COUNT(*) AS EmpRows
    FROM @TLoan AS Loans
    GROUP BY Loans.EmpId
)

Then we calculate the maximum number of rows for each EmpId:

,CTE_MaxRows
AS
(
    SELECT
        CTE_Rows.empid
        ,MAX(CTE_Rows.EmpRows) AS MaxEmpRows
    FROM CTE_Rows
    GROUP BY CTE_Rows.empid
)

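The CTE above has one row per EmpId: the EmpId itself and the larger of the number of relationships and the number of loans that this EmpId has. Now we need to expand this and generate that many rows for each EmpId. This is where I use the Numbers table: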
,CTE_RowNumbers
AS
(
    SELECT
        CTE_MaxRows.empid
        ,Numbers.Number AS rn
    FROM
        CTE_MaxRows
        CROSS JOIN @TNumbers AS Numbers
    WHERE
        Numbers.Number <= CTE_MaxRows.MaxEmpRows
)

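Then we need to add row numbers to all tables that hold data; we'll use them for joining later. You can order the row numbers by other columns in your tables. For this example there isn't much choice.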

,CTE_Relationships
AS
(
    SELECT
        Relationships.EmpId
        ,ROW_NUMBER() OVER (PARTITION BY Relationships.EmpId ORDER BY Relationships.Relationship) AS rn
        ,Relationships.Relationship
    FROM @TFamily AS Relationships
)
,CTE_Loans
AS
(
    SELECT
        Loans.EmpId
        ,ROW_NUMBER() OVER (PARTITION BY Loans.EmpId ORDER BY Loans.LoanId) AS rn
        ,Loans.LoanId
    FROM @TLoan AS Loans
)

Now we are ready to put it all together. Since CTE_RowNumbers has exactly the number of rows we need, simple LEFT JOINs are enough:
,CTE_Data
AS
(
    SELECT
        CTE_RowNumbers.empid
        ,CTE_Relationships.Relationship
        ,CTE_Loans.LoanId
    FROM
        CTE_RowNumbers
        LEFT JOIN CTE_Relationships ON CTE_Relationships.EmpId = CTE_RowNumbers.EmpId AND CTE_Relationships.rn = CTE_RowNumbers.rn
        LEFT JOIN CTE_Loans ON CTE_Loans.EmpId = CTE_RowNumbers.EmpId AND CTE_Loans.rn = CTE_RowNumbers.rn
)

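We are almost done. The main Employee table can contain EmpIds that have no related data in one of the tables (EmpId = 3 in your sample data has no family rows) or in none of them. To get these EmpIds into the result set, I left join CTE_Data to the main table and replace NULLs with dashes: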
SELECT
    Employees.EmpId
    ,Employees.Name
    ,ISNULL(CTE_Data.Relationship, '-') AS Relationship
    ,ISNULL(CTE_Data.LoanId, '-') AS LoanId
FROM
    @TEmployee AS Employees
    LEFT JOIN CTE_Data ON CTE_Data.EmpId = Employees.EmpId
ORDER BY Employees.EmpId, Relationship, LoanId;

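To get the full script, put the code blocks of this answer together, from the table declarations down to the final SELECT, in the order they appear.

Here is the result set: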

EmpId   Name   Relationship   LoanId
1       John   Father         L1
1       John   Mother         L2
1       John   Wife           -
2       Lisa   Child          L3
2       Lisa   Husband        L4
3       Mike   -              L5
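Child is paired with L3 here only because the row numbers are ordered alphabetically by Relationship. If the rows should keep the order shown in the question (Husband before Child), the table needs a column that records that order. A minimal sketch, assuming a hypothetical SortOrder column that is not part of the question's schema:

```sql
-- SortOrder is a hypothetical column; the question's Family table does not have it.
DECLARE @TFamilyOrdered TABLE (EmpId int, SortOrder int, Relationship varchar(50));

INSERT INTO @TFamilyOrdered VALUES (1, 1, 'Father');
INSERT INTO @TFamilyOrdered VALUES (1, 2, 'Mother');
INSERT INTO @TFamilyOrdered VALUES (1, 3, 'Wife');
INSERT INTO @TFamilyOrdered VALUES (2, 1, 'Husband');
INSERT INTO @TFamilyOrdered VALUES (2, 2, 'Child');

-- Number the rows per employee by the explicit order column
-- instead of by the Relationship text.
SELECT
    Relationships.EmpId
    ,ROW_NUMBER() OVER (PARTITION BY Relationships.EmpId ORDER BY Relationships.SortOrder) AS rn
    ,Relationships.Relationship
FROM @TFamilyOrdered AS Relationships
ORDER BY Relationships.EmpId, rn;
```

For EmpId = 2 this numbers Husband as 1 and Child as 2. The final query would then also have to sort by that row number rather than by Relationship and LoanId.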

0

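Vladimir Baranov has already written a nice solution, but it is rather long (and it has a small problem: you want Husband-L3 and Child-L4, but that solution returns Child-L3 and Husband-L4).

Gordon Linoff wrote a shorter solution, but it does not work correctly.

I can fix Gordon's solution like this: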

SELECT e.EmpId, e.Name, f.Relationship, l.LoanId
FROM @TEmployee e
LEFT JOIN (
    SELECT f.*, ROW_NUMBER() OVER (PARTITION BY f.EmpId ORDER BY (SELECT NULL)) AS seqnum
    FROM @TFamily f
) f ON f.empid = e.empid 
LEFT JOIN (
    SELECT l.*, ROW_NUMBER() OVER (PARTITION BY l.EmpId ORDER BY (SELECT NULL)) AS seqnum
    FROM @TLoan l
) l ON l.EmpId = e.EmpId AND (f.seqnum = l.seqnum OR f.seqnum IS NULL)
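One case this fix does not seem to cover is an employee who has at least one family row but more loans than family rows: the extra loans find no matching seqnum and drop out. A possible variation, offered as a sketch rather than a tested replacement, pairs the two numbered sets with a FULL OUTER JOIN first and attaches the employee afterwards. The CTE names FamilyNumbered, LoanNumbered and Paired are made up for this illustration.

```sql
DECLARE @TEmployee TABLE (EmpId int, Name varchar(50));
DECLARE @TFamily   TABLE (EmpId int, Relationship varchar(50));
DECLARE @TLoan     TABLE (EmpId int, LoanId varchar(50));

INSERT INTO @TEmployee VALUES (1, 'John'), (2, 'Lisa'), (3, 'Mike');
INSERT INTO @TFamily   VALUES (1, 'Father'), (1, 'Mother'), (1, 'Wife'),
                              (2, 'Husband'), (2, 'Child');
INSERT INTO @TLoan     VALUES (1, 'L1'), (1, 'L2'), (2, 'L3'), (2, 'L4'), (3, 'L5');

WITH FamilyNumbered AS (
    SELECT f.EmpId, f.Relationship,
           ROW_NUMBER() OVER (PARTITION BY f.EmpId ORDER BY (SELECT NULL)) AS seqnum
    FROM @TFamily f
),
LoanNumbered AS (
    SELECT l.EmpId, l.LoanId,
           ROW_NUMBER() OVER (PARTITION BY l.EmpId ORDER BY (SELECT NULL)) AS seqnum
    FROM @TLoan l
),
Paired AS (
    -- Keep unmatched rows from both sides, whichever list is longer.
    SELECT COALESCE(f.EmpId, l.EmpId)   AS EmpId,
           COALESCE(f.seqnum, l.seqnum) AS seqnum,
           f.Relationship, l.LoanId
    FROM FamilyNumbered f
    FULL OUTER JOIN LoanNumbered l
        ON l.EmpId = f.EmpId AND l.seqnum = f.seqnum
)
SELECT e.EmpId, e.Name, p.Relationship, p.LoanId
FROM @TEmployee e
LEFT JOIN Paired p ON p.EmpId = e.EmpId
ORDER BY e.EmpId, p.seqnum;
```

On the sample data this returns six rows. Which relationship lands next to which loan is still arbitrary, because the row numbers are ordered by (SELECT NULL).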

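However, I would rather say that the question itself is incorrect, because it asks us to match family members to specific loans arbitrarily (when no real relationship between them exists).

I would rather say that the correct question is the one answered by the following query: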

SELECT e.EmpId, e.Name,
    SUBSTRING((
        SELECT ', '+f.Relationship AS '*'
        FROM @TFamily f
        WHERE f.EmpId=e.EmpId
        FOR XML PATH(''), TYPE
    ).value('.','nvarchar(4000)'),3,4000) AS FamilyMembers,
    SUBSTRING((
        SELECT ', '+l.LoanId AS '*'
        FROM @TLoan l
        WHERE l.EmpId=e.EmpId
        FOR XML PATH(''), TYPE
    ).value('.','nvarchar(4000)'),3,4000) AS Loans
FROM @TEmployee e
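On SQL Server 2017 or later the same lists can probably be built more directly with STRING_AGG. This is a sketch under that version assumption, not part of the original answer; the table variables are declared again so that the batch runs on its own.

```sql
DECLARE @TEmployee TABLE (EmpId int, Name varchar(50));
DECLARE @TFamily   TABLE (EmpId int, Relationship varchar(50));
DECLARE @TLoan     TABLE (EmpId int, LoanId varchar(50));

INSERT INTO @TEmployee VALUES (1, 'John'), (2, 'Lisa'), (3, 'Mike');
INSERT INTO @TFamily   VALUES (1, 'Father'), (1, 'Mother'), (1, 'Wife'),
                              (2, 'Husband'), (2, 'Child');
INSERT INTO @TLoan     VALUES (1, 'L1'), (1, 'L2'), (2, 'L3'), (2, 'L4'), (3, 'L5');

-- One row per employee; each list is aggregated in its own correlated
-- subquery, so family rows and loan rows are never multiplied together.
SELECT e.EmpId, e.Name,
    (SELECT STRING_AGG(f.Relationship, ', ')
     FROM @TFamily f
     WHERE f.EmpId = e.EmpId) AS FamilyMembers,
    (SELECT STRING_AGG(l.LoanId, ', ') WITHIN GROUP (ORDER BY l.LoanId)
     FROM @TLoan l
     WHERE l.EmpId = e.EmpId) AS Loans
FROM @TEmployee e
ORDER BY e.EmpId;
```

Employees without family rows or without loans get NULL in the corresponding column.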
