高效地将行转换为列在SQL Server中。

489

我正在寻找一种在SQL Server中将行转换为列的高效方法,我听说PIVOT不是很快,而且我需要处理大量记录。

这是我的示例:

Id Value ColumnName
1 John FirstName
2 2.4 Amount
3 ZH1E4A PostalCode
4 Fork LastName
5 857685 AccountNumber

这是我的结果:

FirstName Amount PostalCode LastName AccountNumber
John 2.4 ZH1E4A Fork 857685
如何构建结果?
5个回答

734

有几种方法可以将多行数据转换成列。

使用PIVOT

在SQL Server中,您可以使用PIVOT函数将数据从行转换为列:

select Firstname, Amount, PostalCode, LastName, AccountNumber
from
(
  select value, columnname
  from yourtable
) d
pivot
(
  max(value)
  for columnname in (Firstname, Amount, PostalCode, LastName, AccountNumber)
) piv;

请查看演示

具有未知数量的columnnames的透视表

如果您有一个未知数量的要转置的columnnames,则可以使用动态SQL:

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT ',' + QUOTENAME(ColumnName) 
                    from yourtable
                    group by ColumnName, id
                    order by id
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = N'SELECT ' + @cols + N' from 
             (
                select value, ColumnName
                from yourtable
            ) x
            pivot 
            (
                max(value)
                for ColumnName in (' + @cols + N')
            ) p '

exec sp_executesql @query;

请查看演示

使用聚合函数

如果您不想使用PIVOT函数,则可以使用带有CASE表达式的聚合函数:

select
  max(case when columnname = 'FirstName' then value end) Firstname,
  max(case when columnname = 'Amount' then value end) Amount,
  max(case when columnname = 'PostalCode' then value end) PostalCode,
  max(case when columnname = 'LastName' then value end) LastName,
  max(case when columnname = 'AccountNumber' then value end) AccountNumber
from yourtable

请查看演示

使用多个连接

这也可以使用多个连接完成,但是您需要一些列来关联每行,而在样本数据中您没有这些。但基本语法如下:

select fn.value as FirstName,
  a.value as Amount,
  pc.value as PostalCode,
  ln.value as LastName,
  an.value as AccountNumber
from yourtable fn
left join yourtable a
  on fn.somecol = a.somecol
  and a.columnname = 'Amount'
left join yourtable pc
  on fn.somecol = pc.somecol
  and pc.columnname = 'PostalCode'
left join yourtable ln
  on fn.somecol = ln.somecol
  and ln.columnname = 'LastName'
left join yourtable an
  on fn.somecol = an.somecol
  and an.columnname = 'AccountNumber'
where fn.columnname = 'Firstname'

7
在最后一个例子中,可以使用“cross join”而不是“left join”,因为每个子查询返回一行。 - Gordon Linoff
12
如果你有未知行数的数据,那么你需要使用动态 SQL,但是要注意,转换数百万行的数据可能不够高效。 - Taryn
1
我无法使用视图进行转换,那使用 TFV 呢?这就是为什么我正在寻找一种高效的解决方案。 - tbag
4
动态 SQL 不能在视图或表值函数中使用,必须在存储过程中使用。 - Taryn
1
@hoggar 你需要使用方括号 [First.Last] 来转义点号。 - Taryn
显示剩余21条评论

13
这不仅是一个单一的脚本,而是一种方法,能够为你提供更多的灵活性。
首先,有三个对象:
1. 用户定义的表类型 [ColumnActionList] -> 作为参数保存数据 2. 存储过程 [proc_PivotPrepare] -> 准备我们的数据 3. 存储过程 [proc_PivotExecute] -> 执行脚本
CREATE TYPE [dbo].[ColumnActionList] AS TABLE ( [ID] [smallint] NOT NULL, [ColumnName] nvarchar NOT NULL, [Action] nchar NOT NULL ); GO
    CREATE PROCEDURE [dbo].[proc_PivotPrepare] 
    (
    @DB_Name        nvarchar(128),
    @TableName      nvarchar(128)
    )
    AS
            SELECT @DB_Name = ISNULL(@DB_Name,db_name())
    DECLARE @SQL_Code nvarchar(max)

    DECLARE @MyTab TABLE (ID smallint identity(1,1), [Column_Name] nvarchar(128), [Type] nchar(1), [Set Action SQL] nvarchar(max));

    SELECT @SQL_Code        =   'SELECT [<| SQL_Code |>] = '' '' '
                                        + 'UNION ALL '
                                        + 'SELECT ''----------------------------------------------------------------------------------------------------'' '
                                        + 'UNION ALL '
                                        + 'SELECT ''-----| Declare user defined type [ID] / [ColumnName] / [PivotAction] '' '
                                        + 'UNION ALL '
                                        + 'SELECT ''----------------------------------------------------------------------------------------------------'' '
                                        + 'UNION ALL '
                                        + 'SELECT ''DECLARE @ColumnListWithActions ColumnActionList;'''
                                        + 'UNION ALL '
                                        + 'SELECT ''----------------------------------------------------------------------------------------------------'' '
                                        + 'UNION ALL '
                                        + 'SELECT ''-----| Set [PivotAction] (''''S'''' as default) to select dimentions and values '' '
                                        + 'UNION ALL '
                                        + 'SELECT ''-----|'''
                                        + 'UNION ALL '
                                        + 'SELECT ''-----| ''''S'''' = Stable column || ''''D'''' = Dimention column || ''''V'''' = Value column '' '
                                        + 'UNION ALL '
                                        + 'SELECT ''----------------------------------------------------------------------------------------------------'' '
                                        + 'UNION ALL '
                                        + 'SELECT ''INSERT INTO  @ColumnListWithActions VALUES ('' + CAST( ROW_NUMBER() OVER (ORDER BY [NAME]) as nvarchar(10)) + '', '' + '''''''' + [NAME] + ''''''''+ '', ''''S'''');'''
                                        + 'FROM [' + @DB_Name + '].sys.columns  '
                                        + 'WHERE object_id = object_id(''[' + @DB_Name + ']..[' + @TableName + ']'') '
                                        + 'UNION ALL '
                                        + 'SELECT ''----------------------------------------------------------------------------------------------------'' '
                                        + 'UNION ALL '
                                        + 'SELECT ''-----| Execute sp_PivotExecute with parameters: columns and dimentions and main table name'' '
                                        + 'UNION ALL '
                                        + 'SELECT ''----------------------------------------------------------------------------------------------------'' '
                                        + 'UNION ALL '
                                        + 'SELECT ''EXEC [dbo].[sp_PivotExecute] @ColumnListWithActions, ' + '''''' + @TableName + '''''' + ';'''
                                        + 'UNION ALL '
                                        + 'SELECT ''----------------------------------------------------------------------------------------------------'' '                            
EXECUTE SP_EXECUTESQL @SQL_Code;

GO

CREATE PROCEDURE [dbo].[sp_PivotExecute]
(
@ColumnListWithActions  ColumnActionList ReadOnly
,@TableName                     nvarchar(128)
)
AS


--#######################################################################################################################
--###| Step 1 - Select our user-defined-table-variable into temp table
--#######################################################################################################################

IF OBJECT_ID('tempdb.dbo.#ColumnListWithActions', 'U') IS NOT NULL DROP TABLE #ColumnListWithActions; 
SELECT * INTO #ColumnListWithActions FROM @ColumnListWithActions;

--#######################################################################################################################
--###| Step 2 - Preparing lists of column groups as strings:
--#######################################################################################################################

DECLARE @ColumnName                     nvarchar(128)
DECLARE @Destiny                        nchar(1)

DECLARE @ListOfColumns_Stable           nvarchar(max)
DECLARE @ListOfColumns_Dimension    nvarchar(max)
DECLARE @ListOfColumns_Variable     nvarchar(max)
--############################
--###| Cursor for List of Stable Columns
--############################

DECLARE ColumnListStringCreator_S CURSOR FOR
SELECT      [ColumnName]
FROM        #ColumnListWithActions
WHERE       [Action] = 'S'
OPEN ColumnListStringCreator_S;
FETCH NEXT FROM ColumnListStringCreator_S
INTO @ColumnName
  WHILE @@FETCH_STATUS = 0

   BEGIN
        SELECT @ListOfColumns_Stable = ISNULL(@ListOfColumns_Stable, '') + ' [' + @ColumnName + '] ,';
        FETCH NEXT FROM ColumnListStringCreator_S INTO @ColumnName
   END

CLOSE ColumnListStringCreator_S;
DEALLOCATE ColumnListStringCreator_S;

--############################
--###| Cursor for List of Dimension Columns
--############################

DECLARE ColumnListStringCreator_D CURSOR FOR
SELECT      [ColumnName]
FROM        #ColumnListWithActions
WHERE       [Action] = 'D'
OPEN ColumnListStringCreator_D;
FETCH NEXT FROM ColumnListStringCreator_D
INTO @ColumnName
  WHILE @@FETCH_STATUS = 0

   BEGIN
        SELECT @ListOfColumns_Dimension = ISNULL(@ListOfColumns_Dimension, '') + ' [' + @ColumnName + '] ,';
        FETCH NEXT FROM ColumnListStringCreator_D INTO @ColumnName
   END

CLOSE ColumnListStringCreator_D;
DEALLOCATE ColumnListStringCreator_D;

--############################
--###| Cursor for List of Variable Columns
--############################

DECLARE ColumnListStringCreator_V CURSOR FOR
SELECT      [ColumnName]
FROM        #ColumnListWithActions
WHERE       [Action] = 'V'
OPEN ColumnListStringCreator_V;
FETCH NEXT FROM ColumnListStringCreator_V
INTO @ColumnName
  WHILE @@FETCH_STATUS = 0

   BEGIN
        SELECT @ListOfColumns_Variable = ISNULL(@ListOfColumns_Variable, '') + ' [' + @ColumnName + '] ,';
        FETCH NEXT FROM ColumnListStringCreator_V INTO @ColumnName
   END

CLOSE ColumnListStringCreator_V;
DEALLOCATE ColumnListStringCreator_V;

SELECT @ListOfColumns_Variable      = LEFT(@ListOfColumns_Variable, LEN(@ListOfColumns_Variable) - 1);
SELECT @ListOfColumns_Dimension = LEFT(@ListOfColumns_Dimension, LEN(@ListOfColumns_Dimension) - 1);
SELECT @ListOfColumns_Stable            = LEFT(@ListOfColumns_Stable, LEN(@ListOfColumns_Stable) - 1);

--#######################################################################################################################
--###| Step 3 - Preparing table with all possible connections between Dimension columns excluding NULLs
--#######################################################################################################################
DECLARE @DIM_TAB TABLE ([DIM_ID] smallint, [ColumnName] nvarchar(128))
INSERT INTO @DIM_TAB 
SELECT [DIM_ID] = ROW_NUMBER() OVER(ORDER BY [ColumnName]), [ColumnName] FROM #ColumnListWithActions WHERE [Action] = 'D';

DECLARE @DIM_ID smallint;
SELECT      @DIM_ID = 1;


DECLARE @SQL_Dimentions nvarchar(max);

IF OBJECT_ID('tempdb.dbo.##ALL_Dimentions', 'U') IS NOT NULL DROP TABLE ##ALL_Dimentions; 

SELECT @SQL_Dimentions      = 'SELECT [xxx_ID_xxx] = ROW_NUMBER() OVER (ORDER BY ' + @ListOfColumns_Dimension + '), ' + @ListOfColumns_Dimension
                                            + ' INTO ##ALL_Dimentions '
                                            + ' FROM (SELECT DISTINCT' + @ListOfColumns_Dimension + ' FROM  ' + @TableName
                                            + ' WHERE ' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = @DIM_ID) + ' IS NOT NULL ';
                                            SELECT @DIM_ID = @DIM_ID + 1;
            WHILE @DIM_ID <= (SELECT MAX([DIM_ID]) FROM @DIM_TAB)
            BEGIN
            SELECT @SQL_Dimentions = @SQL_Dimentions + 'AND ' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = @DIM_ID) +  ' IS NOT NULL ';
            SELECT @DIM_ID = @DIM_ID + 1;
            END

SELECT @SQL_Dimentions   = @SQL_Dimentions + ' )x';

EXECUTE SP_EXECUTESQL  @SQL_Dimentions;

--#######################################################################################################################
--###| Step 4 - Preparing table with all possible connections between Stable columns excluding NULLs
--#######################################################################################################################
DECLARE @StabPos_TAB TABLE ([StabPos_ID] smallint, [ColumnName] nvarchar(128))
INSERT INTO @StabPos_TAB 
SELECT [StabPos_ID] = ROW_NUMBER() OVER(ORDER BY [ColumnName]), [ColumnName] FROM #ColumnListWithActions WHERE [Action] = 'S';

DECLARE @StabPos_ID smallint;
SELECT      @StabPos_ID = 1;


DECLARE @SQL_MainStableColumnTable nvarchar(max);

IF OBJECT_ID('tempdb.dbo.##ALL_StableColumns', 'U') IS NOT NULL DROP TABLE ##ALL_StableColumns; 

SELECT @SQL_MainStableColumnTable       = 'SELECT xxx_ID_xxx = ROW_NUMBER() OVER (ORDER BY ' + @ListOfColumns_Stable + '), ' + @ListOfColumns_Stable
                                            + ' INTO ##ALL_StableColumns '
                                            + ' FROM (SELECT DISTINCT' + @ListOfColumns_Stable + ' FROM  ' + @TableName
                                            + ' WHERE ' + (SELECT [ColumnName] FROM @StabPos_TAB WHERE [StabPos_ID] = @StabPos_ID) + ' IS NOT NULL ';
                                            SELECT @StabPos_ID = @StabPos_ID + 1;
            WHILE @StabPos_ID <= (SELECT MAX([StabPos_ID]) FROM @StabPos_TAB)
            BEGIN
            SELECT @SQL_MainStableColumnTable = @SQL_MainStableColumnTable + 'AND ' + (SELECT [ColumnName] FROM @StabPos_TAB WHERE [StabPos_ID] = @StabPos_ID) +  ' IS NOT NULL ';
            SELECT @StabPos_ID = @StabPos_ID + 1;
            END

SELECT @SQL_MainStableColumnTable    = @SQL_MainStableColumnTable + ' )x';

EXECUTE SP_EXECUTESQL  @SQL_MainStableColumnTable;

--#######################################################################################################################
--###| Step 5 - Preparing table with all options ID
--#######################################################################################################################

DECLARE @FULL_SQL_1 NVARCHAR(MAX)
SELECT @FULL_SQL_1 = ''

DECLARE @i smallint

IF OBJECT_ID('tempdb.dbo.##FinalTab', 'U') IS NOT NULL DROP TABLE ##FinalTab; 

SELECT @FULL_SQL_1 = 'SELECT t.*, dim.[xxx_ID_xxx] '
                                    + ' INTO ##FinalTab '
                                    +   'FROM ' + @TableName + ' t '
                                    +   'JOIN ##ALL_Dimentions dim '
                                    +   'ON t.' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = 1) + ' = dim.' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = 1);
                                SELECT @i = 2                               
                                WHILE @i <= (SELECT MAX([DIM_ID]) FROM @DIM_TAB)
                                    BEGIN
                                    SELECT @FULL_SQL_1 = @FULL_SQL_1 + ' AND t.' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = @i) + ' = dim.' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = @i)
                                    SELECT @i = @i +1
                                END
EXECUTE SP_EXECUTESQL @FULL_SQL_1

--#######################################################################################################################
--###| Step 6 - Selecting final data
--#######################################################################################################################
DECLARE @STAB_TAB TABLE ([STAB_ID] smallint, [ColumnName] nvarchar(128))
INSERT INTO @STAB_TAB 
SELECT [STAB_ID] = ROW_NUMBER() OVER(ORDER BY [ColumnName]), [ColumnName]
FROM #ColumnListWithActions WHERE [Action] = 'S';

DECLARE @VAR_TAB TABLE ([VAR_ID] smallint, [ColumnName] nvarchar(128))
INSERT INTO @VAR_TAB 
SELECT [VAR_ID] = ROW_NUMBER() OVER(ORDER BY [ColumnName]), [ColumnName]
FROM #ColumnListWithActions WHERE [Action] = 'V';

DECLARE @y smallint;
DECLARE @x smallint;
DECLARE @z smallint;


DECLARE @FinalCode nvarchar(max)

SELECT @FinalCode = ' SELECT ID1.*'
                                        SELECT @y = 1
                                        WHILE @y <= (SELECT MAX([xxx_ID_xxx]) FROM ##FinalTab)
                                            BEGIN
                                                SELECT @z = 1
                                                WHILE @z <= (SELECT MAX([VAR_ID]) FROM @VAR_TAB)
                                                    BEGIN
                                                        SELECT @FinalCode = @FinalCode +    ', [ID' + CAST((@y) as varchar(10)) + '.' + (SELECT [ColumnName] FROM @VAR_TAB WHERE [VAR_ID] = @z) + '] =  ID' + CAST((@y + 1) as varchar(10)) + '.' + (SELECT [ColumnName] FROM @VAR_TAB WHERE [VAR_ID] = @z)
                                                        SELECT @z = @z + 1
                                                    END
                                                    SELECT @y = @y + 1
                                                END
        SELECT @FinalCode = @FinalCode + 
                                        ' FROM ( SELECT * FROM ##ALL_StableColumns)ID1';
                                        SELECT @y = 1
                                        WHILE @y <= (SELECT MAX([xxx_ID_xxx]) FROM ##FinalTab)
                                        BEGIN
                                            SELECT @x = 1
                                            SELECT @FinalCode = @FinalCode 
                                                                                + ' LEFT JOIN (SELECT ' +  @ListOfColumns_Stable + ' , ' + @ListOfColumns_Variable 
                                                                                + ' FROM ##FinalTab WHERE [xxx_ID_xxx] = ' 
                                                                                + CAST(@y as varchar(10)) + ' )ID' + CAST((@y + 1) as varchar(10))  
                                                                                + ' ON 1 = 1' 
                                                                                WHILE @x <= (SELECT MAX([STAB_ID]) FROM @STAB_TAB)
                                                                                BEGIN
                                                                                    SELECT @FinalCode = @FinalCode + ' AND ID1.' + (SELECT [ColumnName] FROM @STAB_TAB WHERE [STAB_ID] = @x) + ' = ID' + CAST((@y+1) as varchar(10)) + '.' + (SELECT [ColumnName] FROM @STAB_TAB WHERE [STAB_ID] = @x)
                                                                                    SELECT @x = @x +1
                                                                                END
                                            SELECT @y = @y + 1
                                        END

SELECT * FROM ##ALL_Dimentions;
EXECUTE SP_EXECUTESQL @FinalCode;

通过执行第一个查询(通过传递源数据库和表名),您将获得第二个存储过程的预创建执行查询,您所要做的就是定义源列:

  • Stable
  • Value(将用于基于该值聚合值)
  • Dim(您想要使用的列来进行数据透视)

名称和数据类型将自动定义!

我不建议在任何生产环境中使用它,但对于临时的BI请求,它能胜任工作。


一切都很好,直到出现“找不到存储过程'dbo.sp_PivotIt'”的错误。有什么建议吗? - FAB
@DarXyde 抱歉,我混淆了两个版本,请重新编译并重试。 - Bartosz X
谢谢Bartosz,我成功地使用了你脚本中的一些想法并完成了我已经考虑过的东西,但不管怎样,感谢你的更新 :)。我应该想到更改那行代码,但是老实说我认为这是一个存储过程,你忘记在系统中默认设置或类似的东西。当我再次接近那个项目时,我会运行它并在这里更新! - FAB
1
仍然是存储过程名称错误,但现在我知道问题出在哪里了,很容易修复:将 sp_Pivot_Execute 更改为 proc_PivotExecute - FAB
@DarXyde - 感谢您发现了这个问题 - 现在已经修复了。 - Bartosz X
需要一些关于如何使用这个存储过程的建议。最好提供一个示例,演示如何使用表参数格式化sp_PivotExecute存储过程。 - Eclipse

6

我修改了Taryn的答案(“Pivot with unknown number of columnnames”版本),以显示结果中超过1行。这需要有一个额外的“Group”列。

DROP TABLE #yourtable
CREATE table #yourtable
    ([Id] int,[Group] int, [Value] varchar(6), [ColumnName] varchar(13))
;
    
INSERT INTO #yourtable
    ([Id],[Group], [Value], [ColumnName])
VALUES
    (1,1, 'John', 'FirstName'),
    (2,1, '2.4', 'Amount'),
    (3,1, 'ZH1E4A', 'PostalCode'),
    (4,1, 'Fork', 'LastName'),
    (5,1, '857685', 'AccountNumber'),
    (6,2, 'Pedro', 'FirstName'),
    (7,2, '5.1', 'Amount'),
    (8,2, '123456', 'PostalCode'),
    (9,2, 'Torres', 'LastName'),
    (10,2, '857686', 'AccountNumber')
;
;



DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT ',' + QUOTENAME(ColumnName) 
                    from #yourtable
                    group by [Group], ColumnName, id
                    having [group] = (SELECT TOP 1 MIN([Group])FROM #yourtable)
                    order by id
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')


        

set @query = N'SELECT ' + @cols + N' from 
             (
                select value, ColumnName,[Group]
                from #yourtable
                GROUP BY [Group],ColumnName,Value
            ) x
            pivot 
            (
                max(value)
                for ColumnName in (' + @cols + N')
            ) p '



exec sp_executesql @query;

enter image description here


2

另一个非常有用的选项是使用CROSS APPLY

-- Original data
SELECT * FROM (VALUES ('1', 1, 2, 3),('2', 11, 22, 33)) AS Stage(id,col1,col2,col3)

-- row to columns using CROSS APPLY
SELECT Stage.id,v.idd, v.colc
FROM (VALUES ('1', 1, 2, 3),('2', 11, 22, 33)) AS Stage(id,col1,col2,col3)
CROSS APPLY (VALUES ('col1', col1),('col2', col2),('col3', col3)) AS v(idd,colc)
GO

我喜欢使用CROSS APPLY的方式,但这似乎是UNPIVOT的替代方法,而不是PIVOT的替代方法。也就是说,你代码示例中的注释-- row to columns using CROSS APPLY实际上应该被标注为-- columns to rows using CROSS APPLY - undefined

0
请尝试。
CREATE TABLE pvt (Present int, [Absent] int);
GO
INSERT INTO pvt VALUES (10,40);
GO
--Unpivot the table.
SELECT Code, Value
FROM 
   (SELECT Present, Absent
   FROM pvt) p
UNPIVOT
   (Value FOR Code IN 
      (Present, [Absent])
)AS unpvt;
GO

DROP TABLE pvt

4
这如何在已被接受的答案基础上进行改进? - Eyeslandic

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接