抱歉,本文较长,但以下是生成和填充测试工具的完整脚本。
我的测试工具包含以下表格:
|--------| |-------------| |-----| |--------------|
|Column | |ColumnValue | |Row | |RowColumnValue|
|--------| |-------------| |-----| |--------------|
|ColumnId| |ColumnValueId| |RowId| |RowId |
|Name | |ColumnId | |Name | |ColumnValueId |
|--------| |Value | |-----| |--------------|
|-------------|
它们代表表格中的行和列。列中单元格的可能值存储在ColumnValue中。行的选定值存储在RowColumnValue中。(我希望这很清楚)
我已经用10个列、10,000行、每列50个列值(500)和每行25个选定列值(250,000)填充了数据。
我有一些动态SQL,它返回所有行,将列进行透视,并包含每列选定列值的XML列表。
注意:出于性能测试目的,我在查询中包装了一个
SELECT COUNT(*)
,以便查询不会通过网络返回大量数据。我的测试工具在大约5-6秒内运行此查询(带有计数)。执行计划显示92%的查询时间花费在对
[ColumnValue].[PK_ColumnValue]
的聚集索引搜索上。客户端统计信息显示客户端处理时间、总执行时间和等待服务器回复时间均为0。我意识到RowColumnValue表中的250k行相当多,我可能对SQL Server期望过高。然而,我的期望是查询应该能够比这快得多。或者至少执行计划应该呈现不同的瓶颈,而不是聚集索引搜索。
有人可以解决问题或给我一些建议,如何使这更有效率吗?
运行透视显示表的动态SQL:
DECLARE @columnDataList NVARCHAR(MAX)
SELECT
@columnDataList =
CAST
(
(
SELECT
', CONVERT(xml, [PVT].[' + [Column].[Name] + ']) [Column.' + [Column].[Name] + ']'
FROM
[Column]
ORDER BY
[Column].[Name]
FOR XML PATH('')
) AS XML
).value('.', 'NVARCHAR(MAX)')
DECLARE @columnPivotList NVARCHAR(MAX)
SELECT
@columnPivotList =
CAST
(
(
SELECT
', [' + [Column].[Name] + ']'
FROM
[Column]
ORDER BY
[Column].[Name]
FOR XML PATH('')
) AS XML
).value('.', 'NVARCHAR(MAX)')
EXEC('
SELECT
COUNT(*)
FROM
(
SELECT
[PVT].[RowId]
' + @columnDataList + '
FROM
(
SELECT
[Row].[RowId],
[Column].[Name] [ColumnName],
[XmlRowColumnValues].[XmlRowColumnValues] [XmlRowColumnValues]
FROM
[Row]
CROSS JOIN
[Column]
CROSS APPLY
(
SELECT
[ColumnValue].[Value] [Value]
FROM
[RowColumnValue]
INNER JOIN
[ColumnValue]
ON
[ColumnValue].[ColumnValueId] = [RowColumnValue].[ColumnValueId]
WHERE
[RowColumnValue].[RowId] = [Row].[RowId]
AND
[ColumnValue].[ColumnId] = [Column].[ColumnId]
FOR XML PATH (''''), ROOT(''Values'')
) [XmlRowColumnValues] ([XmlRowColumnValues])
) [PivotData]
PIVOT
(
MAX([PivotData].[XmlRowColumnValues])
FOR
[ColumnName]
IN
([0]' + @columnPivotList + ')
) PVT
) RowColumnData
')
生成和填充数据库的脚本:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Row](
[RowId] [int] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_Row] PRIMARY KEY CLUSTERED
(
[RowId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Column](
[ColumnId] [int] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_Column] PRIMARY KEY CLUSTERED
(
[ColumnId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[RowColumnValue](
[RowId] [int] NOT NULL,
[ColumnValueId] [int] NOT NULL,
CONSTRAINT [PK_RowColumnValue] PRIMARY KEY CLUSTERED
(
[RowId] ASC,
[ColumnValueId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[ColumnValue](
[ColumnValueId] [int] IDENTITY(1,1) NOT NULL,
[ColumnId] [int] NOT NULL,
[Value] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_ColumnValue] PRIMARY KEY CLUSTERED
(
[ColumnValueId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [FKIX_ColumnValue_ColumnId] ON [dbo].[ColumnValue]
(
[ColumnId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
ALTER TABLE [dbo].[ColumnValue] WITH CHECK ADD CONSTRAINT [FK_ColumnValue_Column] FOREIGN KEY([ColumnId])
REFERENCES [dbo].[Column] ([ColumnId])
GO
ALTER TABLE [dbo].[ColumnValue] CHECK CONSTRAINT [FK_ColumnValue_Column]
GO
ALTER TABLE [dbo].[RowColumnValue] WITH CHECK ADD CONSTRAINT [FK_RowColumnValue_ColumnValue] FOREIGN KEY([ColumnValueId])
REFERENCES [dbo].[ColumnValue] ([ColumnValueId])
GO
ALTER TABLE [dbo].[RowColumnValue] CHECK CONSTRAINT [FK_RowColumnValue_ColumnValue]
GO
ALTER TABLE [dbo].[RowColumnValue] WITH CHECK ADD CONSTRAINT [FK_RowColumnValue_Row] FOREIGN KEY([RowId])
REFERENCES [dbo].[Row] ([RowId])
GO
ALTER TABLE [dbo].[RowColumnValue] CHECK CONSTRAINT [FK_RowColumnValue_Row]
GO
DECLARE @columnLoop INT
DECLARE @columnValueLoop INT
DECLARE @rowLoop INT
DECLARE @columnId INT
DECLARE @columnValueId INT
DECLARE @rowId INT
SET @columnLoop = 0
WHILE @columnLoop < 10
BEGIN
INSERT INTO [Column] ([Name]) VALUES(NEWID())
SET @columnId = @@IDENTITY
SET @columnValueLoop = 0
WHILE @columnValueLoop < 50
BEGIN
INSERT INTO [ColumnValue] ([ColumnId], [Value]) VALUES(@columnId, NEWID())
SET @columnValueLoop = @columnValueLoop + 1
END
SET @columnLoop = @columnLoop + 1
END
SET @rowLoop = 0
WHILE @rowLoop < 10000
BEGIN
INSERT INTO [Row] ([Name]) VALUES(NEWID())
SET @rowId = @@IDENTITY
INSERT INTO [RowColumnValue] ([RowId], [ColumnValueId]) SELECT TOP 25 @rowId, [ColumnValueId] FROM [ColumnValue] ORDER BY NEWID()
SET @rowLoop = @rowLoop + 1
END
SELECT * FROM YourTable Where xyz=...
对于你的大型PIVOT和这个查询,哪一个更快呢? - KM.