主键标识符是否应始终为非聚集索引？

Question

主键标识符是否应始终为非聚集索引？

3

假设有一张带有自增标识的表，例如：

CREATE TABLE [dbo].[Category]
(
    [CategoryId] INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [Title] NVARCHAR(50) NOT NULL,
    [Description] NVARCHAR(128) NULL
)

自增主键使表本质上有序，这是否意味着我应该将CategoryId设为非聚集索引，然后在其他两列上创建聚集索引CREATE CLUSTERED INDEX [IX_Category_Column] ON [dbo].[JobCategory] ([Title], [Description])？

- Arnold Zahrneinder

3个回答

2

集群索引(Clustered Index)指的是表中所有数据按照这种索引进行排序。当您创建PrimaryKey(主键)时，就创建了该集群索引。每个表只能有一个集群索引。
因此，根据您将针对表运行的查询，可以在其他两列上创建非聚集索引(Nonclustered Index)。
还要注意，集群索引应尽可能窄。原因是它包含在所有其他索引中。因此，当您在Title列上创建索引时，即使您没有指定它，它也将包含CategoryId列。
在创建索引时，您还应考虑另一个方面。列可以是索引的一部分，也可以仅是“包括在内”的。这意味着它包含在索引中，但数据不使用此列进行排序。当您想要在索引中使用列时，这可能会很方便，而且您不会在where子句或join中使用它，但它将是您查询的输出。特别是当此列中的数据经常更改时。
让我们以您的表为例，并向其中添加一些数据。

insert into Category (Title, Description) values ('Title2', 'Description2_B')
insert into Category (Title, Description) values ('Title2', 'Description2_A')
insert into Category (Title, Description) values ('Title1', 'Description1_B')
insert into Category (Title, Description) values ('Title1', 'Description1_A')

现在在“标题”和“描述”列上创建索引。

create nonclustered index idx_category_title on Category (title, Description)

在这个表上运行 select 命令将会给你：

select Title, Description from category where title Like 'Title%'

Results:
    |  Title |    Description |
    |--------|----------------|
    | Title1 | Description1_A |
    | Title1 | Description1_B |
    | Title2 | Description2_A |
    | Title2 | Description2_B |

正如您所看到的，结果首先按标题排序，然后按描述排序。使用此索引，每次修改描述时，您的索引都必须更新以使数据排序。

现在让我们尝试相同的表格和相同的数据，但使用“包含”列作为索引。

create nonclustered index idx_category_title on Category (title) include (Description)

在这个设置下运行相同的选择操作将会给出以下结果：

select Title, Description from category where title Like 'Title%'

Results:
    |  Title |    Description |
    |--------|----------------|
    | Title1 | Description1_B |
    | Title1 | Description1_A |
    | Title2 | Description2_B |
    | Title2 | Description2_A |

正如您所看到的，数据按标题排序，但不按描述排序。真正的性能提升在于修改描述。由于索引未使用此列进行排序，因此更改它不会更改索引中记录的位置。

- Marek Vitek

谢谢。"在标题列上建立索引，并将描述放入包含的列中"，如果您不介意，您能否用代码解释一下您的意思 :) - Arnold Zahrneinder

请查看更新的答案，其中包括“已包含”列的部分。 - Marek Vitek

1

自动增量主键使表固有地排序，这是不正确的。聚簇键才是决定行在存储时如何排序的。

通常，聚集键应该是狭窄的（例如int或bigint），并且是递增的（如identity()）。

您还应考虑为约束和索引提供更明智的名称，而不是接受SQL服务器自动生成的任何名称。

create table dbo.Category (
    CategoryId int identity(1,1) not null
  , Title nvarchar(50) not null
  , Description nvarchar(128) null
  , constraint pk_Category_CategoryId primary key clustered (CategoryId)
);

支持类似以下的查询：

select CategoryId, Title, Description 
from dbo.Category
where Title = 'MyTitle';

你需要创建一个额外的非聚集索引，如下所示：

create nonclustered index ix_Category_Title 
  on dbo.Category (Title)
    include (Description);

参考资料：

- SqlZim

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Rahul · Accepted Answer

PRIMARY KEY 默认会在该PK列上创建一个聚集索引(clustered index)，物理排序顺序是由该索引维护的，而不是IDENTITY列本身。

因此，如果需要的话，您应该在其他列上创建一个非聚簇索引(non-clustered index)。

如果您打算在筛选条件或关联条件中使用这些列，则在其他列上创建索引肯定有助于提高查询性能。在这种情况下，它会执行索引扫描(index scan)而不是表扫描(table scan)。为了体验这一点，请运行涉及这些条件的查询，并分别在其中添加和不添加索引。获取实际的查询执行计划并自行查看。