question

Wilfred van Dijk avatar image
Wilfred van Dijk asked

Columns from clustered index not necessary in non-clustered index?

Is this thought correct: It is not necessary to add columns to a nonclustered index, if that column is also part of the clustered index. This because the columns from a clustered index are always added to a nonclustered index. (and also if the column is an included column)
administrationindexestuning
1 comment
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Update: for non-clustered indexes, exclude indexes with [is_unique], [is_unique_constraint] or [is_primary_key] set to 1
0 Likes 0 ·
Cyborg avatar image
Cyborg answered
Yes, this is true.
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image
Kev Riley answered
Yes, whilst this is true, it doesn't mean that for some scenarios, having a non-clustered index with duplicated columns might not be a better (performance-wise) option. As ever - it depends!
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Wilfred van Dijk avatar image
Wilfred van Dijk answered
"It depends" .. I knew someone was saying that! :) I tried this: create table persoon(id int identity(1,1) primary key, naam varchar(32), postcode varchar(6), huisnummer varchar(16), geboortedatum date) go create index idx01 on dbo.persoon(postcode) create index idx11 on dbo.persoon(id, postcode) truncate table persoon go declare @i int set @i = 0 while @i < 5000 begin insert into persoon select replace(newid(), '-', ''), right('4567' + cast(abs(checksum(newid()))%10000 as varchar),4) + 'XX', abs(checksum(newid()))%100, dateadd(day, abs(checksum(newid()))%10000, '1960-01-01') set @i += 1 end select id from persoon where postcode = '6742XX' And after playing with several postcode values, the optimizer is always using idx01 to return the column [id]. It never used idx11 ...
3 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

idx02 will not be used here to retrieve id column because the index columns are "ID and Postcode", meaning the values are sorted on "id and postcode", so the optimizer cannot use this index too search(Seek) postcode. If optimizer uses idx02 to search postcode, this will result in index scan(touches all index pages) and this is costly compare to index seek. This can be tested by using index hint select id from persoon WITH (index = idx02) where postcode = '6742XX'
1 Like 1 ·
What is the size of your sample? How many rows?
0 Likes 0 ·
5000 rows: (while @i < 5000)
0 Likes 0 ·
Wilfred van Dijk avatar image
Wilfred van Dijk answered
If I change the indexes to: create index idx01 on dbo.persoon(postcode) create index idx11 on dbo.persoon(postcode, id) and use the index hint: select id from persoon with (index = idx01) where postcode = '6929XX' select id from persoon with (index = idx11) where postcode = '6929XX' (postal code exists in table), the result for the execution plan is the same (index seek, same I/O and CPU cost, no lookup in clustered index). So where's the "it depends"?
4 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

There's no clustered index.
0 Likes 0 ·
There is: see my first posting: "Id (identity(1,1) primary key" this creates a clustered index.
0 Likes 0 ·
Check out other discussions for where it might matter : .
0 Likes 0 ·
I need more coffee...
0 Likes 0 ·
Blackhawk-17 avatar image
Blackhawk-17 answered
Your search predicate is on postcode so the optimizer will use the index where that column is the first one defined. When you added ID to the index the optimizer *sees* that as a better match based on predicates and required results. They will produce the same results. When you use hints and force the index you will get a difference in performance. On SQL Server 2008 R2 idx01 - dbo.persoon(postcode) was 16% of the batch while idx11 - dbo.persoon(id, postcode) was 84%. With a third index of dbo.persoon(postcode, id) while it performs the same as idx01 the optimizer is more likely to choose it for this query. Having said all that, this is a relatively simple query... we may see different results on a composite clustered index or with mega-data, or (most likely) where a uniquifier was added by SQL Server to create the clustered index because of duplicates.
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.