Should my primary key be clustered?

The default in Management Studio is to make the primary key clustered, but is this a good idea? What are the pros and cons of making the primary key clustered v nonclustered?

more ▼

asked Oct 09, 2009 at 12:00 PM in Default

avatar image

Steve Jones - Editor ♦♦
5.2k 79 94 87

(comments are locked)
10|1200 characters needed characters left

7 answers: sort voted first

The classic DBA answer - it depends.

Strictly speaking every table should have both a PK and a clustered index, but not always necessarily using the same field.

If you are using GUIDs (shudder) as your PK, then most definitely no! The fragementation can become horrendous.

If using identity, then combining the PK and clustered won't do you much harm.

If using some other natural/surrogate key then you will have to make an informed decision depending on the values in those keys, if they are as random as GUIDs, then you'll have the same issues as GUIDs.

more ▼

answered Oct 11, 2009 at 06:41 AM

avatar image

Kev Riley ♦♦
66.8k 48 65 81

(comments are locked)
10|1200 characters needed characters left

Yes. A Primary key is merely a unique key that is intended to be the default one used to select rows and perform joins. It is the default candidate key for a table. They should be used as there is no other way to specify to other processes which key should be used by default.

A Primary key is no more than an indexed key whose value, taken together across all component columns, is unique and known. SQL Server creates a clustered unique index when you specify a primary key, but this is not always the best solution, since a clustered index works best when it is not unique, not null and when the values are reasonably evenly distributed The clustered index can be explained by my poem...

An index is used as a short-cut to data

a table will warrant one sooner or later

Because only one can be clustered, beware

and ponder the index you cluster with care

the issues are clearer than you might suppose

this index determines the order of rows

so searching the index requires less I/O.

Selecting the column on which it should go

depends on the way that the rows are selected,

which should become clear if the Schema's inspected.

One problem, however, I think you should know,

retrieving a range can be horribly slow.

A non-clustered index is almost as good

once ordering keys can be well understood

make sure that the columns you use are selective

for if too few values, it's most ineffective

if data is changing or updating too

with frequent insertions, keep indexes few.

from 2000 on you can index a view

(but then there's restrictions on what you can do)

and even on computed columns as well

but only if deterministic as hell

For reasons which often are misunderstood

a non-clustered covering index is good

when composite columns are used with some care

they outperform anything else that's out there

more ▼

answered Oct 09, 2009 at 02:01 PM

avatar image

Phil Factor
4.2k 8 27 21

(comments are locked)
10|1200 characters needed characters left

The two are really different concepts - the primary key is the data that uniquely identifies a record, whereas a clustered index should really match the keys with which the record is most frequently accessed. A typical example might be transactions and items. Each table might have a primary key which is an IDENTITY field, but the transaction items would most likely be looked up by the Transaction ID (the foreign key from the items table to the transactions table).

In that case, it would be most efficient to have a clustered index on the items table on the Transaction ID field, as that is how the items would be retrieved most of the time.

more ▼

answered Oct 09, 2009 at 12:15 PM

avatar image

Matt Whitfield ♦♦
29.5k 62 66 88

(comments are locked)
10|1200 characters needed characters left

I'd mildly disagree with Matt's answer. He's not necessarily wrong, but with the items table putting the clustered index on TransactionID might cause fragmentation because you can't be guaranteed that the items will be inserted in order of TransactionID meaning there will be more page splits.

Also a non-clustered index on TransactionID in the items table would suffice for joining to the Transactions table.

more ▼

answered Oct 09, 2009 at 06:15 PM

avatar image

Jack Corbett
1.1k 3 4 7

(comments are locked)
10|1200 characters needed characters left

Most of the time, yes. Since the cluster is the primary avenue of data access inside SQL Server and in most cases, when you're pulling from a particular table you're going first to the primary key, it makes the most sense to have that as the cluster as well. However, other circumstances can lead to different conclusions. If, for example, the most frequently access path, the column(s) used most often to retrieve data from the table was a foreign key, to eliminate the key lookups necessary to retrieve the data from the table, it might make more sense to make the foreign key the cluster.

In my opinion, data access paths define where the cluster should be.

more ▼

answered Oct 10, 2009 at 08:22 AM

avatar image

Grant Fritchey ♦♦
137k 20 47 81

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Oct 09, 2009 at 12:00 PM

Seen: 27823 times

Last Updated: Oct 15, 2009 at 11:02 AM

Copyright 2018 Redgate Software. Privacy Policy