question

Fatherjack avatar image
Fatherjack asked

Indexing a GUID

After a lot of research I am almost to the point of adding an index on a table to improve reads. The column that is being requested 00's of times a day is based on a GUID value. eg SELECT ColA from TableA where ColGUID = @GUIDVal There is only a CLUSTERED PK on the table and that is on the ID (Identity) column. If I create an index based on the GUID column will I run into issues or doing something unnecessary if I create it as a UNIQUE index? This is a 3rd party db that we are allowed to make some (approved) changes in, eg index management.
indexingprimary-keyunique
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Grant Fritchey avatar image
Grant Fritchey answered
I think everyone has already answered this question, but I'll pile on. GUID's are fine for indexes. In fact, they're pretty good for indexes, being highly selective & all. The problem with GUID's comes up when you make them into a clustered index. That perfect, random, distribution leads to incredible page splits & fragmentation, not to mention having to maintain that big fat key on every other index for the table. But, when you use the sequential GUID, you largely alleviate that problem. They're not my favorite choice for indexes, but there's nothing inherently evil about them.
11 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Grant Fritchey avatar image Grant Fritchey ♦♦ commented ·
If the data is going to be unique, yeah, I'd specify it that way. The optimizer will take more advantage of it that way.
5 Likes 5 ·
Grant Fritchey avatar image Grant Fritchey ♦♦ commented ·
Unless I'm convinced that there's going to be too big a performance hit on the inserts, yeah, I'd say I default to making it unique if I know that it's going to be or supposed to be. The nice thing about a unique index, it's pretty good at finding if the value already exists, so the cost is usually pretty low.
1 Like 1 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
should I specify the index as UNIQUE or not? does it matter as the data is so selective would the stats take care of it as much as the unique declaration?
0 Likes 0 ·
Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
@Grant - I'm glad you cleared that one up. Would you say as a rule of thumb that you should specify UNIQUE on indexes over data that will be unique anyway, and that the overhead on writes would be repaid by efficiency on reads? (Obviously assuming a balanced workload...)
0 Likes 0 ·
WilliamD avatar image WilliamD commented ·
I have to ask you this then: wouldn't a unique constraint be better in the case of wanting to denote a column as being unique? As I understand it, both a unique constraint and index are physically the same, just from a DB design standpoint a constraint is a better definition for the table (sort of self-documented uniqeness for the column).
0 Likes 0 ·
Show more comments
Matt Whitfield avatar image
Matt Whitfield answered
If you're just doing it for reads for that specific query, then I wouldn't worry about adding it as a UNIQUE index, just to save cost on insert time. Other queries might benefit from it being a unique index, however (for example joins - it might choose a different match type knowing that the index was unique - Grant will know more though). Is the GUID a client generated GUID, or is it NEWID or NEWSEQUENTIALID? If it was NEWSEQUENTIALID I would use a fairly high (high 90s) fill factor. If client generated or NEWID, then I would use a lower fill factor (maybe mid 70s). But that's something I would monitor and tune over time.
4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@Jonlee Lockwood No, guids are quaranteed to be unique no matter which way was used to generate them. This is simply because every NIC card has a unique ID due to the agreement between network cards manufacturers. The sequentiality(or whatever the word should be) of generated GUIDs is a different story. NEWSEQUENTIALID starts fresh after restart and then next requests generate sequential GUIDs, meaning less page splits (until next reboot). The chain will be broken after next restart, but the uniqueness - still there. The next requests will still generate sequential GUIDs, which may be even worse than if they were random, depending where that initial seed picks up.
3 Likes 3 ·
WilliamD avatar image WilliamD commented ·
+1 for the fillfactor and NEWSEQUENTIALID points.
2 Likes 2 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
I would estimate the client is generating the GUID as the column type is varchar(255).
0 Likes 0 ·
sp_lock avatar image sp_lock commented ·
@Matt... I was under the impression that NEWSEQUENTIALID is only guaranteed to be unique since the last windows restart? +1 for te fill factor.. I always refer to Paul/Kimberly on the guid research.. http://www.sqlskills.com/BLOGS/PAUL/post/Clustered-or-nonclustered-index-on-a-random-GUID.aspx
0 Likes 0 ·
WilliamD avatar image
WilliamD answered
I can't imagine there being any problems with the index being unique. A GUID is a unique value after all. Although, reading on the web, I found info that GUIDs can collide, it is just extremely rare - someone mentioned a collision in SQL 2000, but it was shrugged off as a bug that was fixed in SQL 2005 ([link][1]) I would go for the unique index. Is this system mission critical? If you are a lottery winner and hit a GUID collision, will your app explode on a failed insert? [1]: http://weblogs.asp.net/nannettethacker/archive/2008/09/03/guids-unique-or-not.aspx
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Fatherjack avatar image Fatherjack ♦♦ commented ·
I think its the memory of hearing about the possibly of getting duplicate GUIDs that has made me ask this question! I would hope that the app would handle the collision as an insert error and retry but wont know until after the index is implemented. I am also wondering if the statistics on the column would have a similar (enough) effect as making it UNIQUE so meaning there isnt enough potential benefit to risk the lottery win
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.