question

Martin 1 avatar image
Martin 1 asked

How to determine that a column in `sys.system_internals_partition_columns` is the uniquifier in an NCI?

I am writing a query to find tables with any wasted space due to `ALTER TABLE` operations (that might benefit from a rebuild or having `DBCC CLEANTABLE` run on them). This does the following JOIN FROM sys.system_internals_partition_columns pc JOIN sys.partitions p ON p.partition_id = pc.partition_id LEFT JOIN sys.columns c ON column_id = partition_column_id AND c.object_id = p.object_id Originally I had assumed that any columns in `sys.system_internals_partition_columns` that do not have a match in `sys.columns` would be remnants of dropped or altered columns. However one case where this does not apply is for the uniqueifier column. `system_internals_partition_columns` has an `is_uniqueifier` column but this is only set to 1 for clustered indexes. When the uniqueifier is present in the NCI this is zero so currently my query brings back false positives. Is there any way that I can determine that a particular column listed in `sys.system_internals_partition_columns` is the uniquifier in an NCI? **Edit** As Matt points out in his answer the 2 columnids are not guaranteed to be correlated at all. So back to the drawing board on that one. Just to provide a demonstration that this is the case. USE tempdb CREATE TABLE #Foo (A CHAR(65), B CHAR(66), C CHAR(67), D CHAR(68), E CHAR(69)); CREATE CLUSTERED INDEX ixA ON #Foo(A); CREATE NONCLUSTERED INDEX ixB ON #Foo(B) INCLUDE (E); SELECT CASE max_inrow_length WHEN 4 THEN 'uniquifier' ELSE CHAR(max_inrow_length) END AS actual_column_name, column_id, name AS column_name_according_to_join, max_inrow_length, leaf_offset FROM sys.system_internals_partition_columns pc JOIN sys.partitions p ON p.partition_id = pc.partition_id LEFT JOIN sys.columns c ON column_id = partition_column_id AND c.object_id = p.object_id WHERE p.object_id = OBJECT_ID('#Foo') AND index_id > 1 DROP TABLE #Foo Which gives actual_column_name column_id column_name_according_to_join max_inrow_length leaf_offset ------------------ ----------- -------------------------------- ---------------- ----------- B 1 A 66 1 A 3 C 65 67 uniquifier 4 D 4 -1 E 2 B 69 132
sql-server-2008dmv
7 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·
@Martin, where is your source to say that that field is used - just checked on my instance and I have no values of 1 for `is_uniqueifier`
0 Likes 0 ·
Martin 1 avatar image Martin 1 commented ·
@Kev - Presumably all your CIs are declared as unique then? If you try against msdb you should see some (or at least I do). I don't have any source as that DMV is not documented but as far as I know it is the only way of getting this wasted space info...
0 Likes 0 ·
Kev Riley avatar image Kev Riley ♦♦ commented ·
@martin msdb does, but none of my user tables?
0 Likes 0 ·
Martin 1 avatar image Martin 1 commented ·
@Kev - It will only show up for non unique CIs (which is a practice to be avoided anyway but something I want my query to be able to deal with correctly) `CREATE TABLE Foo (A int,B int); CREATE CLUSTERED INDEX ixA ON Foo(A); CREATE NONCLUSTERED INDEX ixB ON Foo(B)`
0 Likes 0 ·
Kev Riley avatar image Kev Riley ♦♦ commented ·
ahh I was filtering out my results by tablename to avoid the fog - based on the object_id in sys.columns (which is `NULL` DOH!) Sorry!
0 Likes 0 ·
Show more comments
Matt Whitfield avatar image
Matt Whitfield answered
I think you're leading yourself up the garden path with this join: LEFT JOIN sys.columns c ON column_id = partition_column_id AND c.object_id = p.object_id By convention, `partition_column_id ` doesn't relate to the column ID in the table, but, in face, to the ID of the column within the partition. You can see this same convention in the table `[sys].[index_columns]` - `index_column_id` represents a unique ID for the column within the index, rather than a reference to the column ID itself. I think you'd be better off inferring a uniquifier status from the other columns. As to how you might do that, I can only guess. -1 for `leaf_offset` looks like a good start. But, given the fact that I doubt the fact that the initial join to `sys.columns` is valid anyway, perhaps you might want to take a different route?
9 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·
This is where I was struggling to 'understand' the query. The lack of documentation on sys.system_internals_partition_columns doesn't help, but given it is an internal view for internal use only, I guess that's the way it goes......
1 Like 1 ·
Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
Just to be super-clear: By adding C int to the original table definition you can clearly see that it's joining the uniquifier to column C in error. Column C has absolutely nothing to do with the NCI, and it's presence alone proves beyond reasonable doubt that the join is flawed.
1 Like 1 ·
Martin 1 avatar image Martin 1 commented ·
The DMV isn't documented but the Join on `column_id = partition_column_id` [seems to work fine and this is what Kalen Delaney uses here]( http://sqlblog.com/blogs/kalen_delaney/archive/2006/10/13/alter-table-will-not-reclaim-space.aspx)
0 Likes 0 ·
Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
Ok, try your initial script, but change the initial definition of the table to CREATE TABLE Foo (A int, B int, C int); Have a look at the result. Don't believe everything you read on the internet.
0 Likes 0 ·
Martin 1 avatar image Martin 1 commented ·
Yes you're right. Bummer! This is hardly from an unknown/untrusted source though. Kalen Delaney wrote the book on SQL Server 2008 internals and as far as I remember that Join is in there as well.
0 Likes 0 ·
Show more comments
Mark S Rasmussen avatar image
Mark S Rasmussen answered
While working on my parsing of NCI records I've attempted to identify the column ordinal of the uniquifier in nonclustered indexes as well. My current conclusion is that the uniquifier is the only integer (system_type_id = 56) that is stored in the variable length section of a record, thus any column with system_type_id = 56 AND leaf_offset < 0 is bound to be the uniquifier. Should that not be the case, I've also devised a query to filter out the explicitly included columns in the NCI, leaving just the implicitly included ones that make up the clustered key - including the uniquifier. Correlating that with the columns that are part of the clustered key, we can use the power of deduction to find the uniquifier as well. I've just blogged about the topic in somewhat more detail: http://improve.dk/archive/2011/07/06/determining-the-uniquifier-column-ordinal-for-clustered-and-nonclustered-indexes.aspx
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.