How to determine that a column in `sys.system_internals_partition_columns` is the uniquifier in an NCI?
I am writing a query to find tables with any wasted space due to `ALTER TABLE` operations (that might benefit from a rebuild or having `DBCC CLEANTABLE` run on them). This does the following JOIN FROM sys.system_internals_partition_columns pc JOIN sys.partitions p ON p.partition_id = pc.partition_id LEFT JOIN sys.columns c ON column_id = partition_column_id AND c.object_id = p.object_id Originally I had assumed that any columns in `sys.system_internals_partition_columns` that do not have a match in `sys.columns` would be remnants of dropped or altered columns. However one case where this does not apply is for the uniqueifier column. `system_internals_partition_columns` has an `is_uniqueifier` column but this is only set to 1 for clustered indexes. When the uniqueifier is present in the NCI this is zero so currently my query brings back false positives. Is there any way that I can determine that a particular column listed in `sys.system_internals_partition_columns` is the uniquifier in an NCI? **Edit** As Matt points out in his answer the 2 columnids are not guaranteed to be correlated at all. So back to the drawing board on that one. Just to provide a demonstration that this is the case. USE tempdb CREATE TABLE #Foo (A CHAR(65), B CHAR(66), C CHAR(67), D CHAR(68), E CHAR(69)); CREATE CLUSTERED INDEX ixA ON #Foo(A); CREATE NONCLUSTERED INDEX ixB ON #Foo(B) INCLUDE (E); SELECT CASE max_inrow_length WHEN 4 THEN 'uniquifier' ELSE CHAR(max_inrow_length) END AS actual_column_name, column_id, name AS column_name_according_to_join, max_inrow_length, leaf_offset FROM sys.system_internals_partition_columns pc JOIN sys.partitions p ON p.partition_id = pc.partition_id LEFT JOIN sys.columns c ON column_id = partition_column_id AND c.object_id = p.object_id WHERE p.object_id = OBJECT_ID('#Foo') AND index_id > 1 DROP TABLE #Foo Which gives actual_column_name column_id column_name_according_to_join max_inrow_length leaf_offset ------------------ ----------- -------------------------------- ---------------- ----------- B 1 A 66 1 A 3 C 65 67 uniquifier 4 D 4 -1 E 2 B 69 132
I think you're leading yourself up the garden path with this join: LEFT JOIN sys.columns c ON column_id = partition_column_id AND c.object_id = p.object_id By convention, `partition_column_id ` doesn't relate to the column ID in the table, but, in face, to the ID of the column within the partition. You can see this same convention in the table `[sys].[index_columns]` - `index_column_id` represents a unique ID for the column within the index, rather than a reference to the column ID itself. I think you'd be better off inferring a uniquifier status from the other columns. As to how you might do that, I can only guess. -1 for `leaf_offset` looks like a good start. But, given the fact that I doubt the fact that the initial join to `sys.columns` is valid anyway, perhaps you might want to take a different route?
While working on my parsing of NCI records I've attempted to identify the column ordinal of the uniquifier in nonclustered indexes as well. My current conclusion is that the uniquifier is the only integer (system_type_id = 56) that is stored in the variable length section of a record, thus any column with system_type_id = 56 AND leaf_offset < 0 is bound to be the uniquifier. Should that not be the case, I've also devised a query to filter out the explicitly included columns in the NCI, leaving just the implicitly included ones that make up the clustered key - including the uniquifier. Correlating that with the columns that are part of the clustered key, we can use the power of deduction to find the uniquifier as well. I've just blogged about the topic in somewhat more detail: