question

Pierre-Alain Vigeant avatar image
Pierre-Alain Vigeant asked

Why does Sql Server recommends creating an index when it already exist?

I ran a very basic query against one of our table and I noticed that the execution plan query processor is recommending that we create an index on a column

The query is

SELECT SUM(DATALENGTH(Data))
FROM Item
WHERE Namespace = 'http://some_url/some_namespace/'

After running, I get the following message

// The Query Processor estimates that implementing the following index could improve the query cost by 96.7211%.
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Item] ([Namespace])

My problem is that I already have such index on that column:

CREATE NONCLUSTERED INDEX [IX_ItemNamespace] ON [dbo].[Item] 
(
    [Namespace] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

Why is Sql Server recommending me to create such index when it already exist?

Update after Kev Riley answer

I have updated the statistics on the table by running UPDATE STATISTICS.

The Sql Server Query Analyser is now suggesting to INCLUDE a column. This is a syntax that I did not know:

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[Item] ([Namespace])
INCLUDE ([Data])

While I do not want to include Data with that index, I now more understand how the index map against the primary key and that it still need to scan for the other columns values.

sql-server-2008indexingquery-analyzer
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image
Kev Riley answered

Haven't seen that before.

Are the statistics up to date on that index?

What is being used to execute the query, ie what does the execution plan say?

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

TimothyAWiseman avatar image
TimothyAWiseman answered

I have never seen that exact scenario, but I have seen something fairly similar where it suggested an index on a column I already had indexed, but with included columns that I did not have in my index.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Grant Fritchey avatar image
Grant Fritchey answered

Are you sure you're running the query on the same server and database that you're pulling that index from? I've never seen a missing index suggestion on a table that had an index.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Blackhawk-17 avatar image
Blackhawk-17 answered

Just remember that SQL Server makes its suggestions for each individual case without looking at your dB as a whole. And totally in isolation from your application.

If you blindly follow the recommendations given you will eventually end up with a very poorly performing system.

Make sure you test the impact of changes and document them so they can be removed if necessary. Indexes are great for efficient queries but they wreak havoc on inserts and updates. You have to balance their usage with performance.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Rob Farley avatar image
Rob Farley answered

That index is being suggested because it's massively better that the index that doesn't include the Data column (as a Lookup would be needed). In fact, without the included column, the index may be ignored as the statistics may demonstrate that a table scan could prove quicker than doing a seek with many lookups.

Before it has updated statistics, it sees that an ideal index isn't in place, and figures that it should suggest one. But there is a bug in the system such that it values the statistics most of all, and having a new index (with new statistics) is the thing that it feels would impact it most. Unfortunately the Missing Indexes feature needs to try harder and recognize that even without updated statistics, the best index would include additional columns.

I've blogged about this before, showing how SQL doesn't always suggest the included columns. Missing indexes are useful, but always take your human eye to the recommendations as well.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

John Kauffman avatar image
John Kauffman answered

I have had this same problem, several times. sometimes the problem is in the execution plan, sometimes when querying missing index dmv's directly.
For example, querying the missing index stats shows that an index missing on my process_log_header table has had over a cost impact more than 100X the next most costly missing index (see query and output below).

the queries below provide data on missing indices, index usage, and on the columns used in an index.

You'll see that the system sees that an index is missing for process log. No equality columns, 1 inequality column (revision_datetime), and 1 included column (process_log_id).

Given that we have jobs that delete process logs whose revision_datetime values are over 14 days old, this index makes perfect sense. When I saw it, I decided to add the index (even though there's a similar one on revision datetime without the included column. ). In fact, in the second output, you can see I named it ix_missing_revision2, as my second attempt to add the recommended missing index. If you look in the 3rd output, you can see that i created the recommended index - revision_datetime, with process_log_id included.

In the second output, you can see that it has never been used. no seeks, no scans, no lookups. And, it's been around for a while, since there are 28 million updates to the index. We have a manual statistics rebuild process (at microsoft recommendation), and I can see that the stats have been updated many times since this index was created.

However, the stats for the missing index keep rising, and when I look at the execution plan for the query doing the deletes, it recommends this index.

Any help would be appreciated. I've had other odd situations such as: we have an archiving process that tries to delete batches of 25 entries at a time. If it times out, it decreases the size of the batch by 1. For each attempt, it issues a 'SELECT TOP X' query to get the id's to delete. When looking at one of the tables, I have found two oddities in the execution plans . First, the system generates separate plans for each different values for X in the 'select top x'. We had one process that had down-shifted from 25 to 9 entries per batch before it stabilized during a busy period, and I had separate execution plans for 25, 24, 23...9. I guess the top values aren't parameterized, which I was not expecting, but haven't researched yet and doesn't really bother me - just curious. The odd thing, though, is that some of these execution plans recommend a missing index, though others do not - even though the plans are all identical. And, again, this is an index that exists. Even more impressive, when one of my team members added an index hint to try to force it to use the index, it did use the index but it still came back with the missing index recommendation.

In this second case, I would be willing to go with the idea that statistics differences may cause the differences when the plan moved from 25 to 9 entries. that delete process had been running a couple of days, and I haven't looked to see if the indices were out of whack. given the percent of records to be deleted based on the total in the table was not very substantial, though, I'm not sold.

declare @ix_name sysname declare @table_name sysname declare @all_or_unused char(1) declare @use_percent int set @ix_name = 'ix_missing_revision2' set @table_name = 'process_log_header' set @all_or_unused = 'a' --'a' or 'u' set @use_percent = 5 --/* find missing indices*/ SELECT TOP 2 [Total Cost] = ROUND(avg_total_user_cost * avg_user_impact * (user_seeks + user_scans),0) , avg_user_impact , TableName = statement , equality_columns , inequality_columns , included_columns FROM sys.dm_db_missing_index_groups g INNER JOIN sys.dm_db_missing_index_group_stats s ON s.group_handle = g.index_group_handle INNER JOIN sys.dm_db_missing_index_details d ON d.index_handle = g.index_handle ORDER BY [Total Cost] DESC; /* find unused indices*/ SELECT DatabaseName = DB_NAME() , TableName = OBJECT_NAME(s.[object_id]) , IndexName = i.name , round(case when user_updates = 0 then 0 else cast(isnull(user_seeks, 0) + isnull(user_scans, 0) + isnull(user_lookups, 0) as float)/cast(user_updates as float)end, 2) * 100 as use_percentage , user_updates , system_updates , user_seeks , user_scans , user_lookups FROM sys.dm_db_index_usage_stats s JOIN sys.indexes i ON s.[object_id] = i.[object_id] AND s.index_id = i.index_id WHERE s.database_id = DB_ID() AND OBJECTPROPERTY(s.[object_id], 'IsMsShipped') = 0 and (@all_or_unused = 'a' or (case when user_updates = 0 then 0 else cast(isnull(user_seeks, 0) + isnull(user_scans, 0) + isnull(user_lookups, 0) as float)/cast(user_updates as float) end * 100 < @use_percent ) ) AND i.name IS NOT NULL -- Ignore HEAP indexes. and (object_name(s.object_id) = @table_name or @table_name = '*') and (i.name = @ix_name or @ix_name = '*') order by tablename, indexname --order by user_updates desc --/* find all indices and index columns for a given table/index*/ select i.name, object_name(i.object_id), c.name, index_column_id , is_included_column from sys.indexes i join sys.index_columns ic on ic.object_id = i.object_id and ic.index_id = i.index_id join sys.columns c on c.column_id = ic.column_id and c.object_id = ic.object_id where (object_name(i.object_id) = @table_name or @table_name = '*') and (i.name = @ix_name or @ix_name = '*') order by i.name, ic.index_column_id

1st query returns:

Total Cost TableName
Equality InequalityUsage Include Cloumns

575568466770
PROCESS_LOG_HEADER NULL [REVISION_DATETIME] [PROCESS_LOG_ID] 4954129327
SCHEDULE_BATCH
NULL EXPIRATION_DATETIME [SCHEDULE_BATCH_ID], [CUSTOMER_ID]

2nd query returns (forgive the lack of layout):

DatabaseName TableName IndexName use_percentage user_updates system_updates user_seeks user_scans user_lookups TMSPROD PROCESS_LOG_HEADER IX_MISSING_REVISION2 0 28473860 0 0 0 0

final query output:

name (No column name) name index_column_id is_included_column IX_MISSING_REVISION2 PROCESS_LOG_HEADER REVISION_DATETIME 1 0 IX_MISSING_REVISION2 PROCESS_LOG_HEADER PROCESS_LOG_ID 2 1

3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

obviously, I need to learn how to use the code layout piece. sorry about the illegibility
0 Likes 0 ·
Is this closer to the way you want it formatted?
0 Likes 0 ·
thanks, that's better
0 Likes 0 ·
Cyborg avatar image
Cyborg answered
This is limitation or a bug on Missing Index Feature [Missing Index Feature][1] [1]: http://sqlserver-deep-dives.blogspot.com/2010/09/missing-index-feature-limitation_28.html
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.