How does search with composite index work? How B Tree for composite index work internally? Suppose, we created index as below. Create index Idx_Customer_id_Name on customer (id, name, email) If we run below statements then get seek. Select id,name,email from customer where id = '10' and name = 'Neeraj' Select id,name,email from customer where name = 'Neeraj' and id = '10'-- As mentioned below. Select id,name,email from customer where name = 'Neeraj' -- with additional cost in execution plan and sometime execution plan suggests missing index even after index seek. Please clarify if we create the suggested index even if query seeking the index and also explain how B tree works for composite index.
I guess you already know how B-Tree works. I am not getting into it as it is a huge subject and better would be to search google, wikipedia etc. As far as the search for the composite index is concerned, it is the same as the regular index search. The only difference is that composite index would have multi-value keys to search instead of a single key to search in an order. The sort order of the composite index would be in the column order specified at the creation of the index from left to right. So it is quite important that the order of the columns specified must be according to your data access pattern. In your case the index would be sorted firstly by column id, then by name and then by email. So for instance, if you have following values stored in the customer table id name email 2 Neeraj bla 10 Neeraj yzx 1 abc xyz 3 def zyx Then the order of the index could be id name email 1 abc xyz 2 Neeraj bla 3 def zyx 10 Neeraj yzx From the above, you can see that searching id = 10 and name = 'Neeraj' would be quite easy. Hence in that case, an index seek is performed. In case of name = 'Neeraj', it is evident that in order to find 'Neeraj', the traversing is not simple anymore. Hence, more IO cost would be required to find the required rows. This is why a missing index suggestion would be propmted as an index on the name column would be more beneficiary. As far as the suggested index creation is concerned, it really depends upon your data access patterns and environment. But noticing that id seems to be the primary key and you may already have the clustered index be defined on it. I would rather ignore the suggested index and alter the existing composite index. So the initial ordering would be on the name column and then on the id. Also, having email column in the composite index may not be the best choice. I would rather have it as an included column in the composite index. So you can try something like Create index Idx_Customer_id_Name on customer (name,id) INCLUDE (email) It is only a suggestion against the limited information provided.
The index works of the keys you provide and the order of those keys determines the order of the storage of the index. So your index has id, name, email, in that order. That then determines the layout of the balanced tree. Everything will be ordered first by id, then by name, then by email. When writing t-sql, as you've seen, you don't always have to refer to all three of the columns to get use out of the index. But, you do have to refer to them in order, meaning, you must reference ID so that you can, if needed, reference name and then email (referring to where clauses and join clauses here). If you refer to name and/or email in a query, but no ID, then that index won't get used. This isn't because of the balanced tree though. It's because the statistics and specifically the histogram within the statistics. The histogram, which is one of the ways the optimizer determines if an index is useful, shows the data distribution of the first column of the index, in your example id. So, when you don't refer to id, but instead refer to name, your index isn't used because the optimizer realizes that it wouldn't be useful. All this makes sense when you think about the fact that the keys determine storage order. So if you search for name or email without referencing ID, then the index has to be scanned in order to find values. If you need to search exclusively for those other columns, you need one or more other indexes.