question

Tatyana avatar image
Tatyana asked

How not to lose a clustered index advantage

I'm apologizing for an obscure question. Here is the situation.

I have a large table, lets call it Order, with a PK nOrder. Each order is owned by only one customer, and their connection is stored in a cross-reference table OrderCustomer that has its own PK, and indexes nOrder and nCustomer. Customers' data is stored in Customer table, and so on. Pretty standard schema, right? However, is the cross reference table (OrderCustomer) really needed? Why don't we put nCustomer field directly into Order table and save the trip? And so I did, and now when I need to pull a list of order with all the details, including details on customers, their locations, and what not, I have one inner join less. Compare: 1. Select [list of fields] From Order O Inner Join OrderCustomer OCU ON O.nOrder = OCU.nOrder Inner Join Customer C ON C.nCustomer = OCU.nCustomer. 2. Select [list of fields] From Order O Inner Join Customer C ON C.nCustomer = O.nCustomer.

Was there a visible performance improvement #2 over #1? NO.The first query takes 3% of time compare to the second 97%: The long list of fields from Order table was by default included in the nOrder index used in the join, and with now join on nCustomer the plan shows key lookup, instead of clustered index seek.

I know that I need to create a covering index on nCustomer field with all the fields in the output included, but wouldn't this be a duplication of already existing index? Is there a way to still take advantage of the clustered key?

indexclustered-indexinner-join
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image
Kev Riley answered

The original schema design suggests that it was possible for an order to be linked to more than one customer (yes does seem odd, but that's the only reason for having a link table between Orders and Customers)

There's no need to create a covering index, like you say it would be an almost duplication of the clustered index, and likely as large, so there's no real need. Just have an index on nCustomer on the Orders table. It will be small compared to the size of the table. Could also have a unique constraint on (nOrder, nCustomer) which would enforce the 'one order - one customer' rule.

When you say you are concerned about losing the 'advantage' of the clustered index, what advantage is it that you perceive is being lost?

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Tatyana avatar image
Tatyana answered

@Kev Riley thank you for your answer! You're right: there was a plan to have more than one customer per order. This never happened; but each query pays for this design with a trip to OrderCusotmer table (plan 1):

Plan 2 is the same query, but now nCustomer field is added to the Order table, and there is an Index on this field with nOrder included column (IDX_Order_27). Apparently, clustered index on nOrder is not efficient.

Plan 3 is how the same query looks when there is an index on nCustomer field with some 30 columns included, which, again, just duplicates the CI.

This is what I meant by "losing a clustered index advantage'. Is there any work around this?


plans.jpg (46.7 KiB)
4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·

For me I'd prefer Plan 2. There's no inherent harm with having key lookups. What are the yellow warnings on Plan1 and Plan3 telling you?

0 Likes 0 ·
Tatyana avatar image Tatyana commented ·

Yellow warning are missing statistics that by some reason don't go away immediately after the statistics were added. The execution for the plan 2, however, takes several times longer than the old one. And it's a critical SP that populates one of the most frequently used screens in the app, so I cannot mess with it, - and the execution time's difference (when it uses plan1 VS plan 2) is just shocking:

Basically, I had to back out of the idea of merging OrderCustomer table with Order table... (:

0 Likes 0 ·
ex-plan-cropped.png (21.2 KiB)
Kev Riley avatar image Kev Riley ♦♦ commented ·

Are you sure? The difference in execution time there is 100ms on the Duration and 17ms on CPU. Don't look at the "Est Cost %" - that is just an estimate - look at the 'actual' metrics

0 Likes 0 ·
Tatyana avatar image Tatyana Kev Riley ♦♦ commented ·

Yes, Kev, you were right; the actual metrics - e.g. duration, writes/reads in the profiler, when the two versions of the query were running side by side, - didn't show much difference between them. Thank you very much for your help!

0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.