Indexes on hierarchical database running slow

Hi guys. I'm pretty noob on sql server tuning and maintenance. Hope you guys could help me with an issue that is causing some troubles on our application.

My company is running an hierarchical SQL Server 2008 R2 database with a TPT model. We got entities with a structure such as Entity -> Concept -> Operation -> Transaction -> Sale, where all of this tables are sharing only de Id from Entity with a FK.

Our performance problems came when we started to get millions of operations belonging to a small group of "scopes". Every Concept has a scope_Id and we have like 6 or 7 scopes on the database (a scope is representing an organization that operates inside the application), being just 3 of them the Big Stack Bullies, like:

Scope 1: 20M of operations Scope 2: 30M of operations Scope 3: 25M of operations Others: A few thousand operations on average

On the other hand, an operation has a sendDate. This column has a non-clustered index ordered descending. The scope_Id column on Concept has a similar index.

Then the most common queries on transactions have a where clause containing both columns. It uses to be fast if the query only asks by sendDate, but when it asks by sendDate and scope_Id, or just by scope_Id, is so damn slow and even produces disk queue depths making sort operations on tempDB.

I'm kinda sure that index on scope_Id is useless since just 3 entries contains a 90% of all records. I'm thinking about create an index on scope_Id and a datetime column similar to sendDate, but I'm not sure about this and can't simulate an environment like our production database due to its size and because is on a RDS instance from Amazon Web Services, so I can't get a bak file or generate a script big enough.

Any suggestion about how to improve queries on this situation will be so much appreciated. Thanks in advance.

more ▼

asked Mar 23, 2016 at 10:07 AM in Default

avatar image

11 3

(comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

Sounds like the skew of data could be a problem here - so for example a query against Scope2 is going to potentially read millions of rows, whereas the same query against ScopeX is a lot less. Estimating the number of rows a query will process is a key part of producing an execution plan, and SQL can choose to handle larger volumes differently than smaller ones. If these are being passed as parameters to stored procedures, then you could be getting sub-optimal execution plans due to parameter sniffing. Spilling out to tempdb is a common side-affect of a bad plan.

Is there a difference in query performance when you query for the Big Stack Bullies, vs querying against the smaller scopes?

The kinds of things I would be looking at are statement level recompiles, filtered indexes, filtered stats, partitioning - these may or may not help. If you can share an execution plan here for a badly performing query, we can suggest better alternatives.

more ▼

answered Mar 23, 2016 at 11:41 AM

avatar image

Kev Riley ♦♦
66.8k 48 65 81

Hi Kev, thanks for your answer.

Gonna ask for a permission to share an execution plan (sic). My bosses are a kinda careful about sharing info and that could be a problem for me. I know without that is like looking for a needle in a haystack, so I'm sorry.

Answering your question yeah, there's a sustantial difference in querying smaller scopes, but it's still so slow.

The thing is we have some limitations on querying cause the sql statement is auto-generated by the app using an intermediate language. It always selects all columns for all tables until reach the hierarchy level that is being queried.

I've noticed SQL Optimizer is sometimes ignoring scope_Id filter and directly scanning Concept table, which is huge, and using parallelism (degree 8) which is producing Repartition Streams and writes on tempDB. The weird thing is forcing sql statement to avoid parallelism with an option (maxdop 1) the query is much faster.

What's your opinion about join scope_Id to PK_Concept(Id) clustered index? Do you think that could split upper scope_Id index entries and get SQL Optimizer using it?


Mar 23, 2016 at 01:45 PM AngusXY

You have to be careful that you don't index just for one query. The impact of adding or changing an index will touch every operation that updates that table - you would have to measure and decide if that was acceptable. Also you are going to lock table access whilst you change that index - be careful when you do it. But that being said, the only way to see if it helps is to try it, and you seem limited to only having the one environment.

From what you've described it does sound like having scopeid in the indexes could benefit.

Mar 23, 2016 at 02:18 PM Kev Riley ♦♦

Look to SQL Sentry Plan Explorer. It will let you anonymize an execution plan so that you can share it online.

Mar 23, 2016 at 04:15 PM Grant Fritchey ♦♦

or engage with an independent SQL consultant who will sign an NDA ;-)

Mar 23, 2016 at 04:35 PM Kev Riley ♦♦
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Mar 23, 2016 at 10:07 AM

Seen: 57 times

Last Updated: Mar 23, 2016 at 04:35 PM

Copyright 2018 Redgate Software. Privacy Policy