question

Bartolo avatar image
Bartolo asked

What could cause query execution-plan change?

Hello, Recently, I noticed quite strange behaviour of execution plan for a query which processes data in batches ca. 10000 records. The query joins two very large tables (ca. 20 mln and 160mln) by using indexes. At the beginning query execution plan was perfect, database engine was using only index seeks and execution of the query was not longer that 200ms. When the batch process reached 90% of the records which have to be computed (ca. 20mln) query execution plan instantly changed and index operator on the bigger table started to use index scan, I do not have to mention that execution times increased 1000 times. I thought that statistic or indexes could not have been updated so I did both suitable thinks which came to my mind: update statistic and de-fragment of the index. It just helped for very short time. What else could I do to come back to previous efficient execution plan? I would like to add that database server (version 2000) is not overloaded by other heavy process, tables are not changed very often and the same batch process works fine on SQL Server 2008. Thanks in advance,
sql-server-2000execution-planquery-tuning
4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

But if even if you do update the stats, you might end up with a query plan that's different and possibly slower, so keep that in mind as you work with it.
3 Likes 3 ·
If you were to move to SQL 2008, you could think along the lines of some of the newer query hinting / options so that a particular plan could be achieved regardless of what the statistics tell the optimiser to do.
1 Like 1 ·
Hi guys, Thanks for all your detail explanation. It seems that periodic update of statistics is the main point of consideration.
0 Likes 0 ·
Following hints helped in my problem to keep the same execution plan… when we had noticed that nested looped join had switched to merge join we added loop hint and force order option at the end of the select statement. Now query executes as before ca. 200ms Thanks again for all advices and interesting discussion how SQL Server deals with statistics!
0 Likes 0 ·
WilliamD avatar image
WilliamD answered
@Bartolo - inserting/updating 20 million records can and will change the data access methods SQL Server will use. The plan change is something that I would think to be normal in the circumstances you described. This sounds like a bulk-load operation that is not happening all the time. If this is right, I suggest you take a look at doing planned index and statistics maintenance immediately after the data load, to compensate for the data-skew that you are experiencing. SQL 2008 will handle this sort of thing much better than 2000, because the underlying storage and query execution systems have been redesigned. The redesign took this sort of thing into consideration.
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Execution plans are formed based off of stats. By you saying you can update stats and it runs fine for awhile leads me to believe your stats are getting out of date. How often are you statistics updated on a daily, weekly, monthly basis? Please tell us you don't just rely on auto update stats to keep them current.
1 Like 1 ·
Hi William, By processing in batches I meant just simple selects with join between that two tables and as I wrote at the end, both tables are not really often changed (instered/updated/delete) which I am aware causes execution plan change. If data change happens it affects maybe few hundreds records. Would it cause so severe execution-plan change?
0 Likes 0 ·
Pavel Pawlowski avatar image
Pavel Pawlowski answered
Such large amount of data, could change the index selectivity and also the statistics. In your case it seems, that the selectivity is still good enough as the optimizer uses the index. If you you have defragmented the index and update statistics as you have wrote, then it seems, that the distribution of the values in statistics changed in a way, that optimizer now started to use index scan instead of index seek. It means, that the optimizer determined from the statistics, that the index scan will be less costly than repeated seeks. From the symptoms, it is problem of the statistics. If you have a very big amount of records in the table, then the inserts doesn't fire automatic statistics recompute for the table and the latest values are not covered by the statistics. Then in this scenario the SQL server also chooses index scan instead of index seek, as it doesn't know about th new values inserted. I suggest periodic statistics update.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Grant Fritchey avatar image
Grant Fritchey answered
Basically what's happening is that as data changes you get an automated update of the statistics. It works off a formula, 600 rows + 10% of the number of rows in the table. Once that value is hit, you get an automatic scanned update of statistics. This leads to a recompile of the stored procedure. The scan is a sampled scan, not a full scan. As the data changes, depending on the data in your table, you may or may not have a well distributed set of data. If it's not well distributed, you may see a bad plan created as the statistics are updated based on sampled data. If it is well distributed, the plan you get may be accurate for the new statistics, just not what you want.
4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

+1 for the added explanation of the auto-stats update and how it can (unintentionally) bite you if the sample is bad. This only fits if the DB is set to auto-update stats though, right? (the default is on, but I have seen it turned off before).
1 Like 1 ·
As mentioned here several times, the problem will be in the not up to date statistics. If into a multi milion rows table a 10000 records batches are added and then the records are joined in next batch, depending on the values inserted (especialy in incremental values), the newly inserted records are mostly not covered by the statistics and query optimizer then choses the index scan instead of index seek. As in case of the table with 20 000 000 records you need to insert over 2 000 000 records to trigger auto statistics update. This was a root cause of several problems similar to this one which were posted here some time ago.
1 Like 1 ·
Yeah, I'm assuming that's on. It's pretty rare, although not unheard of, for that to get turned off.
0 Likes 0 ·
Exactly. It works both ways. If you cross the threshold, things change whether you want them to or not. If you haven't crossed the threshold, things remain the same whether you want them to or not.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.