Determine cause of a Bad Execution Plan

I had a server with some issues the other day (SQL 2005 SP3). I tracked down a job that was running every 5 minutes that was taking 3 minutes to run. Looking at the history it was previously running in under a second.

After looking at the plan it was doing a clustered index scan on a large table. I run sp_updatestats every morning at 3:00 AM and when the job ran at 7:40 AM it took under a second; when it ran at 7:45 AM it started taking 3 minutes. No major changes or data loads occurred from what I can tell between that time. When I ran sp_updatestats again, the plan changed to an index seek and execution time reduced to under a second again.

I am trying to determine what may have caused the query to start using the "Bad Plan" at that time especially since stats were just updated and no major changes to the data occurred. Any ideas?

more ▼

asked Oct 22, 2009 at 03:50 PM in Default

Ken Simmons gravatar image

Ken Simmons
466 1 1 3

What kind of Page Life Expectancy are you getting on the server?
Oct 22, 2009 at 04:34 PM Blackhawk-17
The server is due for a refresh and the PLE is averaging 191. Just trying to do a root cause now for the disruption. Short of adding a USE PLAN hint, I can't guarantee that the query will not resort to the bad plan again.
Oct 22, 2009 at 05:30 PM Ken Simmons
(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

Does the job have IF ...ELSE statements?

Is it using a proc?

I belive that sp_updatestats clears the proc cache. So that would resolve the slow performace as new stats are used.

more ▼

answered Oct 22, 2009 at 04:37 PM

sp_lock gravatar image

9.3k 25 28 31

Nope. It was just a simple select statement directly in the job step. It was just wierd how it got cached with such a bad plan. That same job has been running for a long time with no issues.
Oct 22, 2009 at 04:58 PM Ken Simmons
If the job is an SQL Stmt, have you tried running in SSMS and analysing the execution plan? Look at the indexes it is using and check the fragmentation of said indexes.
Oct 22, 2009 at 05:15 PM sp_lock
I did analyze the plan and the indexes. It was doing a scan instead of a seek. I updated stats and it went back to a seek again and fixed the problem. I had updated stats a few hours ago and all of a sudden the plan changed for the query a few hours later without any major changes in the data. I can see the query getting a new plan due to memory pressure, just not sure why it would have made such a bad decision on the plan with relatively up to date stats.
Oct 22, 2009 at 05:50 PM Ken Simmons
(comments are locked)
10|1200 characters needed characters left

Based on the various comments, this may not be an answer for your situation, but I'd like to describe a similar situation we had with a stored procedure and how we resolved it.

A certain procedure was running fine until one day it kept timing out for one of our customers. Upon investigation, we learned that their dataset was roughly twice the average size. That said, the query still should have executed in less than 30 seconds. We ran sp_recompile for that sproc and tried again. This time, it worked fine. Unfortunately, the same problem occurred again 2 days later when the same customer had another large dataset.

We determined the problem was caused by a nightly reindex job that was running. (This was added by a Microsoft consultant and was written to only reindex if fragmentation was over a certain threshold.) We tried adding sp_recompile for the problem stored procedure after the reindex, but it didn't help. Then the consultant came up with a plan that worked. We ran sp_recompile and then immediately ran the procedure for a large dataset. This forced SQL Server to cache the plan for the large dataset. That change was made months ago, and we haven't had a single complaint since.

It turns out the cached execution plan generated for a large dataset was much better for large ones and only very slightly slower for small datasets. On the other hand, the plan generated for a small dataset was optimized for that but terrible for a large one.

I don't know if this will help for the original question, but it may help others with similar issues.

more ▼

answered Oct 23, 2009 at 01:19 AM

Tom Staab gravatar image

Tom Staab
5.8k 6 8 10

(comments are locked)
10|1200 characters needed characters left

From the sounds of it, it's probable that you have some parameters passed to your query that cause a bad plan and some that cause a good plan, independent of the statistics. You could run a server side trace to try to capture recompile events or take a look at the sys.dm_exec_query_stats to see when the plan was compiled. To fix it you would want to address it like it was a parameter sniffing problem (which it sort of is). Either use the OPTIMIZE FOR query hint and supply a good parameter, or use local parameters to get a more generic plan. You might even consider creating the query WITH RECOMPILE to get a new plan each time. It really depends on if you can afford the recompile cost.

more ▼

answered Oct 23, 2009 at 09:00 AM

Grant Fritchey gravatar image

Grant Fritchey ♦♦
103k 19 21 74

It's just doing a getdate() -30 in the WHERE clause to get rid of some old records. It's also doing an ORDER BY. When I took the ORDER BY off it used a SEEK instead of a SCAN. Updated stats and it chose a SEEK again. Unfortunately, the compatability level is still set to 80 on the database so I wasn't able to take advantage of all the DMV's for troubleshooting. That was my fisrt thought, I was trying to use the DMV's until I figured out the compatability level was wrong.
Oct 23, 2009 at 12:06 PM Ken Simmons
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Oct 22, 2009 at 03:50 PM

Seen: 5103 times

Last Updated: Oct 23, 2009 at 05:03 AM