x

How do you approach a client/database with a performance issue?

It's a typical scenario, you have a client / database with a performance issue, and there isn't much information available about what might be wrong apart from 'it's going slowly'. Here is what I tend to do:

  • Understand the environment in which the database is running, hardware, installation and network wise
  • Try and get more information about where the performance problem lies (i.e. are there specific operations that are slow, is it slow at particular times of day etc)
  • Understand as much as possible the applications that are using the data, and find out how they interact with the database
  • Get a good handle on any existing maintenance plans that may or (usually) may not be in place

After that, I would start visiting the DMVs (for example [sys].[dm_exec_procedure_stats] and [sys].[dm_exec_query_stats]) to see if there are any obvious outliers, then go onto the plans and inspect those to see if there are any obvious problems.

Where would you start, given a problem like that? Do you find that you have a set of 'starter scripts' that give you some data that you can begin to work with?

The reason I ask is that I had exactly this scenario a couple of weeks back, and I thought I would ask you guys just to make sure I'm not going about it a completely numpty way!

more ▼

asked Apr 30, 2010 at 04:47 AM in Default

Matt Whitfield gravatar image

Matt Whitfield ♦♦
29.4k 61 65 87

Matt, just about to send an email and I dont have your address - should I use Atlantis Interactive ?
May 05, 2010 at 10:29 AM Fatherjack ♦♦
@Fatherjack - yep, mattw [at] atlantis-interactive.co.uk is me :)
May 05, 2010 at 10:46 AM Matt Whitfield ♦♦
(comments are locked)
10|1200 characters needed characters left

2 answers: sort voted first

Your outline is pretty accurate to how I'd do it too. Address the server & the physical architecture right at the start. This is seldom the source of the problem, so you should be able to quickly determine whether or not things are set up right. Fix what's wrong and don't look back.

After that... things get interesting.

I usually take one of two approaches. Approach #1, gather query metrics. You can set up a server-side trace or you can look at the DMV's like you've already mentioned. Figure out what's called most frequently, what runs the longest, what uses the most cpu, what uses the most i/o, and, which queries are on more than one of these lists. With that in hand, you can begin tuning. Gathering query metrics like this has the added benefit of showing you how the queries are called. For example, we had an app that called every query twice. It was just bad code, but if we weren't looking at a trace, we'd have just thought it was active. I've also seen instances where a lookup table is called hundreds of times from one connection because the developers didn't cache data... All this type of thing is what you can get from the trace.

The other approach, and it's pretty popular, is to look at the wait states. What are you waiting on? Where are the queues too long? Identify these locations, the associate those waits to procedures and begin to tune that way.

Finally, it's always best to talk to the business users. Where do they feel pain. Maybe you think, based on good data, that a given procedure needs to be tuned because it's called hundreds of times an hour, but the users are more acutely aware of a single report that runs once a day or something. Making them happy is as important as a more rigourous approach.

With a system already in production, mature or not, you can't really establish a baseline. The baseline is today. The trick is to not waste your time tuning stuff that's not important, either to the business, or to the overall performance metrics.

more ▼

answered Apr 30, 2010 at 08:44 AM

Grant Fritchey gravatar image

Grant Fritchey ♦♦
98.1k 19 21 74

Epically good answer... Do you mind if I drop you an email with some questions on a semi-related topic?
Apr 30, 2010 at 09:12 AM Matt Whitfield ♦♦
Thanks, it's not like I wrote a book on this or anything... oh wait, I did... Kidding. Of course you can ask. I'll try to help.
Apr 30, 2010 at 09:19 AM Grant Fritchey ♦♦
Brilliant - I'll drop you an email when I get five minutes in between looking after chickens and kittens (don't ask!)
Apr 30, 2010 at 10:15 AM Matt Whitfield ♦♦
+1 to Matt for stating that chicken_importance > Grant_importance > kitten_importance. Can you tell its Friday?
Apr 30, 2010 at 11:25 AM Fatherjack ♦♦
Don't worry, there's no ORDER BY clause when you're married to the female equivalent of dr. Doolittle
Apr 30, 2010 at 04:34 PM Matt Whitfield ♦♦
(comments are locked)
10|1200 characters needed characters left

I would hope (and its a big hope) that there would be some benchmark of what normal is so that we could determine how slow it actually is, it may be that its not far off and therefore not worth investigating at the moment.
I would then look to see what had changed in the environment, new systems added to the server? Other systems going slow?
Then start to move in towards the data, are the server vital signs ok - CPU, HDD, paging etc what is using the resources, regardless of whether they are high or low.
Assuming we are still blaming SQL and our rogue system - then I look for locks, blocks, user/connection count, index frag, and then the bunch of trouble shooting scripts get pulled out of the toolbag ...

more ▼

answered Apr 30, 2010 at 05:06 AM

Fatherjack gravatar image

Fatherjack ♦♦
42.3k 75 78 108

When you're coming into a system, you almost never have a baseline. I've never seen one anyway. Other than that, you've got as good an approach here as any I've seen. I'd be curious what your scripts look like. I actually don't have any magic ones any more. I just focus on collecting data & then munge that to figure out what to fix.
Apr 30, 2010 at 09:21 AM Grant Fritchey ♦♦
@Grant, Thanks. I figured its not common. We actually have some systems here that we baseline monthly, the UI is timed (only to the nearest s) to do standard user processes. We use this to decide if a "its going slow again" call are justified and also to compare new application versions - we get at least 2/year from our suppliers. Most often historically its been AV, IIS or an index frag issue. Happy to share the scripts ... email ...?
Apr 30, 2010 at 09:31 AM Fatherjack ♦♦
grantedd at gmail dot com. It's public but I don't like posting it unobfuscated.
Apr 30, 2010 at 11:43 AM Grant Fritchey ♦♦
Thanks, got it, feel free to delete now. Only actually meant did you want to share them over email, would have gone to scarydba to get it .. !
May 01, 2010 at 05:20 AM Fatherjack ♦♦
@Fatherjack - I'd be interested in seeing your toolbox scripts too actually! I think you should have my email? I'll drop you a line just in case.
May 01, 2010 at 12:23 PM Matt Whitfield ♦♦
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x51

asked: Apr 30, 2010 at 04:47 AM

Seen: 1263 times

Last Updated: Apr 30, 2010 at 04:47 AM