Without extensive knowledge of the several complex scripts that developers have created to query and manipulate large amounts of data, what is the best way to evaluate their performance (expected execution performance vs. actual)? Thank you.
Well, this is a big topic. You can use a couple of methods, like:

- Execution plan (estimated vs. actual)
- SET STATISTICS IO ON;
- A lot of DMVs
- Profiler
- Performance counters

You can use a combination of these techniques to find the most expensive queries and to work out solutions for them, but it takes some experience to use any of these tools and to tune the queries. You can read a couple of posts about this on this site, like:
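As a starting point for the DMV approach above, here is a minimal sketch that lists the most CPU-expensive cached statements. It uses the standard `sys.dm_exec_query_stats` and `sys.dm_exec_sql_text` views; the `TOP (10)` cutoff and the ordering by CPU time are just one reasonable choice (you might order by logical reads instead when I/O is the concern):

```sql
-- Sketch: top cached statements by total CPU time.
-- Note: this only covers plans still in cache, so it is a sample,
-- not a complete history of the workload.
SELECT TOP (10)
    qs.total_worker_time / 1000 AS total_cpu_ms,
    qs.execution_count,
    qs.total_logical_reads,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```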
Wow, talk about an open-ended question. There is so much possible in performance tuning that many people have written books on the topic. Without any knowledge of the code it would be hard to make suggestions.

You need to go back to basics: take a look at what your hardware can do (baselining) and run the scripts to see if the runtime meets expectations. You can use Windows performance counters and profiling software to watch for bottlenecks (hardware and software). These bottlenecks can then be investigated to see what is actually causing them.

All links of the "chain" have to be looked at or considered. Your query may be super fast, but is being called inefficiently on the client side. You may have awesome client- and server-side code, but poor network capacity. As soon as you have an overview of where the limits are, you can concentrate on investigating where the limits are being hit and why they are being hit.

You may, of course, have completely improper expectations of code execution time. If your server has 4GB RAM and a single-core processor and the code is manipulating 30GB of data, you cannot expect it to finish within 10 seconds.
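One simple way to check whether a script's runtime meets expectations is to wrap a test run in the built-in statistics settings, which report I/O and timing per statement to the Messages tab. The procedure name here is a placeholder for one of your scripts:

```sql
-- Measure actual reads and elapsed time for one run of a script,
-- then compare against what you expect for the data volume involved.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

EXEC dbo.YourLargeDataScript;  -- hypothetical: substitute your own script

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Run this on a test system where possible; the output (logical reads, CPU time, elapsed time) gives you concrete numbers to hold against your expectations and your hardware's limits.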
The one thing that you must do is establish a baseline. You need to know how things are performing now in order to evaluate how things perform later. Without a baseline, you have nothing to compare to. It could be perfectly normal for a query to take 60 minutes to run based on the query, the data, the structures, and the workload. Or that could be a true nightmare. But without a baseline set of measurements, so that you know how the system behaves normally, you can't possibly determine whether that 60 minutes is good or bad. It just is.

Three general areas of monitoring are necessary: the system, SQL Server itself, and the queries. The system and SQL Server are mostly monitored by gathering metrics with Performance Monitor. You can get additional information about SQL Server through Dynamic Management Objects (DMOs). Queries are monitored through the use of server-side traces or extended events. If you're on 2008 or greater, I strongly recommend learning extended events and skipping trace. You can also get additional information about queries through DMOs.

With these tools in place, measure the system today. There's your baseline. Measure it tomorrow, or after an update, or after a complaint from a user, or... whatever. Now you can compare one value to the other, and you've determined how things are running. Next is to figure out why, and what you can do to make them run faster.