x

Sql Server Recursive Stored Procedure

Want to create a stored procedure with CTE and want to know what the performance hit will be:

 DECLARE @Directory Table(
     DirId INT,
     ParentId INT
 )
 DECLARE @Files Table(
     FileId INT, 
     DirId INT
 )
 
 INSERT INTO @Directory SELECT 1, NULL
 INSERT INTO @Directory SELECT 2, 1
 INSERT INTO @Directory SELECT 3, 1
 INSERT INTO @Directory SELECT 4, 2
 
 INSERT INTO @Files SELECT 1, 1
 INSERT INTO @Files SELECT 2, 1
 INSERT INTO @Files SELECT 3, 2
 INSERT INTO @Files SELECT 4, 2
 INSERT INTO @Files SELECT 5, 3
 INSERT INTO @Files SELECT 6, 3
 INSERT INTO @Files SELECT 7, 4
 INSERT INTO @Files SELECT 8, 4
 
 ;WITH Directories AS (
         SELECT  DirId,
                 ParentID
         FROM    @Directory
         WHERE   DirId = 2 
         UNION ALL
         SELECT  d.DirId,
                 d.ParentID
         FROM    @Directory d INNER JOIN
                 Directories p   ON  d.ParentId = p.DirId
 )
 SELECT  *
 FROM    Directories d INNER JOIN
         @Files f ON d.DirId = f.DirId
more ▼

asked Dec 09, 2010 at 01:12 AM in Default

avatar image

FaultyTrucker
21 1 1 2

How do you add the grey background?

Dec 09, 2010 at 01:17 AM FaultyTrucker

You mark your code and click the button with the numbers 1 & 0 (preformatted text)

Dec 09, 2010 at 01:23 AM Håkan Winther

Is this somekind of homework maybe?

Dec 09, 2010 at 01:24 AM Håkan Winther

This recursion only goes up one level from the directory that you choose, is that wanted? Try changing DirId in the root level of the recursion from 2 to 4 and see that you won't go up to the top level of the directory list.

This type of recursion will not be terribly slow. If you look at the execution plan you will see that there are two table scans for @Directory, two work tables for the recursion and a table scan of @Files. If you index correctly you should see no great problems.

Dec 09, 2010 at 06:47 AM WilliamD
(comments are locked)
10|1200 characters needed characters left

4 answers: sort voted first

To see the performance hit you need to test your query against another solution. Create two procedures, one with this code and one with another solution and take a look at different execution plans and execution statistics.

Have you figured out the other solution yet?

Keep in mind that SQL server doesn't use statistics for table variables and the engine assumes only one record in table variables. That can cause inefficient execution plans if you have many records in the table variable.

more ▼

answered Dec 09, 2010 at 01:21 AM

avatar image

Håkan Winther
16.6k 37 45 57

(comments are locked)
10|1200 characters needed characters left

The only way to know what the performance will be is to measure the performance. One way to do this, assuming we're working against a dev or test environment, don't do this in production, is to create a query window, put the two queries you're interested in, in the same window. On the standard tool bar find the little icon that shows "Client Statistics" in the hint, or click on the Query menu and select "Include Client Statistics" or hit shift-alt-s. This will turn on client statistics. Highlight the first query, run it, and you'll get some performance metrics. Highlight the second query, run that, you'll get the second queries performance metrics, side-by-side with the original, showing which values have gone up & down. For completeness, to take into account any caching that's occurred, do it one more time for each query. Now you'll have four sets of metrics that you can compare against each other.

Is this the best way to do this? Not necesarily, but it is one way. There other ways with different strengths & weaknesses, set up time, etc.

more ▼

answered Dec 09, 2010 at 05:54 AM

avatar image

Grant Fritchey ♦♦
137k 20 43 81

(comments are locked)
10|1200 characters needed characters left

The biggest problem with measuring performance of any SQL is that performance is meaningless outside the context of the data. With a SQL Query, if you are doing it right, you're actually just asking SQL Server to do something but you're not telling it HOW to do it, and the query optimizer is likely to choose different strategies according to the amount and distribution of the data. This means that a solution that works well with one set of data can be complete pants with a larger, or differently-distributed, set.

When you're judging performance, to be completely confident you really need to do it on a range of trial data of various sizes and distributions. Very few people actually do that, but it is surprising how badly some SQL 'scales' to large sets of data.

more ▼

answered Dec 10, 2010 at 02:44 PM

avatar image

Phil Factor
4.2k 8 23 21

  • You are completely right, I think we all have seen a lot of terrible example where no adequate performance tests have been performed. "Know your data" is an important lesson to learn. You have to know the size, growth, distribution and even the workload. Some queries can perform well when nothing else is executing, but under full workload it takes for ever. BUT the lesson from this question is that you can't know the performance hit without testing the performance and compare to something else. If you don't have anything to compare to, you may be forced by the business demands to use some terrible code.

Dec 12, 2010 at 11:55 PM Håkan Winther
(comments are locked)
10|1200 characters needed characters left
 CREATE TABLE Directory (
     DirPath hierarchyid PRIMARY KEY,
     DirId int,
     ParentId int
 )
 CREATE TABLE Files (
     FileId int, 
     DirId int
 )
 -- just insert dummy values for hierarchyid, they are updated later
 INSERT Directory 
 VALUES('/1/', 1, NULL), ('/2/', 2, 1), ('/3/', 3, 1), ('/4/', 4, 2)
 
 INSERT Files
 VALUES(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3), (7, 4), (8, 4)
 
 -- update table with hierarchy
 ;WITH hierarchy AS (
     SELECT '/' + CAST(DirId AS varchar(MAX)) [Path], DirId
     FROM Directory d
     WHERE d.ParentId IS NULL
     UNION ALL
     SELECT CAST(hierarchy.[path] + '/' + 
            CAST(d.DirId AS varchar(MAX)) AS varchar(MAX)), d.DirId
     FROM Directory d 
     JOIN hierarchy ON (d.parentID = hierarchy.DirId)
     )
 UPDATE d SET [DirPath] = h.[Path] + '/'
 FROM hierarchy h
 JOIN Directory d ON (h.DIRID = d.DIRID)
 
 
 -- once the hierarchy is up to date you can run queries against it, 
 -- this should beat most recursive queries
 DECLARE @root hierarchyid;
 
 SELECT @root = [DirPath]
 FROM Directory
 WHERE DirId = 2 -- set your top level directory
 
 SELECT d.DirId, f.FileId
 FROM Directory d
 JOIN Files f ON (f.DirId = d.DirId)
 WHERE [DirPath].IsDescendantOf(@root) = 1 
 ORDER BY  d.DirId, f.FileId
  
 DROP TABLE Directory, Files
more ▼

answered Dec 10, 2010 at 07:36 PM

avatar image

Scot Hauder
6.5k 13 16 22

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x2137
x1076
x57
x24

asked: Dec 09, 2010 at 01:12 AM

Seen: 3769 times

Last Updated: Dec 09, 2010 at 01:16 AM

Copyright 2016 Redgate Software. Privacy Policy