x

Query Execution Length

Hi,

I'm a relative newcomer to SQL so apologies if this is stupid or obvious. I'm trying to write a query but am a bit stumped about the execution performance blowing up at what seems to me a pretty innocuous point.

Rather than dump the script code out I'll lay it out in generics; I'm happy to post the exact script but I suspect it's a little long and won't really add anything to my question:

  1. DECLARE and SET some variables including a Table Value Parameter (TVP)
  2. INSERT INTO the TVP around 6,000 rows as an extraction from a table with 2 billion rows. This uses an index on the source table and takes about 20 secs, which I know because I'm printing to the message window so no issue here (I think).
  3. DELETE FROM the TVP to strip out about 500 rows of bad data. I think this is easier than trying to JOIN to the mega table in step 2, which would be needed to extract the right data in one go.
  4. A series of 10 Common Table Expressions (CTEs) to process the data in the TVP. It is on the tenth CTE that performance blows up.

The tenth CTE is doing a LEFT JOIN between a preceding CTE with c. 1800 rows and 2 columns and another preceding CTE with c. 1500 rows and 3 columns. These two CTEs take about 25 secs and 45 secs to calculate respectively, the bulk of which time is inserting into the TVP each time. However when I then come to LEFT JOIN these two performance always blows up. I've cancelled the query each time it got over a couple of minutes before but I've let it run this time whilst I look on the internet. The query is still executing at 54 minutes and counting!

So to my question: have you ever heard of this sort of thing? If so what could be causing it? How do I debug? And the million dollar question, how do I fix it?

Any replies gratefully received.

N.B. I'm running an actual execution plan on the query but can't see anything yet as the query hasn't finished executing yet. I'm SQL Server 2008 if that makes any difference.

Cheers, Maccas

XML Plan attached: [XMLPlan.zip][1]

[1]: http://ask.sqlservercentral.com/storage/temp/31-XMLPlan.zip
XMLPlan.zip (33.3 kB)
more ▼

asked Feb 05, 2012 at 05:46 PM in Default

maccas gravatar image

maccas
2 1 1 2

Wow thanks for replying so quickly on a Sunday night.

I do get that CTEs are evaluated & collapsed at run-time. I think I described them that way as the way I'm building up this query I'm using each one in a chain to move my manipulation along. Obviously if you know a better way to be doing this I'd be delighted to hear / learn.

The code is attached below. I'm not sure you need much example data as the TVP defines the shape of the data I'm working with. AccountOutput is the monster table with a bit over 2 billion rows. Tips on on reducing the steps involved obvious welcomed but this is above and beyond the sort of help I was expecting.

I use NOLOCKS as this query can hit a production database database during work hours, I don't care about dirty reads and cannot afford to lock up the DB for any length of time.

-- Use NOLOCK everywhere SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- ################################################### -- Declare and set variables -- ###################################################

DECLARE @CurrBudScen INT; DECLARE @PrevBudScen INT; DECLARE @OutputMsg VARCHAR(200); DECLARE @AOData TABLE (AccountID INT NOT NULL ,LeaseID INT NOT NULL ,ScenarioID INT NOT NULL ,PeriodID INT NOT NULL ,Value DECIMAL(12,2) NOT NULL);

SET @CurrBudScen = 6410; SET @PrevBudScen = 4757;

-- ################################################### -- Get Data from AccountOutput -- ###################################################

-- Current Budget -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SET @OutputMsg = CONVERT(VARCHAR(30), GETDATE(), 8) + ' - Get Data from AccountOutput: Current Budget' PRINT @OutputMsg

INSERT INTO @AOData

SELECT
    ao.AccountID
    ,ao.EntityID AS [LeaseID]
    ,ao.ScenarioID
    ,ao.PeriodID
    ,ao.Value

FROM AccountOutput ao

WHERE
    ao.AccountID IN (82,83) -- New and Previous Rent on Rent Review
    AND ao.ScenarioID = @CurrBudScen
    AND ao.IsComparisonCopy = 0
    AND ao.IsRefScenario = 0;

-- (3431 row(s) affected)

-- Previous Budget -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SET @OutputMsg = CONVERT(VARCHAR(30), GETDATE(), 8) + ' - Get Data from AccountOutput: Previous Budget' PRINT @OutputMsg

INSERT INTO @AOData

SELECT
    ao.AccountID
    ,ao.EntityID AS [LeaseID]
    ,ao.ScenarioID
    ,ao.PeriodID
    ,ao.Value

FROM AccountOutput ao

WHERE
    ao.AccountID IN (82,83) -- New and Previous Rent on Rent Review
    AND ao.ScenarioID = @PrevBudScen
    AND ao.IsComparisonCopy = 0
    AND ao.IsRefScenario = 0;

-- (3549 row(s) affected)

-- Remove Orphaned Data -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SET @OutputMsg = CONVERT(VARCHAR(30), GETDATE(), 8) + ' - Get Data from AccountOutput: Remove Orphaned Data' PRINT @OutputMsg

DELETE aod

FROM @AOData aod

LEFT JOIN Lease l ON
    l.LeaseID = aod.LeaseID

LEFT JOIN LeaseScenarioDetail lsd ON
    lsd.LeaseID = l.LeaseID
    AND lsd.ScenarioID = aod.ScenarioID

LEFT JOIN Asset a ON
    a.AssetID = l.AssetID

LEFT JOIN AssetScenarioDetail asd ON
    asd.AssetID = a.AssetID
    AND asd.ScenarioID = aod.ScenarioID

LEFT JOIN [Site] s ON
    s.SiteID = a.SiteID

LEFT JOIN SiteScenarioDetail ssd ON
    ssd.SiteID = s.SiteID
    AND ssd.ScenarioID = aod.ScenarioID

WHERE lsd.RecordStatus = 316 OR asd.RecordStatus = 316 OR ssd.RecordStatus = 316 OR lsd.RecordStatus IS NULL OR asd.RecordStatus IS NULL OR ssd.RecordStatus IS NULL;

-- (498 row(s) affected)

-- ################################################### -- CTEs -- ###################################################

-- Move new rent on review and previous rent on review -- into columns on th same rwo for current budget -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

WITH CurrBudgetData AS

(SELECT
    aod1.LeaseID
    ,aod1.PeriodID
    ,aod1.Value AS [NewRentOnReview]
    ,aod2.Value AS [PrevRentOnReview]

FROM @AOData aod1

    INNER JOIN @AOData aod2 ON
       aod2.LeaseID = aod1.LeaseID
       AND aod2.ScenarioID = aod1.ScenarioID
       AND aod2.PeriodID = aod1.PeriodID
       and aod2.AccountID = 83 --Previous Rent on Rent Review
WHERE
    aod1.AccountID = 82 --New Rent on Rent Review
    AND aod1.ScenarioID = @CurrBudScen)

-- Move new rent on review and previous rent on review -- into columns on th same rwo for previous budget -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

,PrevBudgetData AS

(SELECT
    aod1.LeaseID
    ,aod1.PeriodID
    ,aod1.Value AS [NewRentOnReview]
    ,aod2.Value AS [PrevRentOnReview]

FROM @AOData aod1

    INNER JOIN @AOData aod2 ON
       aod2.LeaseID = aod1.LeaseID
       AND aod2.ScenarioID = aod1.ScenarioID
       AND aod2.PeriodID = aod1.PeriodID
       and aod2.AccountID = 83 --Previous Rent on Rent Review
WHERE
    aod1.AccountID = 82 --New Rent on Rent Review
    AND aod1.ScenarioID = @PrevBudScen)

-- Create map between current budget and previous budget -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-- Match rent reviews in same month under same SiteID

,CurrToPrevMAP1 AS

(SELECT
    cbd.LeaseID AS [CBLeaseID]
    ,pbd.LeaseID AS [PBLeaseID]
    ,cbd.PeriodID AS [PeriodID]

FROM CurrBudgetData cbd

    INNER JOIN Lease lc ON
       lc.LeaseID = cbd.LeaseID

    INNER JOIN Asset ac ON
       ac.AssetID = lc.AssetID

    INNER JOIN Asset ap ON
       ap.SiteID = ac.SiteID

    INNER JOIN Lease lp ON
       lp.AssetID = ap.AssetID

    INNER JOIN PrevBudgetData pbd ON
       pbd.LeaseID = lp.LeaseID
       AND pbd.PeriodID = cbd.PeriodID)

-- Pick from first list those matches with same LeaseID

,CurrToPrevMAP2 AS

(SELECT
    ctpm.CBLeaseID
    ,ctpm.PBLeaseID
    ,ctpm.PeriodID

FROM CurrToPrevMAP1 ctpm

WHERE ctpm.CBLeaseID = ctpm.PBLeaseID)

-- Pick from first list those matches with different LeaseID -- that also don't have a same leaseID match (this will be demmed -- to over-rule the non-matching LeaseID initial match)

,CurrToPrevMAP3 AS

(SELECT
    ctpm.CBLeaseID
    ,ctpm.PBLeaseID
    ,ctpm.PeriodID

FROM CurrToPrevMAP1 ctpm

WHERE
    ctpm.CBLeaseID <> ctpm.PBLeaseID
    AND ctpm.CBLeaseID NOT IN (SELECT CBLeaseID FROM CurrToPrevMAP2))

-- Aggregate two sub-selections above -- i.e. remove non-matching LeaseID initial matches where same -- LeaseID matches also found under the same site

,CurrToPrevMAP4 AS

(SELECT * FROM CurrToPrevMAP2

UNION ALL

SELECT * FROM CurrToPrevMAP3)

-- Get list of all insatnces of unique CBLeaseID & PeriodID combos

,CurrToPrevMAP5 AS

(SELECT DISTINCT CBLeaseID, PeriodID
FROM CurrToPrevMAP4 AS A
WHERE
    (SELECT COUNT(CBLeaseID) 
     FROM CurrToPrevMAP4 AS B
     WHERE B.CBLeaseID = A.CBLeaseID
     AND B.PeriodID = A.PeriodID) = 1)

-- Use above unique list to NULL out any residual mutliple -- matches on the grounds that these are too difficult -- to programatically match

,CurrToPrevMAP6 AS

(SELECT DISTINCT
    ctpm4.CBLeaseID
    ,CASE WHEN ctpm5.CBLeaseID IS NULL THEN NULL ELSE ctpm4.PBLeaseID END AS [PBLeaseID]
    ,ctpm4.PeriodID
FROM CurrToPrevMAP4 ctpm4
    LEFT JOIN CurrToPrevMAP5 ctpm5 ON
       ctpm5.CBLeaseID = ctpm4.CBLeaseID
       AND ctpm5.PeriodID = ctpm4.PeriodID)     

-- Get distinct Lease, PeriodID list from current budget

,CurrToPrevMAP7 AS

(SELECT DISTINCT
    cbd.LeaseID AS [CBLeaseID]
    ,cbd.PeriodID AS [PeriodID]

FROM CurrBudgetData cbd)

-- Final Map!!! -- Left Join unique list from current budget to mapped IDs for -- final map

,CurrToPrevMAP AS

(SELECT
    ctpm7.CBLeaseID
    ,ctpm6.PBLeaseID
    ,ctpm7.PeriodID

FROM CurrToPrevMAP7 ctpm7

    LEFT JOIN CurrToPrevMAP6 ctpm6 ON
       ctpm7.CBLeaseID = ctpm6.CBLeaseID
       AND ctpm7.PeriodID = ctpm6.PeriodID)

-- ################################################### -- Main Query -- ###################################################

-- takes 45 seconds SELECT * FROM CurrToPrevMAP6;

-- takes 25 seconds SELECT * FROM CurrToPrevMAP7;

-- takes over two hours and counting? SELECT * FROM CurrToPrevMAP;

-- Reset default LOCK behaviour SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
Feb 05, 2012 at 07:31 PM maccas

The line breaks in that script got all messed up. I just copied and pasted from SSMS, is there something else I should be doing?

I tried attaching the XML actual execution plan from the query, which finally finished running in an epic 2 hours 34 mins, but it exceeds the attachment file size limit. I'll try uploading the xml file elsewhere on the internet and posting a link.

Maccas
Feb 05, 2012 at 07:38 PM maccas
Here's the execution plan: http://www.macadie.co.uk/files/ExecutionPlan.xml
Feb 05, 2012 at 07:42 PM maccas
(comments are locked)
10|1200 characters needed characters left

4 answers: sort newest

You can post here at least Estimated execution plan which you can receive even without executing the query as @Kev Riley mentioned

The issue with TVP and table variables in general is, that there are no statistics generated for them in SQL server and as a result, query optimizer always evaluates the table variables as if they have single record.

So if you have for example a few thousands of records in the table variable and joining it with multi-bilion table, it can be a real performance killer if SQL server chooses wrong query plan because of missing statistics.

In this case it could be better use regular temp table for which statistics are generated and use this in join.

This is one of possible scenarios. But better would be to see a real plan (at least the estimated one).
more ▼

answered Feb 05, 2012 at 06:44 PM

Pavel Pawlowski gravatar image

Pavel Pawlowski
22.3k 9 11 21

After looking on the query plan [XMLPlan.zip][1], it's really crazy and better to re-work the query.

Eg. at least split it and put intermediate result into some temp table and then continue. As it is written now, the JOINS to big tables are replicated by the subsequent CTEs.

What more as I have mentioned the missing statistics on the Table Variable are also big problem in this situation, so I suggest to replace it by regular temp table at least too.

[1]: http://ask.sqlservercentral.com/storage/attachments/31-XMLPlan.zip
Feb 05, 2012 at 08:35 PM Pavel Pawlowski
(comments are locked)
10|1200 characters needed characters left

That is one seriously hairy execution plan. I'm not going to try to pick it all apart, but it looks like you're falling into a classic hole. You're trying to use row by row logic on TSQL, which is a set-based processing system. Sometimes you do need to break data down into steps as you've done, but most of the time, this is an indication that you have some flaws in the data structure which need fixing or that you haven't tried enough using the set-based methods within TSQL.

My suggestion, take a step back and reassess what data you need to get out of this query and then start exploring using a single statement to eliminate it. For example, just that first step after you load the table valued function, deleting data. That means you removed information that you just loaded. You made the assumption that joining to a table would be slower. It might be, but, with propery indexes and good code, joins are probably going to be substantially faster, not slower.

Explore the rest of the code thinking along those lines.
more ▼

answered Feb 06, 2012 at 10:51 AM

Grant Fritchey gravatar image

Grant Fritchey ♦♦
98.1k 19 21 74

(comments are locked)
10|1200 characters needed characters left

Oh my GOD. @maccas Seems like while trying to simplify the logic by breaking into pieces (With use of CTEs), you have over-complicated the query. There is redundant work done in the script

For e.g.

,CurrToPrevMAP1 AS

(SELECT
    cbd.LeaseID AS [CBLeaseID]
    ,pbd.LeaseID AS [PBLeaseID]
    ,cbd.PeriodID AS [PeriodID]

FROM CurrBudgetData cbd

    INNER JOIN Lease lc ON
       lc.LeaseID = cbd.LeaseID

    INNER JOIN Asset ac ON
       ac.AssetID = lc.AssetID

    INNER JOIN Asset ap ON
       ap.SiteID = ac.SiteID

    INNER JOIN Lease lp ON
       lp.AssetID = ap.AssetID

    INNER JOIN PrevBudgetData pbd ON
       pbd.LeaseID = lp.LeaseID
       AND pbd.PeriodID = cbd.PeriodID)

-- Pick from first list those matches with same LeaseID

,CurrToPrevMAP2 AS

(SELECT
    ctpm.CBLeaseID
    ,ctpm.PBLeaseID
    ,ctpm.PeriodID

FROM CurrToPrevMAP1 ctpm

WHERE ctpm.CBLeaseID = ctpm.PBLeaseID)

-- Pick from first list those matches with different LeaseID -- that also don't have a same leaseID match (this will be demmed -- to over-rule the non-matching LeaseID initial match)

,CurrToPrevMAP3 AS

(SELECT
    ctpm.CBLeaseID
    ,ctpm.PBLeaseID
    ,ctpm.PeriodID

FROM CurrToPrevMAP1 ctpm

WHERE
    ctpm.CBLeaseID <> ctpm.PBLeaseID
    AND ctpm.CBLeaseID NOT IN (SELECT CBLeaseID FROM CurrToPrevMAP2))

-- Aggregate two sub-selections above -- i.e. remove non-matching LeaseID initial matches where same -- LeaseID matches also found under the same site

,CurrToPrevMAP4 AS

(SELECT * FROM CurrToPrevMAP2

UNION ALL

SELECT * FROM CurrToPrevMAP3)

could be replaced by one CTE i.e. CurrToPrevMAP1 itself

As far as the rest of the logic is concerned, I think what you need is that PBLeaseID should be NULL in case the there are mutliple residuals i.e. COUNT(LeaseID) > 1. So your FINAL CTES should be (Since, I do not have the sample data and desired output, this is just the hint to make you optimize your query and omit the redundant work)

.....CODE BEFORE THIS CTE
,CurrToPrevMAP1 AS

(SELECT
    cbd.LeaseID AS [CBLeaseID]
    ,pbd.LeaseID AS [PBLeaseID]
    ,cbd.PeriodID AS [PeriodID]

FROM CurrBudgetData cbd

    INNER JOIN Lease lc ON
       lc.LeaseID = cbd.LeaseID

    INNER JOIN Asset ac ON
       ac.AssetID = lc.AssetID

    INNER JOIN Asset ap ON
       ap.SiteID = ac.SiteID

    INNER JOIN Lease lp ON
       lp.AssetID = ap.AssetID

    INNER JOIN PrevBudgetData pbd ON
       pbd.LeaseID = lp.LeaseID
       AND pbd.PeriodID = cbd.PeriodID
        GROUP BY cbd.LeaseID
               ,pbd.LeaseID 
                   ,cbd.PeriodID
    HAVING COUNT(cbd.LeaseID) = 1 -- FILTER OUT THE MULTIPLE RESIDUALS
    )
,CurrToPrevMAP7 AS


(SELECT DISTINCT
    cbd.LeaseID AS [CBLeaseID]
    ,cbd.PeriodID AS [PeriodID]

FROM CurrBudgetData cbd)

-- Final Map!!! -- Left Join unique list from current budget to mapped IDs for -- final map

,CurrToPrevMAP AS

(SELECT
    ctpm7.CBLeaseID
    ,ctpm.PBLeaseID -- WOULD BE NULL AS THESE VALUES WOULD ONLY BE FOR THOSE WHERE THERE WERE NO MULTIPLE RESIDUALS
    ,ctpm7.PeriodID 

FROM CurrToPrevMAP7 ctpm7

    LEFT JOIN CurrToPrevMAP1 ctpm ON
       ctpm7.CBLeaseID = ctpm.CBLeaseID
       AND ctpm7.PeriodID = ctpm.PeriodID)

I hope this will help your cause.

more ▼

answered Feb 06, 2012 at 07:45 AM

Usman Butt gravatar image

Usman Butt
13.9k 6 8 14

I did say I was a newcomer!

Usman your suggestion is not quite right because:

  1. I need to prioritise direct lease to lease matches
  2. The unique list needs to be on CBLeaseID and PeriodID pairs not just CBLeaseID

I'm happy to hear other suggestions on how to shorten the query, whilst maintaining the same functionality

Regarding Temp Tables this has massively improved the query performance. I put in a temp tables instead of @AOData and also to replace CurrBudgetData, PrevBudgetData, CurrToPrevMAP4 & CurrToPrevMAP. It's a bit hard to say exactly how much as it feels like caching is going on but the query time has come back down to under a minute, which is good enough for my current purposes.
Feb 06, 2012 at 08:57 AM maccas
Glad to hear that. But I still think that there could be more room for improvement. Temp tables are my preference as well but with only few thousand rows, the DECLARED tables should not be the main problem. In future, to get much faster and better help please do add table DDLs, some sample data and desired output in addition to what you have already attached. If this would have been the case in this question, you may have got a more optimized solution from this wonderful community :)
Feb 06, 2012 at 09:22 AM Usman Butt
(comments are locked)
10|1200 characters needed characters left

First of all there's no such thing as stupid questions!

Don't think of the 10 CTEs as serial execution, i.e. cte1 runs, processes into cte2 etc.. the CTEs are merely definitions of queries (or result sets, if you want to think of them that way) that are only invoked at execution time. The optimizer can and will flatten the entire query and process it in the best way it can find. Having 10 does seem a little over the norm, and it could be that you have overcomplicated the query so much that the optimizer does not have the time to find the most optimal query plan.

I would prefer to see the code, as while you have done a good job in describing it, nothing beats seeing the code! Plus we would also need DDL for the underlying tables, example data and expected results.

You could also get, and post (as an XML file), the estimated execution plan. (Ctrl-L in SSMS)
more ▼

answered Feb 05, 2012 at 06:28 PM

Kev Riley gravatar image

Kev Riley ♦♦
52.8k 47 49 76

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1834
x245
x40

asked: Feb 05, 2012 at 05:46 PM

Seen: 996 times

Last Updated: Feb 06, 2012 at 11:02 AM