question

maccas avatar image
maccas asked

Query Execution Length

Hi, I'm a relative newcomer to SQL so apologies if this is stupid or obvious. I'm trying to write a query but am a bit stumped about the execution performance blowing up at what seems to me a pretty innocuous point. Rather than dump the script code out I'll lay it out in generics; I'm happy to post the exact script but I suspect it's a little long and won't really add anything to my question: 1. DECLARE and SET some variables including a Table Value Parameter (TVP) 2. INSERT INTO the TVP around 6,000 rows as an extraction from a table with 2 billion rows. This uses an index on the source table and takes about 20 secs, which I know because I'm printing to the message window so no issue here (I think). 3. DELETE FROM the TVP to strip out about 500 rows of bad data. I think this is easier than trying to JOIN to the mega table in step 2, which would be needed to extract the right data in one go. 4. A series of 10 Common Table Expressions (CTEs) to process the data in the TVP. It is on the tenth CTE that performance blows up. The tenth CTE is doing a LEFT JOIN between a preceding CTE with c. 1800 rows and 2 columns and another preceding CTE with c. 1500 rows and 3 columns. These two CTEs take about 25 secs and 45 secs to calculate respectively, the bulk of which time is inserting into the TVP each time. However when I then come to LEFT JOIN these two performance always blows up. I've cancelled the query each time it got over a couple of minutes before but I've let it run this time whilst I look on the internet. The query is still executing at 54 minutes and counting! So to my question: have you ever heard of this sort of thing? If so what could be causing it? How do I debug? And the million dollar question, how do I fix it? Any replies gratefully received. N.B. I'm running an actual execution plan on the query but can't see anything yet as the query hasn't finished executing yet. I'm SQL Server 2008 if that makes any difference. Cheers, Maccas XML Plan attached: [XMLPlan.zip][1] [1]: http://ask.sqlservercentral.com/storage/temp/31-XMLPlan.zip
sql-server-2008performancecte
XMLPlan.zip (32.5 KiB)
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

maccas avatar image maccas commented ·
Wow thanks for replying so quickly on a Sunday night. I do get that CTEs are evaluated & collapsed at run-time. I think I described them that way as the way I'm building up this query I'm using each one in a chain to move my manipulation along. Obviously if you know a better way to be doing this I'd be delighted to hear / learn. The code is attached below. I'm not sure you need much example data as the TVP defines the shape of the data I'm working with. AccountOutput is the monster table with a bit over 2 billion rows. Tips on on reducing the steps involved obvious welcomed but this is above and beyond the sort of help I was expecting. I use NOLOCKS as this query can hit a production database database during work hours, I don't care about dirty reads and cannot afford to lock up the DB for any length of time.
-- Use NOLOCK everywhere
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- ###################################################
-- Declare and set variables
-- ###################################################

DECLARE @CurrBudScen INT;
DECLARE @PrevBudScen INT;
DECLARE @OutputMsg VARCHAR(200);
DECLARE @AOData TABLE
			(AccountID INT NOT NULL
			,LeaseID INT NOT NULL
			,ScenarioID INT NOT NULL
			,PeriodID INT NOT NULL
			,Value DECIMAL(12,2) NOT NULL);

SET @CurrBudScen = 6410;
SET @PrevBudScen = 4757;

-- ###################################################
-- Get Data from AccountOutput
-- ###################################################

-- Current Budget
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SET @OutputMsg =  CONVERT(VARCHAR(30), GETDATE(), 8) + ' - Get Data from AccountOutput: Current Budget'
PRINT @OutputMsg

INSERT INTO @AOData

	SELECT
		ao.AccountID
		,ao.EntityID AS [LeaseID]
		,ao.ScenarioID
		,ao.PeriodID
		,ao.Value
		
	FROM AccountOutput ao

	WHERE
		ao.AccountID IN (82,83) -- New and Previous Rent on Rent Review
		AND ao.ScenarioID = @CurrBudScen
		AND ao.IsComparisonCopy = 0
		AND ao.IsRefScenario = 0;

-- (3431 row(s) affected)

-- Previous Budget
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SET @OutputMsg =  CONVERT(VARCHAR(30), GETDATE(), 8) + ' - Get Data from AccountOutput: Previous Budget'
PRINT @OutputMsg

INSERT INTO @AOData

	SELECT
		ao.AccountID
		,ao.EntityID AS [LeaseID]
		,ao.ScenarioID
		,ao.PeriodID
		,ao.Value
		
	FROM AccountOutput ao

	WHERE
		ao.AccountID IN (82,83) -- New and Previous Rent on Rent Review
		AND ao.ScenarioID = @PrevBudScen
		AND ao.IsComparisonCopy = 0
		AND ao.IsRefScenario = 0;

-- (3549 row(s) affected)

-- Remove Orphaned Data
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SET @OutputMsg =  CONVERT(VARCHAR(30), GETDATE(), 8) + ' - Get Data from AccountOutput: Remove Orphaned Data'
PRINT @OutputMsg

DELETE aod

FROM @AOData aod

	LEFT JOIN Lease l ON
		l.LeaseID = aod.LeaseID
	
	LEFT JOIN LeaseScenarioDetail lsd ON
		lsd.LeaseID = l.LeaseID
		AND lsd.ScenarioID = aod.ScenarioID

	LEFT JOIN Asset a ON
		a.AssetID = l.AssetID
	
	LEFT JOIN AssetScenarioDetail asd ON
		asd.AssetID = a.AssetID
		AND asd.ScenarioID = aod.ScenarioID

	LEFT JOIN [Site] s ON
		s.SiteID = a.SiteID
	
	LEFT JOIN SiteScenarioDetail ssd ON
		ssd.SiteID = s.SiteID
		AND ssd.ScenarioID = aod.ScenarioID

WHERE
	lsd.RecordStatus = 316
	OR asd.RecordStatus = 316
	OR ssd.RecordStatus = 316
	OR lsd.RecordStatus IS NULL
	OR asd.RecordStatus IS NULL
	OR ssd.RecordStatus IS NULL;

-- (498 row(s) affected)

-- ###################################################
-- CTEs
-- ###################################################

-- Move new rent on review and previous rent on review
-- into columns on th same rwo for current budget 
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

WITH CurrBudgetData AS

	(SELECT
		aod1.LeaseID
		,aod1.PeriodID
		,aod1.Value AS [NewRentOnReview]
		,aod2.Value AS [PrevRentOnReview]
		
	FROM @AOData aod1
		
		INNER JOIN @AOData aod2 ON
			aod2.LeaseID = aod1.LeaseID
			AND aod2.ScenarioID = aod1.ScenarioID
			AND aod2.PeriodID = aod1.PeriodID
			and aod2.AccountID = 83 --Previous Rent on Rent Review
	WHERE
		aod1.AccountID = 82 --New Rent on Rent Review
		AND aod1.ScenarioID = @CurrBudScen)

-- Move new rent on review and previous rent on review
-- into columns on th same rwo for previous budget 
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

,PrevBudgetData AS

	(SELECT
		aod1.LeaseID
		,aod1.PeriodID
		,aod1.Value AS [NewRentOnReview]
		,aod2.Value AS [PrevRentOnReview]
		
	FROM @AOData aod1
		
		INNER JOIN @AOData aod2 ON
			aod2.LeaseID = aod1.LeaseID
			AND aod2.ScenarioID = aod1.ScenarioID
			AND aod2.PeriodID = aod1.PeriodID
			and aod2.AccountID = 83 --Previous Rent on Rent Review
	WHERE
		aod1.AccountID = 82 --New Rent on Rent Review
		AND aod1.ScenarioID = @PrevBudScen)

-- Create map between current budget and previous budget
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-- Match rent reviews in same month under same SiteID 

,CurrToPrevMAP1 AS

	(SELECT
		cbd.LeaseID AS [CBLeaseID]
		,pbd.LeaseID AS [PBLeaseID]
		,cbd.PeriodID AS [PeriodID]
		
	FROM CurrBudgetData cbd
		
		INNER JOIN Lease lc ON
			lc.LeaseID = cbd.LeaseID

		INNER JOIN Asset ac ON
			ac.AssetID = lc.AssetID
		
		INNER JOIN Asset ap ON
			ap.SiteID = ac.SiteID
		
		INNER JOIN Lease lp ON
			lp.AssetID = ap.AssetID
		
		INNER JOIN PrevBudgetData pbd ON
			pbd.LeaseID = lp.LeaseID
			AND pbd.PeriodID = cbd.PeriodID)

-- Pick from first list those matches with same LeaseID 

,CurrToPrevMAP2 AS

	(SELECT
		ctpm.CBLeaseID
		,ctpm.PBLeaseID
		,ctpm.PeriodID
		
	FROM CurrToPrevMAP1 ctpm
	
	WHERE ctpm.CBLeaseID = ctpm.PBLeaseID)

-- Pick from first list those matches with different LeaseID
-- that also don't have a same leaseID match (this will be demmed 
-- to over-rule the non-matching LeaseID initial match) 

,CurrToPrevMAP3 AS

	(SELECT
		ctpm.CBLeaseID
		,ctpm.PBLeaseID
		,ctpm.PeriodID
		
	FROM CurrToPrevMAP1 ctpm
	
	WHERE
		ctpm.CBLeaseID <> ctpm.PBLeaseID
		AND ctpm.CBLeaseID NOT IN (SELECT CBLeaseID FROM CurrToPrevMAP2))

-- Aggregate two sub-selections above
-- i.e. remove non-matching LeaseID initial matches where same
-- LeaseID matches also found under the same site

,CurrToPrevMAP4 AS

	(SELECT * FROM CurrToPrevMAP2
		
	UNION ALL
	
	SELECT * FROM CurrToPrevMAP3)

-- Get list of all insatnces of unique CBLeaseID & PeriodID combos

,CurrToPrevMAP5 AS
	
	(SELECT DISTINCT CBLeaseID, PeriodID
	FROM CurrToPrevMAP4 AS A
	WHERE
        (SELECT COUNT(CBLeaseID) 
         FROM CurrToPrevMAP4 AS B
         WHERE B.CBLeaseID = A.CBLeaseID
         AND B.PeriodID = A.PeriodID) = 1)

-- Use above unique list to NULL out any residual mutliple
-- matches on the grounds that these are too difficult 
-- to programatically match 

,CurrToPrevMAP6 AS
	
	(SELECT DISTINCT
		ctpm4.CBLeaseID
		,CASE WHEN ctpm5.CBLeaseID IS NULL THEN NULL ELSE ctpm4.PBLeaseID END AS [PBLeaseID]
		,ctpm4.PeriodID
	FROM CurrToPrevMAP4 ctpm4
		LEFT JOIN CurrToPrevMAP5 ctpm5 ON
			ctpm5.CBLeaseID = ctpm4.CBLeaseID
			AND ctpm5.PeriodID = ctpm4.PeriodID)     

-- Get distinct Lease, PeriodID list from current budget

,CurrToPrevMAP7 AS

	(SELECT DISTINCT
		cbd.LeaseID AS [CBLeaseID]
		,cbd.PeriodID AS [PeriodID]
		
	FROM CurrBudgetData cbd)

-- Final Map!!!
-- Left Join unique list from current budget to mapped IDs for
-- final map

,CurrToPrevMAP AS

	(SELECT
		ctpm7.CBLeaseID
		,ctpm6.PBLeaseID
		,ctpm7.PeriodID
		
	FROM CurrToPrevMAP7 ctpm7
	
		LEFT JOIN CurrToPrevMAP6 ctpm6 ON
			ctpm7.CBLeaseID = ctpm6.CBLeaseID
			AND ctpm7.PeriodID = ctpm6.PeriodID)
			
-- ###################################################
-- Main Query
-- ###################################################

-- takes 45 seconds
SELECT * FROM CurrToPrevMAP6;

-- takes 25 seconds
SELECT * FROM CurrToPrevMAP7;

-- takes over two hours and counting?
SELECT * FROM CurrToPrevMAP;

-- Reset default LOCK behaviour
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
0 Likes 0 ·
maccas avatar image maccas commented ·
The line breaks in that script got all messed up. I just copied and pasted from SSMS, is there something else I should be doing? I tried attaching the XML actual execution plan from the query, which finally finished running in an epic 2 hours 34 mins, but it exceeds the attachment file size limit. I'll try uploading the xml file elsewhere on the internet and posting a link. Maccas
0 Likes 0 ·
maccas avatar image maccas commented ·
Here's the execution plan: http://www.macadie.co.uk/files/ExecutionPlan.xml
0 Likes 0 ·
Pavel Pawlowski avatar image
Pavel Pawlowski answered
You can post here at least Estimated execution plan which you can receive even without executing the query as @Kev Riley mentioned The issue with TVP and table variables in general is, that there are no statistics generated for them in SQL server and as a result, query optimizer always evaluates the table variables as if they have single record. So if you have for example a few thousands of records in the table variable and joining it with multi-bilion table, it can be a real performance killer if SQL server chooses wrong query plan because of missing statistics. In this case it could be better use regular temp table for which statistics are generated and use this in join. This is one of possible scenarios. But better would be to see a real plan (at least the estimated one).
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Pavel Pawlowski avatar image Pavel Pawlowski commented ·
After looking on the query plan [XMLPlan.zip][1], it's really crazy and better to re-work the query. Eg. at least split it and put intermediate result into some temp table and then continue. As it is written now, the JOINS to big tables are replicated by the subsequent CTEs. What more as I have mentioned the missing statistics on the Table Variable are also big problem in this situation, so I suggest to replace it by regular temp table at least too. [1]: http://ask.sqlservercentral.com/storage/attachments/31-XMLPlan.zip
0 Likes 0 ·
Kev Riley avatar image
Kev Riley answered
First of all there's no such thing as stupid questions! Don't think of the 10 CTEs as serial execution, i.e. cte1 runs, processes into cte2 etc.. the CTEs are merely definitions of queries (or result sets, if you want to think of them that way) that are only invoked at execution time. The optimizer can and will flatten the entire query and process it in the best way it can find. Having 10 does seem a little over the norm, and it could be that you have overcomplicated the query so much that the optimizer does not have the time to find the most optimal query plan. I would prefer to see the code, as while you have done a good job in describing it, nothing beats seeing the code! Plus we would also need DDL for the underlying tables, example data and expected results. You could also get, and post (as an XML file), the estimated execution plan. (Ctrl-L in SSMS)
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Usman Butt avatar image
Usman Butt answered

Oh my GOD. @maccas Seems like while trying to simplify the logic by breaking into pieces (With use of CTEs), you have over-complicated the query. There is redundant work done in the script

For e.g.

,CurrToPrevMAP1 AS

(SELECT
    cbd.LeaseID AS [CBLeaseID]
    ,pbd.LeaseID AS [PBLeaseID]
    ,cbd.PeriodID AS [PeriodID]

FROM CurrBudgetData cbd

    INNER JOIN Lease lc ON
       lc.LeaseID = cbd.LeaseID

    INNER JOIN Asset ac ON
       ac.AssetID = lc.AssetID

    INNER JOIN Asset ap ON
       ap.SiteID = ac.SiteID

    INNER JOIN Lease lp ON
       lp.AssetID = ap.AssetID

    INNER JOIN PrevBudgetData pbd ON
       pbd.LeaseID = lp.LeaseID
       AND pbd.PeriodID = cbd.PeriodID)

-- Pick from first list those matches with same LeaseID

,CurrToPrevMAP2 AS

(SELECT
    ctpm.CBLeaseID
    ,ctpm.PBLeaseID
    ,ctpm.PeriodID

FROM CurrToPrevMAP1 ctpm

WHERE ctpm.CBLeaseID = ctpm.PBLeaseID)

-- Pick from first list those matches with different LeaseID -- that also don't have a same leaseID match (this will be demmed -- to over-rule the non-matching LeaseID initial match)

,CurrToPrevMAP3 AS

(SELECT
    ctpm.CBLeaseID
    ,ctpm.PBLeaseID
    ,ctpm.PeriodID

FROM CurrToPrevMAP1 ctpm

WHERE
    ctpm.CBLeaseID <> ctpm.PBLeaseID
    AND ctpm.CBLeaseID NOT IN (SELECT CBLeaseID FROM CurrToPrevMAP2))

-- Aggregate two sub-selections above -- i.e. remove non-matching LeaseID initial matches where same -- LeaseID matches also found under the same site

,CurrToPrevMAP4 AS

(SELECT * FROM CurrToPrevMAP2

UNION ALL

SELECT * FROM CurrToPrevMAP3)

could be replaced by one CTE i.e. CurrToPrevMAP1 itself

As far as the rest of the logic is concerned, I think what you need is that PBLeaseID should be NULL in case the there are mutliple residuals i.e. COUNT(LeaseID) > 1. So your FINAL CTES should be (Since, I do not have the sample data and desired output, this is just the hint to make you optimize your query and omit the redundant work)

.....CODE BEFORE THIS CTE
,CurrToPrevMAP1 AS

(SELECT
    cbd.LeaseID AS [CBLeaseID]
    ,pbd.LeaseID AS [PBLeaseID]
    ,cbd.PeriodID AS [PeriodID]

FROM CurrBudgetData cbd

    INNER JOIN Lease lc ON
       lc.LeaseID = cbd.LeaseID

    INNER JOIN Asset ac ON
       ac.AssetID = lc.AssetID

    INNER JOIN Asset ap ON
       ap.SiteID = ac.SiteID

    INNER JOIN Lease lp ON
       lp.AssetID = ap.AssetID

    INNER JOIN PrevBudgetData pbd ON
       pbd.LeaseID = lp.LeaseID
       AND pbd.PeriodID = cbd.PeriodID
        GROUP BY cbd.LeaseID
               ,pbd.LeaseID 
                   ,cbd.PeriodID
    HAVING COUNT(cbd.LeaseID) = 1 -- FILTER OUT THE MULTIPLE RESIDUALS
    )
,CurrToPrevMAP7 AS


(SELECT DISTINCT
    cbd.LeaseID AS [CBLeaseID]
    ,cbd.PeriodID AS [PeriodID]

FROM CurrBudgetData cbd)

-- Final Map!!! -- Left Join unique list from current budget to mapped IDs for -- final map

,CurrToPrevMAP AS

(SELECT
    ctpm7.CBLeaseID
    ,ctpm.PBLeaseID -- WOULD BE NULL AS THESE VALUES WOULD ONLY BE FOR THOSE WHERE THERE WERE NO MULTIPLE RESIDUALS
    ,ctpm7.PeriodID 

FROM CurrToPrevMAP7 ctpm7

    LEFT JOIN CurrToPrevMAP1 ctpm ON
       ctpm7.CBLeaseID = ctpm.CBLeaseID
       AND ctpm7.PeriodID = ctpm.PeriodID)

I hope this will help your cause.

2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

maccas avatar image maccas commented ·
I did say I was a newcomer! Usman your suggestion is not quite right because: 1. I need to prioritise direct lease to lease matches 2. The unique list needs to be on CBLeaseID and PeriodID pairs not just CBLeaseID I'm happy to hear other suggestions on how to shorten the query, whilst maintaining the same functionality Regarding Temp Tables this has massively improved the query performance. I put in a temp tables instead of @AOData and also to replace CurrBudgetData, PrevBudgetData, CurrToPrevMAP4 & CurrToPrevMAP. It's a bit hard to say exactly how much as it feels like caching is going on but the query time has come back down to under a minute, which is good enough for my current purposes.
0 Likes 0 ·
Usman Butt avatar image Usman Butt commented ·
Glad to hear that. But I still think that there could be more room for improvement. Temp tables are my preference as well but with only few thousand rows, the DECLARED tables should not be the main problem. In future, to get much faster and better help please do add table DDLs, some sample data and desired output in addition to what you have already attached. If this would have been the case in this question, you may have got a more optimized solution from this wonderful community :)
0 Likes 0 ·
Grant Fritchey avatar image
Grant Fritchey answered
That is one seriously hairy execution plan. I'm not going to try to pick it all apart, but it looks like you're falling into a classic hole. You're trying to use row by row logic on TSQL, which is a set-based processing system. Sometimes you do need to break data down into steps as you've done, but most of the time, this is an indication that you have some flaws in the data structure which need fixing or that you haven't tried enough using the set-based methods within TSQL. My suggestion, take a step back and reassess what data you need to get out of this query and then start exploring using a single statement to eliminate it. For example, just that first step after you load the table valued function, deleting data. That means you removed information that you just loaded. You made the assumption that joining to a table would be slower. It might be, but, with propery indexes and good code, joins are probably going to be substantially faster, not slower. Explore the rest of the code thinking along those lines.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.