question

Scot Hauder avatar image
Scot Hauder asked

Calculate running totals

In lieu of a LAG function I often see triangular type joins to calc the running total but I'm not too impressed with the performance so I threw this together. 1,000,001 rows in 27 seconds. If anyone has ideas on how to improve this please share. I would like to have a standard way to do this.

Cheers,

Scot

;WITH cte AS (
SELECT 0 [StockID],CAST(0 AS money) [Price], CAST(0 AS money) [Running Total],1 [ID]
UNION ALL
SELECT s.StockID, s.Price, [Running Total] + ISNULL(s.Price,0), ID + 1 FROM cte
JOIN Stock s ON (s.StockID = [ID])
)
SELECT cte.StockID
,cte.Price
,[Running Total]
FROM cte WHERE StockID <> 0
OPTION (MAXRECURSION 0);
t-sql
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Jeff Moden avatar image
Jeff Moden answered

Heh... I just love tests like the one in that article, Ian. They take the, quite possibly, slowest method ever (sp_OA*) of getting file details and compare it to a CLR. Even a poorly written CLR would win there.

Do I think a CLR is good for the sort of thing demonstrated in the article? It depends... for most people, it would be a good thing. For me, it's not. I never check to see if a single file exists or what the details are for a single file... I load all of the file names and details for processing from a staging directory at once. If a file doesn't exist or has the incorrect details, it either doesn't get loaded because it's not there or it gets filtered out of the full set of files returned as you would do in any table. Writing a CLR to replace xp_GetFileDetails just doesn't make any sense to me because it's RBAR to begin with.

Please... no one mis-intepret what I just said. I'm not "anti-CLR". I just find that some people over use them because it's easier for them to write procedural code instead of set based code or they're not aware of what T-SQL can actually do. I've had some very bad personal experiences with folks writing CLRs (SQLCLR if you insist). One person I worked with couldn't figure out how to write a simple "upsert" in T-SQL and actually took the time to write one as a CLR with hardcoded table names to boot! Another wrote some to return random numbers and to do a simple modulo. Writing a CLR to get the details for a single file instead of a pattern of file names just doesn't make any sense to me any more than the original xp_GetFileDetails did.

--Jeff Moden

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Pavel Pawlowski avatar image
Pavel Pawlowski answered
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
+1 - certainly the quirky method is good - some people will warn against it because it's undocumented. If you are worried about that, you can achieve good results with a CLR proc.
1 Like 1 ·
Ian Roke avatar image Ian Roke commented ·
It is undocumented BUT it IS also documented that it has not been wrong and has been implement in the days of SQL Server 6.5 and Sybase.
0 Likes 0 ·
Jeff Moden avatar image
Jeff Moden answered

Matt... If you're going to say something, don't tease folks. '-) When you say you can achieve good results with a CLR proc, you should provide one or at least a link to one.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Ian Roke avatar image
Ian Roke answered

Jeff,

Try having a look at this article called Choosing between CLR and T-SQL stored procedures: a simple benchmark which should help you with improvements to your code.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Mark avatar image
Mark answered
I don't know anything about this method's performance, but I do like its simplistic brilliance*. For some reason, I didn't know that an UPDATE...SET could use variables is this way. Also [this is from SQL Server Curry][1]: CREATE TABLE #TmpTable( ID int, Col1 int, Col2 int, RowTotal int, RunningTotal int) INSERT INTO #TmpTable SELECT 1, 5, 2, 0, 0 INSERT INTO #TmpTable SELECT 2, 14, 65, 0, 0 INSERT INTO #TmpTable SELECT 3, 34, 22, 0, 0 INSERT INTO #TmpTable SELECT 4, 56, 22, 0, 0 INSERT INTO #TmpTable SELECT 5, 7, 23, 0, 0 --QUERY DECLARE @rowtot int DECLARE @runtot int SET @rowtot = 0 -- set rowtotal to 0 SET @runtot = 0 -- set runningtotal to 0 UPDATE #TmpTable SET RowTotal = @rowtot, RunningTotal = @runtot, @rowtot = COALESCE(Col1, 0) + COALESCE(Col2, 0), @runtot = @runtot + @rowtot SELECT * FROM #TmpTable [I can't seem to paste in the result set with it formated correctly, any advice would be appreciated] **"Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius -- and a lot of courage -- to move in the opposite direction."* Albert Einstein [1]: http://www.sqlservercurry.com/2010/10/calculate-running-total-and-row-total.html
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Scot Hauder avatar image Scot Hauder commented ·
Thanks Mark, it is a variation of the quirky update that is dependent on a couple other things the article doesn't mention. There needs to be a clustered index on the table and you need to set (maxdop 1) If the query is split into multiple streams and then rejoined you cannot be certain the streams join in order--which they often do not. Any how I was looking for a non-update solution so that there is only one operation that returns the rows with the running total
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.