question

sethuraman avatar image
sethuraman asked

SQL custom cumulative sum with formulae

I am trying to achieve the below manipulated cumulative sum ![alt text][1] [1]: /storage/temp/4421-capture.jpg Have written the following SQL , however this bit of the query needs correction ELSE ( Exp(Sum(Log(Abs(NULLIF( total , 0))))) +value ) as the +value bit is not being taken in to account for consecutive aggregations. Thanks for your help!! with DTA AS ( select *, case when id = 1 then value*Multiplier else Abs(NULLIF(Multiplier,0)) end as total from testTable ) , DTB AS ( SELECT id,value,Multiplier,total, (SELECT CASE WHEN B.id = 1 THEN CONVERT(decimal(18,5), B.total) ELSE ( Exp(Sum(Log(Abs(NULLIF( total , 0))))) +value ) END FROM DTA a WHERE B.id >= A.id) as cumulativeSum FROM DTA B ) select * from DTB order by id asc
sql-server-2008sql-server-2012
capture.jpg (41.4 KiB)
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@sethuraman Are the column headers correct? From the Excel formula, it appears that the header of column B is somewhat misleading because the formula suggests that for all rows except the very first one you need to ***multiply*** the result from the previous row with ***value*** and then ***add*** the column B, which is labeled as ***multiplier***(???). If this is the case then why is column B labeled as ***multiplier***? Please clarify.
2 Likes 2 ·
sethuraman avatar image sethuraman commented ·
sorry Oleg, did not respond to this comment earlier. Yes the column name does not perfectly suit the formula.
0 Likes 0 ·
Oleg avatar image Oleg commented ·
@sethuraman I am still not sure about the claim about the quirky update working while the recursive CTE not working. Both ***do use the same formula***. In a mean time if you do need the formula to actually be ***Sum = (Value of Row1 Cumulative sum) \* Multiplier at Row2 + Value at Row2*** then all you need to do is to modify the script accordingly, that is all. For example, for recursive CTE solution you can replace the text reading t.[value] * cte.cumulative + t.multiplier with the text reading cte.cumulative * t.multiplier + t.[value] This should do the trick. If the reason CTE does not work is due to the holes in the id values then I have already mentioned it in my answer, i.e. if the id values are not sequential then the query will need to be restated slightly to accommodate this scenario. This is pretty straightforward with **row\_number()**, but it does add a bit of complexity. The quirky update is the better option anyway. By the way, if it is at all possible, I do test my scripts before posting them as answers to make sure I get expected results.
0 Likes 0 ·
ThomasRushton avatar image
ThomasRushton answered
I had a play with the so-called "Quirky Update" method, albeit only using table variables, so this isn't at all supported, but seems to work and would give you a pointer to take it further: DECLARE @t TABLE (id INTEGER NOT NULL, mult FLOAT NOT NULL, val FLOAT NOT NULL, cs FLOAT null); INSERT INTO @t (id,mult,val) VALUES (1, 6.44238, 1), (2, 0, 1.05), (3, 0.12, 1), (4, 0, 1.02), (5, 0, 1), (6, 0, 1.2531), (7, 0, 1), (8, 0, 1); DECLARE @qu FLOAT; UPDATE @t SET @qu = cs = CASE WHEN ID = 1 THEN mult * val ELSE @qu * val + mult END FROM @t OPTION (MAXDOP 1); SELECT * FROM @t; Please don't use this unless you understand what's going on. And if you're going to do this use a proper table with the right indexing and TABLOCKX. To get more information about it, see the following: - https://ask.sqlservercentral.com/questions/5150/please-can-somebody-explain-how-quirky-updates-wor.html - https://www.red-gate.com/simple-talk/sql/learn-sql-server/robyn-pages-sql-server-cursor-workbench/ - http://www.sqlservercentral.com/articles/T-SQL/68467/
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

sethuraman avatar image sethuraman commented ·
Thank You !! that works !! Did not experiment with quirky updates, due to time pressure went with creating a custom aggregate function and importing the dll in SQL. Good to know!! much appreciated .Thanks for your time
0 Likes 0 ·
Oleg avatar image
Oleg answered
The technique implemented in @ThomasRushton ♦♦ answer is superior (from performance standpoint) to any other technique to solve this kind of problem. It would be best if you could go through the links in his answer and spend some time reading the articles and posts. I am sure you will enjoy them. I know I always do, even though I already read them many times in the past. In order to accommodate the answer to your scenario, please follow these steps: when declaring the table variable, add the primary key enforced by a unique clustered index, so change the line of code in Thomas' answer to read DECLARE @t TABLE ( id INTEGER NOT NULL primary key clustered, mult FLOAT NOT NULL, val FLOAT NOT NULL, cs FLOAT null ); Then modify the insert into this table variable so that you are inserting the actual data you have: insert into @t (id,mult,val) select id, multiplier, [value] from TestTable order by id; Copy and paste the rest of the script from the answer (starting from line reading ***DECLARE @qu FLOAT;***). Just to demonstrate how cumbersome other solutions might be let's consider the recursive CTE solution. This one is limited to cover only 32,767 rows (per group of ids) at most and its' performance, just like performance of any other solution, is going to be inferior to the one with quirky update. Nevertheless, here is the script which uses recursive CTE. I am not advocating to use it, but would like to mention it in order to show that most of the time, there are multiple ways to solve any given problem, and so it is not bad to be aware of existence of other techniques which may or may not be applicable. In addition to the 32,767 limitation, the recursive CTE (as written) requires that the ID values are sequential. There are ways to deal with number which are not sequential, but this requires additional complexity to the solution. Here is the script which will work for a reasonable number of ids per group: ;with cte as ( select *, cast([value] * multiplier as decimal(18, 6)) cumulative from TestTable where id = 1 union all select t.*, cast(t.[value] * cte.cumulative + t.multiplier as decimal(18, 6)) from TestTable t inner join cte on t.id - 1 = cte.id ) select * from cte option (maxrecursion 32767); Once again, I am not advocating recursive CTE, this is just an FYI showing alternative technique, that is all. Oleg
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@sethuraman This is precisely why I questioned the definitions of your columns, they do look swapped in the image attached to your question. Please have a close look at how you actually define your aggregate: **Row 2 sum** = (**Value of Row1 Cumulative sum**) \* **Value at Row 2** + **Multiplier at Row 2**, so in the image you do use multipler to add (??? If I use the same sample data as in your question, recursive CTE results match it.
1 Like 1 ·
Oleg avatar image Oleg commented ·
Please run this script and compare results with the image attached to your question: declare @t table ( id int not null, multiplier float not null, [value] float not null); insert into @t values (1, 6.44238, 1), (2, 0, 1.05), (3, 0.12, 1), (4, 0, 1.02), (5, 0, 1), (6, 0, 1.2531), (7, 0, 1), (8, 0, 1); ;with cte as ( select *, [value] * multiplier cumulative from @t where id = 1 union all select t.*, t.[value] * cte.cumulative + t.multiplier from @t t inner join cte on t.id - 1 = cte.id ) select * from cte; -- results id multiplier value cumulative --- ---------- ------- ---------- 1 6.44238 1 6.44238 2 0 1.05 6.764499 3 0.12 1 6.884499 4 0 1.02 7.02218898 5 0 1 7.02218898 6 0 1.2531 8.79950501 7 0 1 8.79950501 8 0 1 8.79950501 Thank you.
1 Like 1 ·
sethuraman avatar image sethuraman commented ·
Hi Oleg, thanks for your time, however the recursive CTE solution does not provide the required results. In row2 answer on cumulative Sum = (Value of Row1 Cumulative sum) * Multiplier at Row2 + Value at Row2. the solution does not perform a running forward total.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.