x

The ‘FIFO Stock Inventory’ SQL Problem

Phil Factor SQL Speed Phreak Competition: No 2

alt text

This competition is now over, but the winner, Dave, got an Amazon Voucher for $60, and the privilege of being able to display the 'Phil Factor SQL Speed Phreak' award on their own site

It was quite a struggle with some close competition from many of those who participated in this competition. However, Dave came up with a clever solution that produced an FIFO calculation from a million rows in a just a few seconds. The third Phil Factor Speed Phreak competition will soon be held on here. Watch out!


(here is the original preamble.)

This competition is to calculate current items in stock and current stock value in FIFO order.
I'll tell you the business rules for the algorithm and provide a sample cursor-based routine that generates the correct result, but slowly (about 40 minutes).

I have seen many different algorithms to do this and most of them involve a cursor. Can it be done more quickly without, or even with, a cursor? In other words, what is the fastest way in SQL Server (any version) to provide this stock inventory report?
It is a reasonable request. We have a stock transaction list (1,000,001 records, 17.6MB zipped) with 15,002 unique articles and we need to do a stock inventory report that gives the current breakdown of the ArticleID, the number of items in stock and the current stock value according to FIFO rules. The list should be in ArticleID order. The sample stock transaction list will include two running totals so you can check the results of your routine. They are NOT part of the competition, and you are not allowed to use them in your calculation.

Beware that the sample file includes the two extra columns; CurrentItems (INT) and CurrentValue (MONEY) !
If you want a smaller subset to test on, import all records and remove all articles but two or three. These are the samples for which all suggestions will be measured with.

The table is in this form (simplified from the way we'd do it in a real system of course).
In this made-up but realistic problem, you have an existing table and database design which you can’t do anything about. (we can all relate to that!) You have an existing cursor-based solution that is taking several minutes to run. (Yes, we all relate to that). The database is performing badly, and you need to take effective steps to remedy this. The only thing you can do is to come up with a better-performing routine. Redesigning the database isn’t an option very often, in the real world, because this requires team sign-in. This is a competition based on the real, sometimes imperfect, world: not an exposition of database design. The point is that we are faced with designs like this and we have to heal them. The competition is all about making stuff go faster.

CREATE TABLE    dbo.Stock
                (
                    StockID INT IDENTITY(1, 1) NOT NULL,
                    ArticleID SMALLINT NOT NULL,
                    TranDate DATETIME NOT NULL,
                    TranCode VARCHAR(3) NOT NULL,
                    Items INT NOT NULL,
                    Price MONEY NULL,
                    CONSTRAINT [PK_Stock] PRIMARY KEY CLUSTERED 
                    (
                        StockID ASC
                    )
                )  

CREATE NONCLUSTERED INDEX IX_Input ON dbo.Stock (TranCode, ArticleID)
--INCLUDE (TranDate, Items, Price) -- Remove comment for SQL Server 2005 and later
--WHERE TranCode IN ('IN', 'RET')   -- Remove comment for SQL Server 2008

CREATE NONCLUSTERED INDEX IX_Output ON dbo.Stock (TranCode, ArticleID)
-- INCLUDE (TranDate, Items) -- Remove comment for SQL Server 2005 and later
--WHERE TranCode = 'OUT'  -- Remove comment for SQL Server 2008

You are welcome to change the two nonclustered indexes to suit your solution. You can download the complete sample data here. I have an idea of my own of the way to do this but I don’t know if it is the fastest.

Explanation of FIFO rules (example, abbreviated)

StockID ArticleID TranDate TranCode Items  Price CurrentItems CurrentValue
   4567     10000 10:45:07 IN         738 245.94          738   181,503.72
  21628     10000 12:05:25 OUT        600                 138    33,939.72
  22571     10000 14:39:27 IN          62 199.95          200    46,336.62
  30263     10000 16:14:13 OUT        165                  35     6,998.25
  42090     10000 18:18:58 RET          5                  40     7,998.00
  58143     10000 20:18:54 IN         500 135.91          540    75,953.00

a) First IN add 738 items (each $245.94) to the stock, for a total of $181,503.72
b) Then we take out 600 items (each 245.94) from the stock, leaving a total of $33,939.72
c) Then we insert 62 items (each 199.95) to the stock, for a total of $46,336.62
d) Then we take out 165 items (138 each 245.94 and 27 each 199.95), leaving a total of $6,998.25
e) Then we return 5 items. We can’t track at which price we took them out; so all returns are priced at the price of the latest ones inserted before the return. Even if there should be items left for the price of 245.94, the returned items are valued for 199.95. After the return, the current stock value is $7,998.00
f) The final insert adds $67,995.00 to the stock value, for a total of $75,953.00

As mentioned before, the CurrentItems and CurrentValue columns in the sample data are only included for you to validate your routines.

Here are some guidelines for your entries:

1) Include a header in your suggestion. Make sure your name and date is present.
2) Include an edition number. First edition is 1. If you later improve your current suggestion post it again as version 2. Example: “Peso 1” and if improved, “Peso 1b”, “Peso 1c” etc.
3) If you are trying a new algorithm, change the edition to “Peso 2”. If you improve this algorithm, change the version to “Peso 2b”, “Peso 2c” etc. This will save Phil hours of work in the test harness!
4) If you create a temp table, make sure you delete it in the script.
5) Keep the order of columns in output as ArticleID, CurrentItems, CurrentValue

I will allow you to use an existing tally number table (make sure it starts with 0). You can use any kind of object for this competition, except SQLCLR. If you are using a fixed tally number table, it has to be named dbo.TallyNumbers and the column named Number.
The time taken for their creation will not be included in the timings. The time measured is the “Main” script/procedure. If you want to call sub-procedures, go ahead.

The winner will be amongst the tied fastest entrants (generally there is a group of these) and it will be the one with the highest number of votes. We'll announce the winner in three week's time on 16th November.

For a starter, here is a common cursor based solution that you will find in production in many places.

CREATE TABLE    #Work
                (
                    RowID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
                    Price MONEY
                )  

DECLARE @ArticleID INT = 1,
        @PrevID INT = 0,
        @TranCode VARCHAR(3),
        @Items INT,
        @Price MONEY,
        @Loop INT = 0,
        @StockID INT,
        @LatestPrice MONEY,
        @Total INT = (SELECT COUNT(*) FROM dbo.Stock)

DECLARE curYak CURSOR FORWARD_ONLY FOR
                SELECT      ArticleID,
                            TranCode,
                            Items,
                            Price,
                            StockID
                FROM        dbo.Stock
                ORDER BY    ArticleID,
                            TranDate

OPEN    curYak

FETCH   NEXT
FROM    curYak
INTO    @ArticleID,
        @TranCode,
        @Items,
        @Price,
        @StockID

WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @ArticleID > @PrevID
            BEGIN
                TRUNCATE TABLE  #Work

                SET @LatestPrice = NULL
            END

        IF @TranCode = 'IN'
            BEGIN
                INSERT  #Work
                        (
                            Price
                        )
                SELECT  @Price
                FROM    dbo.TallyNumbers
                WHERE   Number < @Items

                SET     @LatestPrice = @Price
            END

        IF @TranCode = 'RET'
            INSERT  #Work
                    (
                        Price
                    )
            SELECT  @LatestPrice
            FROM    dbo.TallyNumbers
            WHERE   Number < @Items

        IF @TranCode = 'OUT'
            DELETE  w
            FROM    (
                        SELECT      TOP(@Items)
                                    RowID
                        FROM        #Work
                        ORDER BY    RowID
                    ) AS w

        UPDATE      s
        SET         s.CurrentItems = w.CurrentItems,
                    s.CurrentValue = COALESCE(w.CurrentValue, 0)
        FROM        dbo.Stock AS s
        INNER JOIN  (
                        SELECT  COUNT(*) AS CurrentItems,
                                SUM(Price) AS CurrentValue
                        FROM    #Work
                    ) AS w ON s.StockID = @StockID

        SELECT  @PrevID = @ArticleID,
                @Loop += 1

        IF @Loop % 1000 = 0
            RAISERROR('Now updating record %d of %d.', 10, 1, @Loop, @Total) WITH NOWAIT

        FETCH   NEXT
        FROM    curYak
        INTO    @ArticleID,
                @TranCode,
                @Items,
                @Price,
                @StockID
    END

DROP TABLE  #Work

CLOSE       curYak
DEALLOCATE  curYak

The above is already done. Here is the code to produce the final resultset.

SELECT      ArticleID,
            CurrentItems,
            CurrentValue
FROM        (
                SELECT  ArticleID,
                        CurrentItems,
                        CurrentValue,
                        ROW_NUMBER() OVER (PARTITION BY ArticleID ORDER BY TranDate DESC) AS recID
                FROM    dbo.Stock
            ) AS d
WHERE       recID = 1
ORDER BY    ArticleID

Good luck to you all!

Peter Larsson

more ▼

asked Oct 23 '09 at 03:56 PM in Default

Peso gravatar image

Peso
1.6k 5 6 8

Was there any particular reason you disallowed CLR functionality?

On an initial look, this would be the ideal scenario in which to use a CLR type...
Oct 23 '09 at 04:34 PM Peso
Two things had us made this decision: 1) We have tried CLR before and there was no performance gain. 2) To test and verify, poster need to disclode the source code for the suggestion. But if you have a CLR routine that runs on 15 seconds or less, I think Phil will be interested in testing it.
Oct 23 '09 at 05:08 PM Peso
Questions: 1) Is the Stock Transaction List referred to the one that will be used for Final Evaluation? If not, can you describe the differences? 2) What is the configuration that entries will be (finally) evaluated on?
Oct 23 '09 at 05:09 PM RBarryYoung
Yes, the downloadable file is the data against all suggestions will be measured. Phil will drop or rename the two last columns so that no suggestion can use them. I guess Phil will be using same test harness as last time with 'Subscription List' competition.
Oct 23 '09 at 05:16 PM Peso
Those filtered indexes will only work on SQL Server 2008, right?
Oct 23 '09 at 05:46 PM RBarryYoung
(comments are locked)
10|1200 characters needed characters left

28 answers: sort voted first

I had some help with this one from a friend (thanks Mark!). it's just at a second on my machine. Let me know if I did anything wrong

/* Matt Clingan 11/5/2009 Phil Factor Challenge Entry Edition 1 */

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-= --cleanup --=--=-=-=-=-=-=-=-=-=-=-=-=-=-

IF not EXISTS (SELECT 1 FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[Stock]') AND name = 'ix_StockWork_1') CREATE INDEX ix_StockWork_1 on stock(ArticleID, stockid desc,trancode,price,items) IF exists (select 1 from tempdb.sys.tables where name like '%#tt1%') DROP TABLE #tt1 IF exists (select 1 from tempdb.sys.tables where name like '%#tTemp%') DROP TABLE #tTemp IF exists (select 1 from tempdb.sys.tables where name like '%#fifofinal%') DROP TABLE #FIFOFINAL IF exists (select 1 from tempdb.sys.tables where name like '%#f%') DROP TABLE #f

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-= --temp tables --=--=-=-=-=-=-=-=-=-=-=-=-=-=-

create table #tt1 (articleid smallint,stockid int) create table #tTemp(articleid smallint,stockid int, items int,price money) create table #f(articleid smallint, finalitems int) create table #fifoFinal (articleid smallint, stockid int, itemcount int, finalitems int, finalDollars money)

insert into #tt1 select articleid,MAX(stockid) as stockid from stock where trancode='IN'
group by articleid

insert into #f select articleid,SUM(case when trancode = 'OUT' then -items else items end) as finalitems from stock group by articleid

insert into #fifofinal select #f.articleid ,#tt1.stockid ,finalitems-sum(items) ,finalitems, case when sum(items)>finalitems then finalitems else sum(items) end * max(price) from #f left join #tt1 on #f.articleid = #tt1.articleid join stock s on #tt1.articleid = s.articleid and #tt1.stockid <=s.stockid and s.trancode in ('IN','RET') group by #f.articleid,#tt1.stockid,finalitems

select * from #fifofinal
more ▼

answered Nov 05 '09 at 01:08 PM

Matt 2 gravatar image

Matt 2
1 1

Impressive! I get wrong results from this for some rows, but I don't think it's going to be difficult to fix. Impressive.
Nov 05 '09 at 02:32 PM Gianluca Sartori
See for example ArticleID 24986.
Nov 05 '09 at 03:43 PM Peso

And change last SELECT statement to

select articleid, finalitems, finaldollars from #fifofinal order by articleid

Also change all cleanup to last statement so that I can run this 10 consecutive times. No nees to drop and recreate index.
Nov 05 '09 at 03:44 PM Peso
I have investigated the query, and the culprit is "end * max(s.price)". The price is not the max price, it is the latest price for an IN transaction. The max price may be the earliest price if price value is falling.
Nov 05 '09 at 04:01 PM Peso
(comments are locked)
10|1200 characters needed characters left

I have a question regarding the rule for Returns. As follows:

You explain that because we can't track which original sales were being returned, therefore we always price returns at the price of the IN immediately preceding the return. However, this does not make sense in the case when there were no OUTS after the IN which immediately precedes the RET. In such a case, because we know that the RET could not have been of the items that just came IN. The rule should be "RET is priced at the price of the IN immediately preceding the OUT immediately preceding the RET."

This way RETs can be consistently ignored in the calculations, because only real INs count. And it is consistent with FIFO.

please advise

more ▼

answered Nov 06 '09 at 03:03 AM

Moshe gravatar image

Moshe
2

It does make sense for the economists. They want their stock to be as accurate values as possible. If we encounter a customer return ("RET"), we want the stock to be updated with the latest known value the the article used in the return. And since we cannot track exact order from which the article was taken out at, we use the latest known price available. For example, if price is increasing, we want the stock to be valued at the higher (latest) price, not the price originally assigned, and if the price is decreasing, we want the same thing, the latest known price for that article.
Nov 06 '09 at 04:31 AM Peso
Adn regarding your OUT statement. The OUTtake can span several IN transactions. Which price do you suggest we use, if an OUT drains 5,6 or more IN's?
Nov 06 '09 at 04:32 AM Peso
(comments are locked)
10|1200 characters needed characters left

Even though under FIFO, current value is tracked at the more recent purchase prices, the value of goods already in inventory is not adjusted up or down just because another purchase is made at a different price. The premise of FIFO is not to track the value of current inventory at the latest, or most accurate basis. This is not market based cost tracking, or LCM. It's one of two accounting constructs designed for an environment where a) inventory value is tracked at cost, and b) subtractions from inventory cannot be tracked back to the exact point and price of entry. In such an environment, the arbitrary, yet consistent, assumption that First in First Out provides a method to consistently track value in an inherently inconsistent and unknown environment.

The problem with always using the latest IN price for returns is that in cases where no OUTS occurred after the latest IN, or where the number of items returned exceeds the number of OUT items that sold after the latest IN, the practice violates condition a) above - that value is tracked at cost.

Since RETs can only happen if there are OUTS, it makes sense to treat RETs as cancellations of the OUTS. And just like OUTS can span several INs (that's a premise of the rule), RETs can span, or cancel, several OUTs.

I'm not hung up over it, I can see how treating a RET as another purchase at the most recent price is consistent as well, but that complicates the query:)

more ▼

answered Nov 06 '09 at 05:29 AM

Moshe gravatar image

Moshe
2

(comments are locked)
10|1200 characters needed characters left

Here's my entry. I'm calculating running count in reverse order and grouping INs and RETs based on their rank in the stock. Runs in 1.4 sec on my crappy laptop and requires "just" 53933 scans.

-- FIFO Stock Inventory -- Gianluca 1a -- 06 Nov 2009

CREATE NONCLUSTERED INDEX IX_Stock_GS1 ON STOCK (ArticleID, StockID, TranCode) INCLUDE(Items)

;WITH RunItems(StockId, ArticleId, Items, Run_Items) AS ( SELECT StockId, ArticleId, Items, Run_Items = ( SELECT SUM(Items) FROM Stock WHERE ArticleId = S.ArticleId AND StockId >= S.StockId AND TranCode IN ('IN','RET') ) FROM Stock S WITH(TABLOCKX) WHERE TranCode = 'IN' ), FilteredStock (ArticleId, DiffItems, StockId, ResidualItems) AS ( SELECT * FROM ( SELECT ArticleId, DiffItems = SUM(CASE Trancode WHEN 'OUT' THEN -A.Items ELSE A.Items END) FROM Stock AS A WITH(TABLOCKX) GROUP BY ArticleId ) AS T CROSS APPLY ( SELECT TOP 1 StockId, ResidualItems = DiffItems - R.Run_Items + R.Items FROM RunItems AS R WHERE DiffItems <= Run_Items AND ArticleId = T.ArticleId ORDER BY R.StockId DESC ) AS B ), NumberedStock (StockId, ArticleId, TranCode, Items, Price, RN2, RN) AS ( SELECT *, RN = ROW_NUMBER() OVER ( PARTITION BY ArticleId ORDER BY StockId ) FROM (
SELECT StockID, ArticleID, TranCode, Items, Price, RN2 = ROW_NUMBER() OVER ( PARTITION BY ArticleId ORDER BY StockId ) FROM Stock AS S WITH(TABLOCKX) WHERE Trancode = 'IN' UNION ALL SELECT StockID, ArticleID, TranCode, Items, Price, RN2 = ROW_NUMBER() OVER ( PARTITION BY ArticleId ORDER BY StockId ) FROM Stock AS S WITH(TABLOCKX) WHERE Trancode = 'RET' ) AS A ), SummaryStock AS ( SELECT MinStockId = MIN(StockId), ArticleId, Items = SUM(PriceItems), Price = SUM(ISNULL(Price, 0)), MIN(DiffItems) AS DiffItems FROM ( SELECT StockId = N.StockId, ArticleId = N.ArticleId, N.Price, RN = CASE N.Trancode WHEN 'IN' THEN RN2 ELSE RN - RN2 END, DiffItems, ResidualItems, PriceItems = CASE WHEN N.StockId = B.StockID THEN ResidualItems ELSE N.Items * CASE WHEN N.StockId > B.StockId THEN 1 ELSE 0 END END FROM NumberedStock AS N CROSS APPLY ( SELECT * FROM FilteredStock WHERE ArticleId = N.ArticleId AND StockId <= N.StockId ) AS B ) AS S GROUP BY ArticleId, RN ) SELECT ArticleId, CurrentItems = MAX(DiffItems), CurrentValue = SUM(Items * Price) FROM SummaryStock GROUP BY ArticleId ORDER BY ArticleId
more ▼

answered Nov 06 '09 at 06:15 AM

Gianluca Sartori gravatar image

Gianluca Sartori
228 1 4

Seems promising. I will test it when I have access to a clean test machine.
Nov 06 '09 at 07:48 AM Peso
I get between 126k to 169k reads... We'll see what Phil get on his test harness
Nov 06 '09 at 02:07 PM Peso
Uhmm... strange. On my laptop it runs faster than Dave's query. Well, I'm glad to see it's a good entry anyway.
Nov 06 '09 at 02:25 PM Gianluca Sartori
For last competition, the Subscription List SQL Problem, we saw a huge difference for suggestions run on laptops vs servers, even if they both had same version number for both instances.
Nov 06 '09 at 03:02 PM Peso
Gianluca, for Phil to accurate test this suggestion, please comment which indexes you use besides IX_Stock_GS1. Are you using the non-clustered indexes in the competition? Or variations of those?
Nov 06 '09 at 04:08 PM Peso
(comments are locked)
10|1200 characters needed characters left

Another try, with a quirky update. It doesn't perform as fast as the others, but it's another way to solve it quite quickly.

-- FIFO Stock Inventory -- Gianluca 2a -- 06 Nov 2009

CREATE TABLE #Summary ( ArticleId int, RN tinyint, Items int, Price money, OutItems int, PRIMARY KEY CLUSTERED (ArticleId, RN) )

;WITH NumberedStock (StockId, ArticleId, TranCode, Items, Price, RN2, RN) AS ( SELECT *, RN = ROW_NUMBER() OVER ( PARTITION BY ArticleId ORDER BY StockId ) FROM ( SELECT StockID, ArticleID, TranCode, Items, Price, RN2 = ROW_NUMBER() OVER ( PARTITION BY ArticleId ORDER BY StockId ) FROM Stock AS S WITH(TABLOCKX) WHERE Trancode = 'IN' UNION ALL SELECT StockID, ArticleID, TranCode, Items, Price, RN2 = ROW_NUMBER() OVER ( PARTITION BY ArticleId ORDER BY StockId ) FROM Stock AS S WITH(TABLOCKX) WHERE Trancode = 'RET' ) AS A ), SummaryStock AS ( SELECT ArticleId, RN = CASE Trancode WHEN 'IN' THEN RN2 ELSE RN - RN2 END, Items = SUM(Items), Price = SUM(ISNULL(Price, 0)), OutItems = ISNULL(( SELECT SUM(Items) FROM Stock WHERE ArticleId = N.ArticleId AND TranCode = 'OUT' ),0) FROM NumberedStock AS N GROUP BY ArticleId, CASE Trancode WHEN 'IN' THEN RN2 ELSE RN - RN2 END ) INSERT INTO #Summary SELECT * FROM SummaryStock

DECLARE @Items int = 0 DECLARE @ArticleId int = 0 DECLARE @OutItems int = 0

UPDATE #Summary SET @Items = CASE WHEN @ArticleId = ArticleId THEN @Items + Items ELSE Items END, @ArticleId = ArticleId, OutItems = CASE WHEN OutItems <= @Items THEN CASE WHEN @Items - OutItems > Items THEN Items ELSE @Items - OutItems END ELSE 0 END

SELECT ArticleId, CurrentItems = SUM(OutItems), CurrentValue = SUM(OutItems * Price) FROM #Summary GROUP BY ArticleId
more ▼

answered Nov 06 '09 at 07:34 PM

Gianluca Sartori gravatar image

Gianluca Sartori
228 1 4

This runs in 4.5 seconds, so I rounded it up to 5 seconds. We'll know better when Phil returns.
Nov 07 '09 at 03:42 AM Peso
He has returned!! I'm just pasting it into the harness...
Nov 10 '09 at 04:58 PM Phil Factor
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x977
x341
x8
x7
x5

asked: Oct 23 '09 at 03:56 PM

Seen: 27090 times

Last Updated: Nov 14 '09 at 01:55 PM