x

The ‘FIFO Stock Inventory’ SQL Problem

Phil Factor SQL Speed Phreak Competition: No 2

alt text

This competition is now over, but the winner, Dave, got an Amazon Voucher for $60, and the privilege of being able to display the 'Phil Factor SQL Speed Phreak' award on their own site

It was quite a struggle with some close competition from many of those who participated in this competition. However, Dave came up with a clever solution that produced an FIFO calculation from a million rows in a just a few seconds. The third Phil Factor Speed Phreak competition will soon be held on here. Watch out!


(here is the original preamble.)

This competition is to calculate current items in stock and current stock value in FIFO order.
I'll tell you the business rules for the algorithm and provide a sample cursor-based routine that generates the correct result, but slowly (about 40 minutes).

I have seen many different algorithms to do this and most of them involve a cursor. Can it be done more quickly without, or even with, a cursor? In other words, what is the fastest way in SQL Server (any version) to provide this stock inventory report?
It is a reasonable request. We have a stock transaction list (1,000,001 records, 17.6MB zipped) with 15,002 unique articles and we need to do a stock inventory report that gives the current breakdown of the ArticleID, the number of items in stock and the current stock value according to FIFO rules. The list should be in ArticleID order. The sample stock transaction list will include two running totals so you can check the results of your routine. They are NOT part of the competition, and you are not allowed to use them in your calculation.

Beware that the sample file includes the two extra columns; CurrentItems (INT) and CurrentValue (MONEY) !
If you want a smaller subset to test on, import all records and remove all articles but two or three. These are the samples for which all suggestions will be measured with.

The table is in this form (simplified from the way we'd do it in a real system of course).
In this made-up but realistic problem, you have an existing table and database design which you can’t do anything about. (we can all relate to that!) You have an existing cursor-based solution that is taking several minutes to run. (Yes, we all relate to that). The database is performing badly, and you need to take effective steps to remedy this. The only thing you can do is to come up with a better-performing routine. Redesigning the database isn’t an option very often, in the real world, because this requires team sign-in. This is a competition based on the real, sometimes imperfect, world: not an exposition of database design. The point is that we are faced with designs like this and we have to heal them. The competition is all about making stuff go faster.

CREATE TABLE    dbo.Stock
                (
                    StockID INT IDENTITY(1, 1) NOT NULL,
                    ArticleID SMALLINT NOT NULL,
                    TranDate DATETIME NOT NULL,
                    TranCode VARCHAR(3) NOT NULL,
                    Items INT NOT NULL,
                    Price MONEY NULL,
                    CONSTRAINT [PK_Stock] PRIMARY KEY CLUSTERED 
                    (
                        StockID ASC
                    )
                )  

CREATE NONCLUSTERED INDEX IX_Input ON dbo.Stock (TranCode, ArticleID)
--INCLUDE (TranDate, Items, Price) -- Remove comment for SQL Server 2005 and later
--WHERE TranCode IN ('IN', 'RET')   -- Remove comment for SQL Server 2008

CREATE NONCLUSTERED INDEX IX_Output ON dbo.Stock (TranCode, ArticleID)
-- INCLUDE (TranDate, Items) -- Remove comment for SQL Server 2005 and later
--WHERE TranCode = 'OUT'  -- Remove comment for SQL Server 2008

You are welcome to change the two nonclustered indexes to suit your solution. You can download the complete sample data here. I have an idea of my own of the way to do this but I don’t know if it is the fastest.

Explanation of FIFO rules (example, abbreviated)

StockID ArticleID TranDate TranCode Items  Price CurrentItems CurrentValue
   4567     10000 10:45:07 IN         738 245.94          738   181,503.72
  21628     10000 12:05:25 OUT        600                 138    33,939.72
  22571     10000 14:39:27 IN          62 199.95          200    46,336.62
  30263     10000 16:14:13 OUT        165                  35     6,998.25
  42090     10000 18:18:58 RET          5                  40     7,998.00
  58143     10000 20:18:54 IN         500 135.91          540    75,953.00

a) First IN add 738 items (each $245.94) to the stock, for a total of $181,503.72
b) Then we take out 600 items (each 245.94) from the stock, leaving a total of $33,939.72
c) Then we insert 62 items (each 199.95) to the stock, for a total of $46,336.62
d) Then we take out 165 items (138 each 245.94 and 27 each 199.95), leaving a total of $6,998.25
e) Then we return 5 items. We can’t track at which price we took them out; so all returns are priced at the price of the latest ones inserted before the return. Even if there should be items left for the price of 245.94, the returned items are valued for 199.95. After the return, the current stock value is $7,998.00
f) The final insert adds $67,995.00 to the stock value, for a total of $75,953.00

As mentioned before, the CurrentItems and CurrentValue columns in the sample data are only included for you to validate your routines.

Here are some guidelines for your entries:

1) Include a header in your suggestion. Make sure your name and date is present.
2) Include an edition number. First edition is 1. If you later improve your current suggestion post it again as version 2. Example: “Peso 1” and if improved, “Peso 1b”, “Peso 1c” etc.
3) If you are trying a new algorithm, change the edition to “Peso 2”. If you improve this algorithm, change the version to “Peso 2b”, “Peso 2c” etc. This will save Phil hours of work in the test harness!
4) If you create a temp table, make sure you delete it in the script.
5) Keep the order of columns in output as ArticleID, CurrentItems, CurrentValue

I will allow you to use an existing tally number table (make sure it starts with 0). You can use any kind of object for this competition, except SQLCLR. If you are using a fixed tally number table, it has to be named dbo.TallyNumbers and the column named Number.
The time taken for their creation will not be included in the timings. The time measured is the “Main” script/procedure. If you want to call sub-procedures, go ahead.

The winner will be amongst the tied fastest entrants (generally there is a group of these) and it will be the one with the highest number of votes. We'll announce the winner in three week's time on 16th November.

For a starter, here is a common cursor based solution that you will find in production in many places.

CREATE TABLE    #Work
                (
                    RowID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
                    Price MONEY
                )  

DECLARE @ArticleID INT = 1,
        @PrevID INT = 0,
        @TranCode VARCHAR(3),
        @Items INT,
        @Price MONEY,
        @Loop INT = 0,
        @StockID INT,
        @LatestPrice MONEY,
        @Total INT = (SELECT COUNT(*) FROM dbo.Stock)

DECLARE curYak CURSOR FORWARD_ONLY FOR
                SELECT      ArticleID,
                            TranCode,
                            Items,
                            Price,
                            StockID
                FROM        dbo.Stock
                ORDER BY    ArticleID,
                            TranDate

OPEN    curYak

FETCH   NEXT
FROM    curYak
INTO    @ArticleID,
        @TranCode,
        @Items,
        @Price,
        @StockID

WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @ArticleID > @PrevID
            BEGIN
                TRUNCATE TABLE  #Work

                SET @LatestPrice = NULL
            END

        IF @TranCode = 'IN'
            BEGIN
                INSERT  #Work
                        (
                            Price
                        )
                SELECT  @Price
                FROM    dbo.TallyNumbers
                WHERE   Number < @Items

                SET     @LatestPrice = @Price
            END

        IF @TranCode = 'RET'
            INSERT  #Work
                    (
                        Price
                    )
            SELECT  @LatestPrice
            FROM    dbo.TallyNumbers
            WHERE   Number < @Items

        IF @TranCode = 'OUT'
            DELETE  w
            FROM    (
                        SELECT      TOP(@Items)
                                    RowID
                        FROM        #Work
                        ORDER BY    RowID
                    ) AS w

        UPDATE      s
        SET         s.CurrentItems = w.CurrentItems,
                    s.CurrentValue = COALESCE(w.CurrentValue, 0)
        FROM        dbo.Stock AS s
        INNER JOIN  (
                        SELECT  COUNT(*) AS CurrentItems,
                                SUM(Price) AS CurrentValue
                        FROM    #Work
                    ) AS w ON s.StockID = @StockID

        SELECT  @PrevID = @ArticleID,
                @Loop += 1

        IF @Loop % 1000 = 0
            RAISERROR('Now updating record %d of %d.', 10, 1, @Loop, @Total) WITH NOWAIT

        FETCH   NEXT
        FROM    curYak
        INTO    @ArticleID,
                @TranCode,
                @Items,
                @Price,
                @StockID
    END

DROP TABLE  #Work

CLOSE       curYak
DEALLOCATE  curYak

The above is already done. Here is the code to produce the final resultset.

SELECT      ArticleID,
            CurrentItems,
            CurrentValue
FROM        (
                SELECT  ArticleID,
                        CurrentItems,
                        CurrentValue,
                        ROW_NUMBER() OVER (PARTITION BY ArticleID ORDER BY TranDate DESC) AS recID
                FROM    dbo.Stock
            ) AS d
WHERE       recID = 1
ORDER BY    ArticleID

Good luck to you all!

Peter Larsson

more ▼

asked Oct 23, 2009 at 03:56 PM in Default

Peso gravatar image

Peso
1.6k 5 6 8

Was there any particular reason you disallowed CLR functionality?

On an initial look, this would be the ideal scenario in which to use a CLR type...
Oct 23, 2009 at 04:34 PM Peso
Two things had us made this decision: 1) We have tried CLR before and there was no performance gain. 2) To test and verify, poster need to disclode the source code for the suggestion. But if you have a CLR routine that runs on 15 seconds or less, I think Phil will be interested in testing it.
Oct 23, 2009 at 05:08 PM Peso
Questions: 1) Is the Stock Transaction List referred to the one that will be used for Final Evaluation? If not, can you describe the differences? 2) What is the configuration that entries will be (finally) evaluated on?
Oct 23, 2009 at 05:09 PM RBarryYoung
Yes, the downloadable file is the data against all suggestions will be measured. Phil will drop or rename the two last columns so that no suggestion can use them. I guess Phil will be using same test harness as last time with 'Subscription List' competition.
Oct 23, 2009 at 05:16 PM Peso
Those filtered indexes will only work on SQL Server 2008, right?
Oct 23, 2009 at 05:46 PM RBarryYoung
(comments are locked)
10|1200 characters needed characters left

28 answers: sort oldest

Scot Hauder v1, 2 Nov 2009

I modified one index as shown below

CREATE NONCLUSTERED INDEX IX_Input ON dbo.Stock (TranCode, ArticleID, trandate desc) INCLUDE (Items, Price) WHERE TranCode IN ('IN', 'RET')

;WITH voodoomagick AS ( SELECT ROW_NUMBER() OVER (PARTITION BY ArticleID ORDER BY TranDate) [r] ,ArticleID ,Items ,LastPrice ,(SELECT COALESCE(SUM(items),0) FROM Stock s0 WITH (NOLOCK) WHERE TranCode IN ('RET','IN') AND s0.TranDate <= s1.TranDate
AND s0.ArticleID = s1.ArticleID) [in]

,(SELECT COALESCE(SUM(Items),0) FROM Stock s0 WITH (NOLOCK) WHERE TranCode = 'OUT' AND s0.ArticleID=s1.ArticleID) [out]

FROM Stock s1

CROSS APPLY (SELECT TOP (1) Price [LastPrice] FROM Stock s3 WITH (NOLOCK) WHERE s3.TranCode = 'IN'

AND s3.ArticleID = s1.ArticleID 
AND s3.TranDate &lt;= s1.TranDate 
ORDER BY TranDate DESC) [x1]

WHERE (SELECT COALESCE(SUM(Items),0) FROM Stock s0 WITH (NOLOCK) WHERE TranCode IN ('IN','RET') AND s0.TranDate <= s1.TranDate AND s0.ArticleID = s1.ArticleID)

&gt;=(SELECT COALESCE(SUM(Items),0) 
    FROM Stock s0 WITH (NOLOCK) 
    WHERE TranCode = 'OUT' 
    AND s0.ArticleID = s1.ArticleID)
AND TranCode IN ('IN','RET') ) SELECT ArticleID ,MAX([in])-MAX([out]) [CurrentItems] ,SUM(CASE WHEN r = 1 THEN [in]-[out] ELSE Items END * LastPrice ) [CurrentValue] FROM voodoomagick WITH (NOLOCK) GROUP BY ArticleID ORDER BY ArticleID
more ▼

answered Nov 03, 2009 at 02:26 AM

ScotHauder_ gravatar image

ScotHauder_
1

Hi Scot! Great to see you enter the competition. I will test this suggestion when I have clean access to preliminary test machine.
Nov 03, 2009 at 06:00 AM Peso
My index seek still has a lot of executions compared to Dave's entry. If I have time this weekend I will see if I can improve on it
Nov 03, 2009 at 03:27 PM ScotHauder_
You have time. Deadline is set to November 16.
Nov 03, 2009 at 04:54 PM Peso
(comments are locked)
10|1200 characters needed characters left

I can't find the Dave 3.a version in answer list.

more ▼

answered Nov 04, 2009 at 08:41 PM

Mladen gravatar image

Mladen
1

3.a has been superseded , 3.d is now the latest
Nov 11, 2009 at 09:23 AM dave ballantyne
(comments are locked)
10|1200 characters needed characters left

I had some help with this one from a friend (thanks Mark!). it's just at a second on my machine. Let me know if I did anything wrong

/* Matt Clingan 11/5/2009 Phil Factor Challenge Entry Edition 1 */

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-= --cleanup --=--=-=-=-=-=-=-=-=-=-=-=-=-=-

IF not EXISTS (SELECT 1 FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[Stock]') AND name = 'ix_StockWork_1') CREATE INDEX ix_StockWork_1 on stock(ArticleID, stockid desc,trancode,price,items) IF exists (select 1 from tempdb.sys.tables where name like '%#tt1%') DROP TABLE #tt1 IF exists (select 1 from tempdb.sys.tables where name like '%#tTemp%') DROP TABLE #tTemp IF exists (select 1 from tempdb.sys.tables where name like '%#fifofinal%') DROP TABLE #FIFOFINAL IF exists (select 1 from tempdb.sys.tables where name like '%#f%') DROP TABLE #f

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-= --temp tables --=--=-=-=-=-=-=-=-=-=-=-=-=-=-

create table #tt1 (articleid smallint,stockid int) create table #tTemp(articleid smallint,stockid int, items int,price money) create table #f(articleid smallint, finalitems int) create table #fifoFinal (articleid smallint, stockid int, itemcount int, finalitems int, finalDollars money)

insert into #tt1 select articleid,MAX(stockid) as stockid from stock where trancode='IN'
group by articleid

insert into #f select articleid,SUM(case when trancode = 'OUT' then -items else items end) as finalitems from stock group by articleid

insert into #fifofinal select #f.articleid ,#tt1.stockid ,finalitems-sum(items) ,finalitems, case when sum(items)>finalitems then finalitems else sum(items) end * max(price) from #f left join #tt1 on #f.articleid = #tt1.articleid join stock s on #tt1.articleid = s.articleid and #tt1.stockid <=s.stockid and s.trancode in ('IN','RET') group by #f.articleid,#tt1.stockid,finalitems

select * from #fifofinal
more ▼

answered Nov 05, 2009 at 01:08 PM

Matt 2 gravatar image

Matt 2
1 1

Impressive! I get wrong results from this for some rows, but I don't think it's going to be difficult to fix. Impressive.
Nov 05, 2009 at 02:32 PM Gianluca Sartori
See for example ArticleID 24986.
Nov 05, 2009 at 03:43 PM Peso

And change last SELECT statement to

select articleid, finalitems, finaldollars from #fifofinal order by articleid

Also change all cleanup to last statement so that I can run this 10 consecutive times. No nees to drop and recreate index.
Nov 05, 2009 at 03:44 PM Peso
I have investigated the query, and the culprit is "end * max(s.price)". The price is not the max price, it is the latest price for an IN transaction. The max price may be the earliest price if price value is falling.
Nov 05, 2009 at 04:01 PM Peso
(comments are locked)
10|1200 characters needed characters left

I have a question regarding the rule for Returns. As follows:

You explain that because we can't track which original sales were being returned, therefore we always price returns at the price of the IN immediately preceding the return. However, this does not make sense in the case when there were no OUTS after the IN which immediately precedes the RET. In such a case, because we know that the RET could not have been of the items that just came IN. The rule should be "RET is priced at the price of the IN immediately preceding the OUT immediately preceding the RET."

This way RETs can be consistently ignored in the calculations, because only real INs count. And it is consistent with FIFO.

please advise

more ▼

answered Nov 06, 2009 at 03:03 AM

Moshe gravatar image

Moshe
2

It does make sense for the economists. They want their stock to be as accurate values as possible. If we encounter a customer return ("RET"), we want the stock to be updated with the latest known value the the article used in the return. And since we cannot track exact order from which the article was taken out at, we use the latest known price available. For example, if price is increasing, we want the stock to be valued at the higher (latest) price, not the price originally assigned, and if the price is decreasing, we want the same thing, the latest known price for that article.
Nov 06, 2009 at 04:31 AM Peso
Adn regarding your OUT statement. The OUTtake can span several IN transactions. Which price do you suggest we use, if an OUT drains 5,6 or more IN's?
Nov 06, 2009 at 04:32 AM Peso
(comments are locked)
10|1200 characters needed characters left

Even though under FIFO, current value is tracked at the more recent purchase prices, the value of goods already in inventory is not adjusted up or down just because another purchase is made at a different price. The premise of FIFO is not to track the value of current inventory at the latest, or most accurate basis. This is not market based cost tracking, or LCM. It's one of two accounting constructs designed for an environment where a) inventory value is tracked at cost, and b) subtractions from inventory cannot be tracked back to the exact point and price of entry. In such an environment, the arbitrary, yet consistent, assumption that First in First Out provides a method to consistently track value in an inherently inconsistent and unknown environment.

The problem with always using the latest IN price for returns is that in cases where no OUTS occurred after the latest IN, or where the number of items returned exceeds the number of OUT items that sold after the latest IN, the practice violates condition a) above - that value is tracked at cost.

Since RETs can only happen if there are OUTS, it makes sense to treat RETs as cancellations of the OUTS. And just like OUTS can span several INs (that's a premise of the rule), RETs can span, or cancel, several OUTs.

I'm not hung up over it, I can see how treating a RET as another purchase at the most recent price is consistent as well, but that complicates the query:)

more ▼

answered Nov 06, 2009 at 05:29 AM

Moshe gravatar image

Moshe
2

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x990
x346
x8
x7
x5

asked: Oct 23, 2009 at 03:56 PM

Seen: 29479 times

Last Updated: Nov 14, 2009 at 01:55 PM