question

Steve Jones - Editor avatar image
Steve Jones - Editor asked

What's the best way to solve the FizzBuzz question?

Since my editorial (http://www.sqlservercentral.com/articles/Editorial/69646/) turned into a performance question, I thought I'd ask what the best way to solve this question is:

Write code that will produce a list of sequential numbers (count unknown). For each multiple of:

  • 3, replace 3 with Fizz
  • 5, replace 5 with Buzz
  • 3 and 5, replace with FizzBuzz

You can choose any method, and we are looking for the best code, in terms of scale and performance, with minimal resource loads.

If there are limitations to your code, please list them. Use comments to note issues with code.

The highest voted item on Mar 31 gets $25 to Amazon.

t-sqlperformance
8 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

do we need to make this a community wiki?
0 Likes 0 ·
@Kev - yes, I would say so...
0 Likes 0 ·
er...do I have to change each answer too? Not quite sure about this community wiki thang....
0 Likes 0 ·
Is the challenge to "write a list to a table" or to return a list to the client?
0 Likes 0 ·
Show more comments
Fatherjack avatar image
Fatherjack answered

Not sure about this but 100,000 rows appears in approx 1s, 1M rows takes about 12s...

change the @int value to a number you want to reach.

DECLARE @int AS INT

SET @int = 500000
SELECT TOP ( @int )
        CASE WHEN ROW_NUMBER() OVER ( ORDER BY c1.column_id ) % 3 = 0
             THEN CASE WHEN ROW_NUMBER() OVER ( ORDER BY c1.column_id ) % 5 = 0
                       THEN 'FizzBuzz'
                       ELSE 'Fizz'
                  END
             WHEN ROW_NUMBER() OVER ( ORDER BY c1.column_id ) % 5 = 0
             THEN 'Buzz'
             ELSE CAST(ROW_NUMBER() OVER ( ORDER BY c1.column_id ) AS NVARCHAR(8))
        END AS FizzBuzzColumn
FROM    master.sys.all_columns c1
        CROSS JOIN master.sys.all_columns c2

Having seen the way Kev does it. My results are:

(1000000 row(s) affected) Table 'syscolrdb'. Scan count 2, logical reads 196, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'syscolpars'. Scan count 2, logical reads 18, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times: CPU time = 922 ms, elapsed time = 12858 ms.

3 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

hey I have no idea what the format is?!?! I think I was seconds after you in posting, amazingly we have had very similar first thoughts! Although your CROSS JOIN annihilates my cartesian product.
0 Likes 0 ·
ahh it was my choice of base table....
0 Likes 0 ·
not checked for properties but the indexing of the tables would assist wouldnt it?
0 Likes 0 ·
Kev Riley avatar image
Kev Riley answered

OK I'll start the ball rolling

--
-- KevRiley v1
--
declare @maxnum int
set @maxnum = 1000000

set statistics io on
set statistics time on


select top (@maxnum)
    case 
        when row_number()over(order by sc1.object_id)%15=0 then 'FizzBuzz'
        when row_number()over(order by sc1.object_id)%5=0 then 'Buzz'
        when row_number()over(order by sc1.object_id)%3=0 then 'Fizz'
        else cast(row_number()over(order by sc1.object_id) as varchar)
    end
from sys.columns sc1 , sys.columns sc2, sys.columns sc3

set statistics io off
set statistics time off

on 1,000,000 rows, it's pretty bad....

(1000000 row(s) affected)
Table 'syscolpars'. Scan count 2043, logical reads 10209, physical reads 0, 
 read-ahead reads 0, lob logical reads 0, lob physical reads 0, 
 lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 3578 ms,  elapsed time = 20105 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

I guess should ignore the timings - this is on a VM, individual mileage will vary.

2 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Just out of curiosity, why use a VM? (I know of lots of reasons you might want to, but I'm curious as to what actually motivates people in that direction)
0 Likes 0 ·
It's my dev database server - would never use vm in production though
0 Likes 0 ·
TimothyAWiseman avatar image
TimothyAWiseman answered

FatherJack's answer is very good in T-SQL. But I think the best way to solve this is to not use SQL. SQL is fantastic for somethings, but it does not make a whole lot of sense to use it here. If I wanted pure performance, I would probably use C (though that would mean I would have to go and relearn C...)

For readability and quick coding, I would use Python.

max = 5000
for x in range(1, max):
    if x%3 ==0 and x%5 ==0:
        print 'fizzbuzz'
    elif x%3 ==0:
        print 'fizz'
    elif x%5 ==0:
        print 'buzz'
    else:
        print x

And I could check for mod 15 instead of checking for mod 3 and mod 5, but this way explicitly matches the question.

5 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Only problem, this isn't a T-SQL solution.
0 Likes 0 ·
Yes, that is my point. This is not a good problem to address in T-SQL under normal circumstances. It can be done (in fact I am highly impressed by some of the solutions here!) but it does not fit well. This is much better in a more procedural language. C is probably fastest in terms of execution, Python is probably fastest in terms of development, so I would use one of those 2 unless there was some overriding reason to use T-SQL. If I was asked this in an interview, I could provide a T-SQL answer, but I would do so while suggesting C or Python as better solutions.
0 Likes 0 ·
I think you may be missing the point of the exercise. This allows someone to see an individuals thought process and how they may solve a problem using T-SQL. If the individual uses a cursor or while loop, that may be how they will approach other activities they may be asked to accomplish if hired.
0 Likes 0 ·
I see where you are coming from, and you have a good point. But I think knowing the right tool for the job is also part of being a good developper. I do not think T-SQL is the best tool for this one. If needed, it can be done, and done in a set based way no less, (and again, some of the answers here are very impressive), and if asked for a T-SQL solution in an interview I could provide one (set based), but I would do so while pointing out better tools are available.
0 Likes 0 ·
Using the "right tool" may simply not be an option or it may actually not be the right tool. Please see my answer below.
0 Likes 0 ·
Kev Riley avatar image
Kev Riley answered

Utilising the recursive CTE method as (sometimes) accredited to Itzik, here's a solution that has zero I/O. Timings on my cruddy dev server are about the same

--
-- KevRiley v2
--
declare @maxnum int
set @maxnum = 1000000

set statistics io on
set statistics time on

;WITH Nbrs_4( n ) AS ( SELECT 1 UNION SELECT 0 ),
      Nbrs_3( n ) AS ( SELECT 1 FROM Nbrs_4 n1 CROSS JOIN Nbrs_4 n2 ),
      Nbrs_2( n ) AS ( SELECT 1 FROM Nbrs_3 n1 CROSS JOIN Nbrs_3 n2 ),
      Nbrs_1( n ) AS ( SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2 ),
      Nbrs_0( n ) AS ( SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2 ),
      Nbrs  ( n ) AS ( SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2 )
SELECT 
    case 
        when n%15=0 then 'FizzBuzz'
        when n%5=0 then 'Buzz'
        when n%3=0 then 'Fizz'
        else cast(n as varchar)
    end
      FROM ( SELECT ROW_NUMBER() OVER (ORDER BY n)
               FROM Nbrs ) D ( n )
     WHERE n <= @maxnum;

set statistics io off
set statistics time off
1 comment
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Old post, I know but, just to be sure, that isn't a recursive CTE. That's a Cascading CTE. Recursive CTEs are slothful, resource consuming beasts that shouldn't be used for such a thing (and you haven't ;-).
0 Likes 0 ·
Lynn Pettis avatar image
Lynn Pettis answered

This code should work on both SQL Server 2005 and SQL Server 2008.

declare @pEndValue bigint,
        @pStartValue bigint,
        @pIncrement bigint;
set @pStartValue = 1;
set @pEndValue = 1000000;
set @pIncrement = 1;

set statistics io on;
set statistics time on;

with BaseNum ( 
    N 
) as ( 
select 1 union all 
select 1 union all 
select 1 union all 
select 1 union all 
select 1 union all 
select 1 union all 
select 1 union all 
select 1 union all 
select 1 union all 
select 1 
), 
L1 ( 
    N 
) as ( 
select 
    bn1.N 
from 
    BaseNum bn1 
    cross join BaseNum bn2 
    cross join BaseNum bn3 
), 
L2 ( 
    N 
) as ( 
select 
    a1.N 
from 
    L1 a1 
    cross join L1 a2 
), 
Tally ( 
    N 
) as ( 
select top (isnull(abs(@pEndValue - @pStartValue) / abs(nullif(@pIncrement,0))+ 1,1)) --Limit N per parameters 
    row_number() over (order by a1.N) 
from 
    L2 a1 
    cross join L2 a2 
) 
select
    N,
    case when N % 3 = 0 and N % 5 = 0 then 'FizzBuzz'
         when N % 3 = 0 then 'Fizz'
         when N % 5 = 0 then 'Buzz'
         else CAST(N as varchar(12))
    end as Value
into
    #TempTable
from
    Tally;

set statistics time off;
set statistics io off;
go
select * from #TempTable order by N;
go

When run under SQL Server 2008 DE on my Home PC:

SQL Server Execution Times: CPU time = 2063 ms, elapsed time = 2083 ms.

(1000000 row(s) affected)

When modified to put the return value in a variable, the run time dropped:

SQL Server Execution Times: CPU time = 1687 ms, elapsed time = 1688 ms.

My computer at home has an Intel P4 3.0 GHz hyperthreaded processor with 2 GB ram.

Modified code to return output in proper order.

Updated Code, made a slight change to my code and tested it with 1,000,000 rows on an x64 Blade server (8 cores) with 32 GB ram. I thought it best to include my previous code as well. Turns out moving the row_count() out of the CTE with the final cross join speeds things up by about 50%. Timings for previous code was around 2 seconds for 1,000,000 records.

declare @pEndValue bigint,
        @pStartValue bigint,
        @pIncrement bigint,
        @OutVar varchar(12);

set @pStartValue = 1;
set @pEndValue = 1000000;
set @pIncrement = 1;

set statistics io on;
set statistics time on;

with BaseNum ( 
    N 
) as ( 
    select 1 union all 
    select 1 union all 
    select 1 union all 
    select 1 union all 
    select 1 union all 
    select 1 union all 
    select 1 union all 
    select 1 union all 
    select 1 union all 
    select 1 
), 
L1 ( 
    N 
) as ( 
    select 
        bn1.N 
    from 
        BaseNum bn1 
        cross join BaseNum bn2 
        cross join BaseNum bn3 
), 
L2 ( 
    N 
) as ( 
select 
    a1.N 
from 
    L1 a1 
    cross join L1 a2 
), 
L3 ( 
    N 
) as ( 
select top (isnull(abs(@pEndValue - @pStartValue) / abs(nullif(@pIncrement,0))+ 1,1)) --Limit N per parameters 
    a1.N
from 
    L1 a1 
    cross join L2 a2 
),
Tally as (
select
    row_number() over (order by a1.N) as N
from
    L3 a1
)
select
    N,
    case when N % 3 = 0 and N % 5 = 0 then 'FizzBuzz'
         when N % 3 = 0 then 'Fizz'
         when N % 5 = 0 then 'Buzz'
         else CAST(N as varchar(12))
    end as Value
into
    #TempTable
from
    Tally;

set statistics time off;
set statistics io off;
go
select * from #TempTable order by N;
go

Statistics writing to #TempTable:

SQL Server Execution Times: CPU time = 1248 ms, elapsed time = 1255 ms.

(1000000 row(s) affected)

6 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Lynn, I tried your latest version on my laptop (SQL2008 on WinXP dual core 2Gb RAM) and dev server (SQL2005 Win2K3 x64 16 cores 16 Gb RAM) and I don't see any difference on the performance side moving ROW_NUMBER inside or outside the final CROSS JOIN. Did you try it on your home computer as well? Does the same performace gain turn up?
0 Likes 0 ·
BTW, now that I tried your script, I see that mine does exactly the same thing. I didn't realize they were so similar, I don't want to steal your thunder!
0 Likes 0 ·
I agree with Gianluca here. I tested the same changes and see no change in time between them.
0 Likes 0 ·
I saw a definite change here at work on the blade server where I tested it. You know, it could be one of those "It Depends" type of things. I'll have to try it at home as well when I get home tonight.
0 Likes 0 ·
Some things are very difficult to find a reason why. Just consider that for some queries my monster production server runs consistently slower than my laptop. Reason? Unknown.
0 Likes 0 ·
Show more comments
CirqueDeSQLeil avatar image
CirqueDeSQLeil answered

Here is the fastest method I have found for the problem. It is like Kev Riley, but has some subtle differences to make it go just that much faster. Execution time is 25% faster on average, and plan cost is lower. These are scripts that I had been working with for a blog series I am working on currently. The Script shown by Kev, is fantastically similar to my first attempt at solving this problem with Itzik's Cascading CTE method.

set statistics io on
set statistics time on

Declare @Limit  BigInt

Set @Limit = 1000000
;
WITH Nbrs_2( n ) AS (select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 0),
          Nbrs_1( n ) AS ( SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2 ),
          Nbrs_0( n ) AS ( SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2 ),
          Nbrs  ( n ) AS ( SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2 )
SELECT case
        when n % 15 = 0 then 'Dr. Pepper'   --Multiples of 3 AND 5
        when n % 3 = 0 then 'Pepsi'
        when n % 5 = 0 then 'Coke'
        else cast(n as VarChar(8))
    end as 'FizzBuzz' 
    FROM ( SELECT ROW_NUMBER() OVER (ORDER BY n)
                   FROM Nbrs ) D (n)
         WHERE n <= abs(@Limit) ;

;

set statistics io Off
set statistics time Off

Decided to include the stats I saw.

CPU Time 844ms Elapsed Time 5108ms

Just For fun, I decided to change the script up a little bit. I was wondering, "What if I only want a specific number? And what if I need to be able to support an unspecified range?" Well, that is easily accommodated with the addition of a variable.

set statistics io on
set statistics time on

 Declare @Limit BigInt
        ,@WhatNum BigInt

Select @Limit = 1000000
        ,@WhatNum = 15
;
WITH Nbrs_2( n ) AS (select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 1 union all
        select 0),
    Nbrs_1( n ) AS ( SELECT 1 FROM Nbrs_2 n1 CROSS JOIN Nbrs_2 n2 CROSS JOIN Nbrs_2 n3),
    Nbrs_0( n ) AS ( SELECT 1 FROM Nbrs_1 n1 CROSS JOIN Nbrs_1 n2 ),
    Nbrs  ( n ) AS ( SELECT 1 FROM Nbrs_0 n1 CROSS JOIN Nbrs_0 n2 )
SELECT case
        when n % 15 = 0 then 'Dr. Pepper'   --Multiples of 3 AND 5
        when n % 3 = 0 then 'Pepsi'
        when n % 5 = 0 then 'Coke'
        else cast(n as VarChar(8))
    end as 'FizzBuzz' 
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY n)
                   FROM Nbrs ) D (n)
         WHERE n <= abs(@Limit)
            And (Isnull(@WhatNum,-1) = -1 Or n = @WhatNum)
Go

;
set statistics io Off
set statistics time Off

Execution times are nearly the same as the previous script. Execution plan cost is higher. The 'OR' would be the cause of that. This script now supports retrieving either the list, or just one specific number. Execution time for 1m as the limit drops dramatically when a value is passed to the variable. If the variable is set to null, then the entire result set is returned. One more note: If I remove the abs() from the Where, then performance improves. This holds true for both scripts. I am using the abs() in the event that this were to be a proc and somebody decided to input a negative value.

1 comment
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Testing some more, I moved the select statement into a cte and then selected from there. I see no noticeable difference in time. Additionally, 1.5 seconds can be saved from elapsed time by outputting to text rather than grid. The cpu time improves very slightly though. And if only doing a count from the final cte, then it completes in 170ms.
0 Likes 0 ·
Manie Verster avatar image
Manie Verster answered

In Steve's editorial about FizzBuzz I wrote that I wanted to use a cte i.s.o. a temp table. Well, here is my cte and it tested well against a tally table as well as a temp table that was done the same way as the tally table but I just use a global temp table (##tally). The durations was as follows: Tally - 15036ms; Temp - 15613 and CTE - 11616 and all was done with 1000 000 rows. It is fairly simple coding.

declare @start datetime, @rows int
select @start = GETDATE(), @rows = 1000000
;with cteTally as
(
SELECT top (@rows) row_number() over(order by sc1.[name]) nbr  FROM Master.dbo.SysColumns sc1, Master.dbo.SysColumns sc2
)

select case when nbr % 3 = 0 and nbr %5 = 0 then 'FizzBuzz'
    when nbr % 3 = 0 then 'Fizz'
    when nbr % 5 = 0 then 'Buzz' else convert(varchar(50),nbr) end from cteTally

select DATEDIFF(ms,@start,getdate()) duration_ms
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

user-1229 avatar image
user-1229 answered

Here's a slight variation in the CASE statement:

SELECT 
        fizzbuzz = 
            CASE WHEN N % 3 = 0 THEN 'Fizz' ELSE '' END +
            CASE WHEN N % 5 = 0 THEN 'Buzz' ELSE '' END +
            CASE WHEN (N % 5) > 0 AND (N % 3) > 0 THEN cast (N as varchar) ELSE '' END
  FROM 
        Tally
  WHERE 
        N BETWEEN 1 AND 100        
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Gianluca Sartori avatar image
Gianluca Sartori answered

Another answer based on constant scan:

;WITH numbers AS (
    SELECT 0 AS i 
    UNION ALL 
    SELECT 1
    UNION ALL
    SELECT 2
    UNION ALL
    SELECT 3
    UNION ALL
    SELECT 4
    UNION ALL
    SELECT 5
    UNION ALL
    SELECT 6
    UNION ALL
    SELECT 7
    UNION ALL
    SELECT 8
    UNION ALL
    SELECT 9
),
millionRows AS (
    SELECT n = a.i + b.i * 10 + c.i * 100 + d.i * 1000 + e.i * 10000 + f.i * 100000
    FROM numbers AS a
    CROSS JOIN numbers AS b
    CROSS JOIN numbers AS c
    CROSS JOIN numbers AS d 
    CROSS JOIN numbers AS e
    CROSS JOIN numbers AS f 
) 
SELECT result = 
    CASE 
        WHEN n % 15 = 0 THEN 'FizzBuzz'
        WHEN n % 3 = 0 THEN 'Fizz'
        WHEN n % 5 = 0 THEN 'Buzz'
        ELSE CAST(n AS VARCHAR(7))
    END
FROM millionRows
WHERE N > 0
ORDER BY N

Timings are near to Jason's solution, CPU a bit higher.

2 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

you might add where n>0 order by n. you're getting fizzbuzz in the first row with n=0 then getting lots of buzz and fizzbuzz as the initial rows increment by 10.
0 Likes 0 ·
Thanks Ken, I took your suggestion and edited the script.
0 Likes 0 ·
Gianluca Sartori avatar image
Gianluca Sartori answered

It's amazing (to me, probably you won't find it surprising), but ROW_NUMBER() costs less than multiplying. Here's my second go:

set statistics io on
set statistics time on


if object_id('tempdb..#TempTable') is not null
    drop table #TempTable

;WITH numbers AS (
    SELECT 0 AS i 
    UNION ALL 
    SELECT 1
    UNION ALL
    SELECT 2
    UNION ALL
    SELECT 3
    UNION ALL
    SELECT 4
    UNION ALL
    SELECT 5
    UNION ALL
    SELECT 6
    UNION ALL
    SELECT 7
    UNION ALL
    SELECT 8
    UNION ALL
    SELECT 9
),
thousandRows AS (
    SELECT 1 as n
    FROM numbers AS a
    CROSS JOIN numbers AS b
    CROSS JOIN numbers AS c
) ,
millionRows AS (
    SELECT ROW_NUMBER() OVER (ORDER BY a.n) AS n
    FROM thousandRows AS a
    CROSS JOIN thousandRows AS b
)
SELECT result = 
    CASE 
        WHEN n % 15 = 0 THEN 'FizzBuzz'
        WHEN n % 3 = 0 THEN 'Fizz'
        WHEN n % 5 = 0 THEN 'Buzz'
        ELSE CAST(n AS VARCHAR(7))
    END
into #TempTable
FROM millionRows
2 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Nice script. Good idea making it more readable like that. You should add in something to return the results out from the temptable.
0 Likes 0 ·
Oh, yes Jason, you're right. I didn't add the selection from #TempTable to test how the script performs. Bringing results to SSMS is 90% of the query time, so just comment the INTO #TempTable part to read the results, but leave it there if you want to compare results between different scripts.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.