question

Ian Roke avatar image
Ian Roke asked

What are the pros and cons of using a temporary dates table instead of a permanent one?

Quite often I have a need to generate reports that show the days of a month with counts, sums, averages, etc by day. I also get a lot of requests to show each of the days of the month regardless of the fact there might be any data for that day.

I have gone down the route of creating a table, putting in +/- 100 years of dates or whatever and then done a join but I tried some code from this article that creates a CTE dates table for the values I specify. I have shown the code here...

WITH CTE_DatesTable
AS
(
    SELECT CAST('20090601' as datetime) AS [date]
    UNION ALL
    SELECT DATEADD(dd, 1, [date])
    FROM CTE_DatesTable
    WHERE DATEADD(dd, 1, [date]) <= '20090630'
)
SELECT [date] FROM CTE_DatesTable
OPTION (MAXRECURSION 0);

This uses a recursion to generate but even when I specified a from date of '20000101' and an end date of '20100219' it ran in under a second.

Would this be a better method than holding large tables of dates on the server? Pros/cons please guys.

t-sqlsql-servercterecursion
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Jeff Moden avatar image
Jeff Moden answered

I would never use recursion for such a thing... let's start to look at why... (I've never posted code on this forum before so bear with me if I mess up the formatting)...

/***********************************************************
 Profiler output:
 Duration:   38 ms
      CPU:   16 ms
    Reads: 2924 pages
   Writes:    2 pages
***********************************************************/
WITH CTE_DatesTable
AS
(
    SELECT CAST('20100101' as datetime) AS [date]
    UNION ALL
    SELECT DATEADD(dd, 1, [date])
    FROM CTE_DatesTable
    WHERE DATEADD(dd, 1, [date]) <= '2010-12-31'
)
SELECT [date] FROM CTE_DatesTable
OPTION (MAXRECURSION 0);
go
/***********************************************************
 Profiler output:
 Duration:    0 ms
      CPU:    0 ms
    Reads:    0 pages
   Writes:    0 pages
***********************************************************/
WITH
      E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL
                SELECT 1 UNION ALL SELECT 1 UNION ALL 
                SELECT 1 UNION ALL SELECT 1 UNION ALL 
                SELECT 1 UNION ALL SELECT 1 UNION ALL
                SELECT 1 UNION ALL SELECT 1),--10
      E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
      E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
      E8(N) AS (SELECT 1 FROM E4 a, E4 b), --100,000,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N)
               FROM E8)
 SELECT DATEADD(dd,t.N-1,'2010-01-01') AS Date
   FROM cteTally t
  WHERE N <= DATEDIFF(dd,'2010-01-01','2010-12-31')+1;
go
/***********************************************************
 Profiler output:
 Duration:    0 ms
      CPU:    0 ms
    Reads:    0 pages
   Writes:    3 pages
***********************************************************/
 SELECT Date
   FROM dbo.Calendar
  WHERE Date >= '2010-01-01'
    AND Date  < '2011-01-01'
GO

The profiler output on those 3 runs pretty much tells the story of "pros" and "cons". But there's more to it than just shear speed and resource effeciency. Let's think about this for a minute... these 3 examples are only for 365 rows. What if someone decides they want to put the Tally-on-the-fly code into a user defined function and then use it to calculate against a million row table? They find some "really cool" code like a recursive CTE and put it into their function (whatever it does). The recursive CTE is at least 38 times slower than either of the other two methods and on a million rows and in a function, that's going to make for a very large difference in performance.

My feeling is that there are simply no "pros" to using a recursive CTE for this type of thing even on very small numbers of rows because some poor slob might find the code and use it for something bigger. Yes, there are pros to the cross joined CTE method (second method) but even the Calendar table method just absolutely blows the doors off the Recursive CTE method.

Build a temporary calendar table on the fly using a CTE if you want... the pros to doing that are controllable size and you don't have to fight a DBA who may be "disk stingy". Just don't ever use recursion to do it.

--Jeff Moden

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Steve Jones - Editor avatar image
Steve Jones - Editor answered

A permanent table means you can join it in there, without calculations to build the dates. I would think there is some performance hit for his.

The other pro that I've seen is that you can "mark" days as business days. It's one thing to know the weekdays v weekends, but knowing accounting periods, holidays, etc. are easy to include/exclude in a join if you have flags.

In terms of cons, you have one table, you have possible maintenance, and you use space.

2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

TimothyAWiseman avatar image TimothyAWiseman commented ·
Good points. Also, in general the permenant table will be faster, especially if properly indexed. Of course, that speed difference may be very small or mean very little on a machine that is not being stressed.
0 Likes 0 ·
Jeff Moden avatar image Jeff Moden commented ·
That's actually not been my experience with things like on-the-fly Tally tables, Timothy. I've found that they're a pretty close match. Admittedly, I've not tried the same thing with Calendar tables. We'd have to do a test before I say for sure one way or another.
0 Likes 0 ·
Kev Riley avatar image
Kev Riley answered

Permanent calendar tables :

  • could be used by pre-2005 code
  • could be indexed
  • has statistics
  • have additional info about dates that only need to be generated once
  • ensure that all references to it are using the same data

On-the fly method :

  • may alleviate collation issues across databases
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Matt Whitfield avatar image
Matt Whitfield answered

It depends! Personally I tend to use an inline udf to build date ranges - because generally the reports I have written don't limit themselves to days - so I would pass in as parameters both the increment and step type - so if I wanted a result set for a week in 15 minute intervals, no problem. If I wanted a weekly set, again no problem.

However I would look at lynn pettis' article about dynamic tally tables as that uses a much more efficient method of table building using cte's - recursing once per result row is probably not the way to go...

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.