x

Selecting multiple ranges from a set of dates

The query is thus; I have a set of logs in the database with a datetime indicating when they were processed. Our service, being not very reliable, seems to error a chunk of tasks over a certain amount of time. Now, a week later, I'd like to quickly determine when the service was erroring so we can try to find the log files for those tasks logged at those times and try to reprocess them.

For example if the datetimes were:

  • 26/07/2010 1:00:03
  • 26/07/2010 1:00:04
  • 26/07/2010 1:00:05
  • 26/07/2010 1:00:06
  • 26/07/2010 1:00:07
  • 26/07/2010 1:50:37
  • 26/07/2010 1:50:38
  • 26/07/2010 1:50:39
  • 26/07/2010 1:50:40
  • 26/07/2010 1:50:41
  • 26/07/2010 1:50:42
  • 26/07/2010 1:50:43
  • 26/07/2010 1:50:44
  • 27/07/2010 2:20:37
  • 27/07/2010 2:20:38
  • 27/07/2010 2:20:39
  • 27/07/2010 2:20:40
  • 27/07/2010 2:20:41
  • 27/07/2010 2:20:42
  • 27/07/2010 2:20:43
  • 27/07/2010 2:20:44

I need a query that would return:

  • 26/07/2010 1:00:03 - 26/07/2010 1:00:07
  • 26/07/2010 1:50:37 - 26/07/2010 1:50:44
  • 27/07/2010 2:20:37 - 27/07/2010 2:20:44
So that I can use these datetimes to create a batch script to robocopy the logged files into a local directory to be reprocessed. How hard is this?
more ▼

asked Jul 28, 2010 at 06:35 PM in Default

James Hurburgh gravatar image

James Hurburgh
3 1 1 2

(comments are locked)
10|1200 characters needed characters left

6 answers: sort voted first

I've made the test data a little bit more realistic as it has an error overlapping a second boundary. This solution relies on the data being inserted into the heap in datetime order

  declare @Dt table  (TestDate DateTime, contiguity int)
  insert into @dt (testdate) values

  (convert(datetime,'26/07/2010 1:00:03',103)),
  (convert(datetime,'26/07/2010 1:00:04',103)),
  (convert(datetime,'26/07/2010 1:00:05',103)),
  (convert(datetime,'26/07/2010 1:00:06',103)),
  (convert(datetime,'26/07/2010 1:00:07',103)),
  (convert(datetime,'26/07/2010 1:00:25',103)),
  (convert(datetime,'26/07/2010 1:00:50',103)),
  (convert(datetime,'26/07/2010 1:00:51',103)),
  (convert(datetime,'26/07/2010 1:00:52',103)),
  (convert(datetime,'26/07/2010 1:00:53',103)),
  (convert(datetime,'26/07/2010 1:00:54',103)),
  (convert(datetime,'26/07/2010 1:00:55',103)),
  (convert(datetime,'26/07/2010 1:00:56',103)),
  (convert(datetime,'26/07/2010 1:00:57',103)),
  (convert(datetime,'26/07/2010 1:00:58',103)),
  (convert(datetime,'26/07/2010 1:00:59',103)),
  (convert(datetime,'26/07/2010 1:01:00',103)),
  (convert(datetime,'26/07/2010 1:01:01',103)),
  (convert(datetime,'26/07/2010 1:01:02',103)),
  (convert(datetime,'26/07/2010 1:01:03',103)),
  (convert(datetime,'26/07/2010 1:01:04',103)),
  (convert(datetime,'26/07/2010 1:50:37',103)),
  (convert(datetime,'26/07/2010 1:50:38',103)),
  (convert(datetime,'26/07/2010 1:50:39',103)),
  (convert(datetime,'26/07/2010 1:50:40',103)),
  (convert(datetime,'26/07/2010 1:50:41',103)),
  (convert(datetime,'26/07/2010 1:50:42',103)),
  (convert(datetime,'26/07/2010 1:50:43',103)),
  (convert(datetime,'26/07/2010 1:50:44',103)),
  (convert(datetime,'27/07/2010 2:20:37',103)),
  (convert(datetime,'27/07/2010 2:20:38',103)),
  (convert(datetime,'27/07/2010 2:20:39',103)),
  (convert(datetime,'27/07/2010 2:20:40',103)),
  (convert(datetime,'27/07/2010 2:20:41',103)),
  (convert(datetime,'27/07/2010 2:20:42',103)),
  (convert(datetime,'27/07/2010 2:20:43',103)),
  (convert(datetime,'27/07/2010 2:20:44',103))

  DECLARE @LastDate DATETIME ,
    @Contiguity INT
  SELECT  @LastDate = '1 Jan 2010' ,
          @Contiguity = 0
  UPDATE  @dt
  SET     @Contiguity = contiguity = CASE WHEN DATEDIFF(second, @LastDate,
                                                        TestDate) <= 1
                                          THEN @Contiguity
                                          ELSE @Contiguity + 1
                                     END ,
          @LastDate = TestDate

  SELECT  MIN(TestDate) ,
          MAX(TestDate)
  FROM    @dt
  GROUP BY contiguity
  HAVING  COUNT(*) > 1

  /*
  ----------------------- -----------------------
  2010-07-26 01:00:03.000 2010-07-26 01:00:07.000
  2010-07-26 01:00:50.000 2010-07-26 01:01:04.000
  2010-07-26 01:50:37.000 2010-07-26 01:50:44.000
  2010-07-27 02:20:37.000 2010-07-27 02:20:44.000

  (4 row(s) affected)
*/
more ▼

answered Jul 29, 2010 at 10:59 AM

Phil Factor gravatar image

Phil Factor
3.8k 8 9 16

+1 This is perfect. I commented on Cyborg's solution that it only works by accident. This is a rather impressive quirky update, @Contiguity's "before" is read, and "after" is set to become read "before" on the next step...
Jul 29, 2010 at 11:23 AM Oleg

If you don't like quirky update, you can do it with a number table but it gets kinda complicated. Yes, the original test data was flawed, making the solution far too easy: and wrong!

In real life, you want to see where errors/events get bunched and you want to specify what makes a bunch. This solution allows that.
Jul 29, 2010 at 11:37 AM Phil Factor
@Phil Factor I like quirky update, really do. So far, it only failed me (for a moment) one time when I used the technique to concatenate values from a rather large set, but is was still easy to figure out from the execution plan that the failure was caused by the sort operation taking place after, not (desired) before, so adding the "cheesy" top [big_number] ... order by ... in the subselect cured the problem.
Jul 29, 2010 at 11:50 AM Oleg
What would it take to handle where the "bunch" is a single entry as in '26/07/2010 1:00:25' which was not picked up in your result set?
Jul 29, 2010 at 12:53 PM Blackhawk-17
@Blackhawk-17 Just remove the HAVING COUNT(*) > 1, this will include it with same min and max :)
Jul 29, 2010 at 01:43 PM Oleg
(comments are locked)
10|1200 characters needed characters left
SELECT MIN(TestDate),MAX(TestDate)
FROM Dt
GROUP BY CONVERT(VARCHAR,TestDate,101),DATEPART(HH,TEstDate),DATEPART(Mi,TestDate)
Remarks: Dt Is the Table Name and TestDate is the column Name
more ▼

answered Jul 28, 2010 at 08:58 PM

Cyborg gravatar image

Cyborg
10.6k 36 40 45

I hope you don't mind that I added code tags.
Jul 28, 2010 at 09:21 PM TimothyAWiseman
Perfect, worked like a charm. Thanks!
Jul 28, 2010 at 09:23 PM James Hurburgh
@TimothyAWiseman Good work! every thing is coming in same line when i am giving the answer, thats why i given an extra new line with it.
Jul 28, 2010 at 09:26 PM Cyborg
@James Hurburgh @Cyborg This only works by accident; however, if your data is always like this then I guess it is fine. What I mean is that if you have a process which starts at the beginning of the minute, breaks (so you have a hole in the data) then picks up again at the end of the minute, the logic as it is in the answer will hide the problem by selecting 01 as minimum and 59 as maximum. Additionally, if the process starts at the end of the minute and continues to the next minute, you will get an extra record in the results again not matching your desired output. The accurate solution could require a left join of the tally (with records resembling your table without the holes) and your table. This would allow you to create visible islands of the missing data and go from there. Just my 2 cents.
Jul 29, 2010 at 08:35 AM Oleg

@Cyborg - formatting code problems? You need a blank line between your text and your code, and to indent the code by at least 4 spaces or a tab.

It even works in comments. ;-)
Jul 29, 2010 at 08:47 AM ThomasRushton ♦
(comments are locked)
10|1200 characters needed characters left

OK,

Here goes a better attempt without a quirky update...

Again using @Phil Factor's test data:

  declare @Dt table  (TestDate DateTime)
  insert into @dt (testdate) values

  (convert(datetime,'26/07/2010 1:00:03',103)),
  (convert(datetime,'26/07/2010 1:00:04',103)),
  (convert(datetime,'26/07/2010 1:00:05',103)),
  (convert(datetime,'26/07/2010 1:00:06',103)),
  (convert(datetime,'26/07/2010 1:00:07',103)),
  (convert(datetime,'26/07/2010 1:00:25',103)),
  (convert(datetime,'26/07/2010 1:00:50',103)),
  (convert(datetime,'26/07/2010 1:00:51',103)),
  (convert(datetime,'26/07/2010 1:00:52',103)),
  (convert(datetime,'26/07/2010 1:00:53',103)),
  (convert(datetime,'26/07/2010 1:00:54',103)),
  (convert(datetime,'26/07/2010 1:00:55',103)),
  (convert(datetime,'26/07/2010 1:00:56',103)),
  (convert(datetime,'26/07/2010 1:00:57',103)),
  (convert(datetime,'26/07/2010 1:00:58',103)),
  (convert(datetime,'26/07/2010 1:00:59',103)),
  (convert(datetime,'26/07/2010 1:01:00',103)),
  (convert(datetime,'26/07/2010 1:01:01',103)),
  (convert(datetime,'26/07/2010 1:01:02',103)),
  (convert(datetime,'26/07/2010 1:01:03',103)),
  (convert(datetime,'26/07/2010 1:01:04',103)),
  (convert(datetime,'26/07/2010 1:50:37',103)),
  (convert(datetime,'26/07/2010 1:50:38',103)),
  (convert(datetime,'26/07/2010 1:50:39',103)),
  (convert(datetime,'26/07/2010 1:50:40',103)),
  (convert(datetime,'26/07/2010 1:50:41',103)),
  (convert(datetime,'26/07/2010 1:50:42',103)),
  (convert(datetime,'26/07/2010 1:50:43',103)),
  (convert(datetime,'26/07/2010 1:50:44',103)),
  (convert(datetime,'27/07/2010 2:20:37',103)),
  (convert(datetime,'27/07/2010 2:20:38',103)),
  (convert(datetime,'27/07/2010 2:20:39',103)),
  (convert(datetime,'27/07/2010 2:20:40',103)),
  (convert(datetime,'27/07/2010 2:20:41',103)),
  (convert(datetime,'27/07/2010 2:20:42',103)),
  (convert(datetime,'27/07/2010 2:20:43',103)),
  (convert(datetime,'27/07/2010 2:20:44',103))


SELECT MIN(TestDate) AS [Start], MAX(TestDate) as [End]
FROM (
    SELECT TestDate, 
           (DATEADD(second, - ROW_NUMBER() OVER(ORDER BY TestDate), TestDate)) AS dategroup
    FROM @Dt) AS DR
GROUP BY dategroup;  

This returns:

Start                     End
2010-07-26 01:00:03.000 2010-07-26 01:00:07.000
2010-07-26 01:00:25.000 2010-07-26 01:00:25.000
2010-07-26 01:00:50.000 2010-07-26 01:01:04.000
2010-07-26 01:50:37.000 2010-07-26 01:50:44.000
2010-07-27 02:20:37.000 2010-07-27 02:20:44.000

Just throwing this one in because it was driving me crazy ;)

---------- EDIT
This works because the date increment is a consistent 1 (every second), the same increment as the row numbering. Therefore date - row number seconds will produce a value that is the same for a set of contiguous dates - hence a grouping key.

1:00:03 - 1 = 1:00:02
1:00:04 - 2 = 1:00:02
1:00:05 - 3 = 1:00:02
1:00:06 - 4 = 1:00:02
1:00:07 - 5 = 1:00:02
1:00:25 - 6 = 1:00:19
1:00:50 - 7 = 1:00:43
1:00:51 - 8 = 1:00:43
1:00:52 - 9 = 1:00:43
...

It is evident then that the first five entries are from the 1:00:02 "group" and the 6th belongs to the 1:00:19 "group" and so on down the list.
more ▼

answered Aug 04, 2010 at 06:40 AM

Blackhawk-17 gravatar image

Blackhawk-17
11.9k 28 31 36

@Blackhawk-17 +1 This is a very elegant solution! I can see that it is universal enough to also work even if there were multiple records within one second, much like Phil's solution which handles this scenario. For example, if you have records like

26/07/2010 01:00:03.000
26/07/2010 01:00:03.000
26/07/2010 01:00:03.237
26/07/2010 01:00:03.693
You could then drop the milliseconds part, pre-group by date, which would collapse the records like the 4 above into one, and then do your magic with dateadd(second, - row_number()...
Aug 04, 2010 at 07:52 AM Oleg
(comments are locked)
10|1200 characters needed characters left
where 
(
    eventdate - dateadd(day, datediff(day, 0, date_col), 0) >= '01:00:03'
and eventdate - dateadd(day, datediff(day, 0, date_col), 0) <  '01:00:08'
)
OR
(
    eventdate - dateadd(day, datediff(day, 0, date_col), 0) >= '01:50:37'
and eventdate - dateadd(day, datediff(day, 0, date_col), 0) <  '01:50:45'
)
OR
(
    eventdate - dateadd(day, datediff(day, 0, date_col), 0) >= '02:20:37'
and eventdate - dateadd(day, datediff(day, 0, date_col), 0) <  '02:20:45'
)
more ▼

answered Jul 28, 2010 at 07:02 PM

Squirrel gravatar image

Squirrel
2.1k 1 2 4

Maybe I should have worded the question better. I want to find the values specified rather than having to hard code them into the query.
Jul 28, 2010 at 07:14 PM James Hurburgh
(comments are locked)
10|1200 characters needed characters left
select  min(datecol), max(datecol)
from    yourtable
group by convert(smalldatetime, datecol)
more ▼

answered Jul 28, 2010 at 09:25 PM

Squirrel gravatar image

Squirrel
2.1k 1 2 4

+1 @Squirrel, Less Grouping!
Jul 28, 2010 at 09:29 PM Cyborg

alternatively you can also truncate off the secs & msec and

group by dateadd(minute, datediff(minute, 0, datecol), 0)
Jul 28, 2010 at 09:37 PM Squirrel
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x984
x369
x94

asked: Jul 28, 2010 at 06:35 PM

Seen: 2047 times

Last Updated: Jul 28, 2010 at 07:20 PM