
James Hurburgh asked

Selecting multiple ranges from a set of dates

The problem is this: I have a set of logs in the database with a datetime indicating when they were processed. Our service, being not very reliable, seems to error out a chunk of tasks over a certain period of time. Now, a week later, I'd like to quickly determine when the service was erroring, so we can find the log files for the tasks logged at those times and try to reprocess them. For example, if the datetimes were:

- 26/07/2010 1:00:03
- 26/07/2010 1:00:04
- 26/07/2010 1:00:05
- 26/07/2010 1:00:06
- 26/07/2010 1:00:07
- 26/07/2010 1:50:37
- 26/07/2010 1:50:38
- 26/07/2010 1:50:39
- 26/07/2010 1:50:40
- 26/07/2010 1:50:41
- 26/07/2010 1:50:42
- 26/07/2010 1:50:43
- 26/07/2010 1:50:44
- 27/07/2010 2:20:37
- 27/07/2010 2:20:38
- 27/07/2010 2:20:39
- 27/07/2010 2:20:40
- 27/07/2010 2:20:41
- 27/07/2010 2:20:42
- 27/07/2010 2:20:43
- 27/07/2010 2:20:44

I need a query that would return:

- 26/07/2010 1:00:03
- 26/07/2010 1:00:07
- 26/07/2010 1:50:37
- 26/07/2010 1:50:44
- 27/07/2010 2:20:37
- 27/07/2010 2:20:44

So that I can use these datetimes to create a batch script to robocopy the logged files into a local directory to be reprocessed. How hard is this?
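For the last step, once a query produces those start/end pairs, the batch script itself could be generated in T-SQL as well. A minimal sketch, assuming the pairs are stored in a #Ranges(StartTime, EndTime) temp table and that each task's log file is named log_yyyymmdd_hhmmss.txt (both are assumptions, not given in the question):

    -- Sketch only: #Ranges and the log_yyyymmdd_hhmmss.txt naming scheme are
    -- assumptions. Expands each range to one row per second, then builds a
    -- robocopy command line for each second's log file.
    WITH Seconds AS (
        SELECT StartTime AS ts, EndTime FROM #Ranges
        UNION ALL
        SELECT DATEADD(second, 1, ts), EndTime FROM Seconds WHERE ts < EndTime
    )
    SELECT 'robocopy "\\logserver\logs" "C:\reprocess" "log_'
         + REPLACE(REPLACE(REPLACE(CONVERT(VARCHAR(19), ts, 120), '-', ''), ' ', '_'), ':', '')
         + '.txt"'
    FROM Seconds
    OPTION (MAXRECURSION 0);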
Tags: t-sql, query, datetime

Phil Factor answered
I've made the test data a little more realistic: it now has an error run overlapping a minute boundary. This solution relies on the data being inserted into the heap in datetime order.

    declare @Dt table (TestDate DateTime, contiguity int)
    insert into @dt (testdate) values
    (convert(datetime,'26/07/2010 1:00:03',103)), (convert(datetime,'26/07/2010 1:00:04',103)),
    (convert(datetime,'26/07/2010 1:00:05',103)), (convert(datetime,'26/07/2010 1:00:06',103)),
    (convert(datetime,'26/07/2010 1:00:07',103)), (convert(datetime,'26/07/2010 1:00:25',103)),
    (convert(datetime,'26/07/2010 1:00:50',103)), (convert(datetime,'26/07/2010 1:00:51',103)),
    (convert(datetime,'26/07/2010 1:00:52',103)), (convert(datetime,'26/07/2010 1:00:53',103)),
    (convert(datetime,'26/07/2010 1:00:54',103)), (convert(datetime,'26/07/2010 1:00:55',103)),
    (convert(datetime,'26/07/2010 1:00:56',103)), (convert(datetime,'26/07/2010 1:00:57',103)),
    (convert(datetime,'26/07/2010 1:00:58',103)), (convert(datetime,'26/07/2010 1:00:59',103)),
    (convert(datetime,'26/07/2010 1:01:00',103)), (convert(datetime,'26/07/2010 1:01:01',103)),
    (convert(datetime,'26/07/2010 1:01:02',103)), (convert(datetime,'26/07/2010 1:01:03',103)),
    (convert(datetime,'26/07/2010 1:01:04',103)), (convert(datetime,'26/07/2010 1:50:37',103)),
    (convert(datetime,'26/07/2010 1:50:38',103)), (convert(datetime,'26/07/2010 1:50:39',103)),
    (convert(datetime,'26/07/2010 1:50:40',103)), (convert(datetime,'26/07/2010 1:50:41',103)),
    (convert(datetime,'26/07/2010 1:50:42',103)), (convert(datetime,'26/07/2010 1:50:43',103)),
    (convert(datetime,'26/07/2010 1:50:44',103)), (convert(datetime,'27/07/2010 2:20:37',103)),
    (convert(datetime,'27/07/2010 2:20:38',103)), (convert(datetime,'27/07/2010 2:20:39',103)),
    (convert(datetime,'27/07/2010 2:20:40',103)), (convert(datetime,'27/07/2010 2:20:41',103)),
    (convert(datetime,'27/07/2010 2:20:42',103)), (convert(datetime,'27/07/2010 2:20:43',103)),
    (convert(datetime,'27/07/2010 2:20:44',103))

    DECLARE @LastDate DATETIME, @Contiguity INT
    SELECT @LastDate = '1 Jan 2010', @Contiguity = 0

    -- quirky update: @Contiguity carries the previous row's group number and
    -- is bumped whenever the gap to the previous TestDate exceeds one second
    UPDATE @dt
    SET @Contiguity = contiguity =
            CASE WHEN DATEDIFF(second, @LastDate, TestDate) > 1
                 THEN @Contiguity + 1
                 ELSE @Contiguity
            END,
        @LastDate = TestDate

    SELECT MIN(TestDate), MAX(TestDate)
    FROM @dt
    GROUP BY contiguity
    HAVING COUNT(*) > 1

    /*
    ----------------------- -----------------------
    2010-07-26 01:00:03.000 2010-07-26 01:00:07.000
    2010-07-26 01:00:50.000 2010-07-26 01:01:04.000
    2010-07-26 01:50:37.000 2010-07-26 01:50:44.000
    2010-07-27 02:20:37.000 2010-07-27 02:20:44.000

    (4 row(s) affected)
    */
7 comments

Oleg commented:
+1 This is perfect. I commented on Cyborg's solution that it only works by accident. This is a rather impressive quirky update: @Contiguity's "before" value is read, and its "after" value is set, becoming the "before" value for the next row...
Phil Factor commented:
If you don't like the quirky update, you can do it with a numbers table, but it gets kinda complicated. Yes, the original test data was flawed, making the solution far too easy: and wrong! In real life, you want to see where errors/events get bunched, and you want to specify what makes a bunch. This solution allows that.
Oleg commented:
@Phil Factor I like quirky update, really do. So far, it has only failed me (for a moment) once, when I used the technique to concatenate values from a rather large set, but it was still easy to figure out from the execution plan that the failure was caused by the sort operation taking place after, not (as desired) before, the concatenation, so adding the "cheesy" top [big_number] ... order by ... in the subselect cured the problem.
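The fix Oleg describes might look like this sketch (table and column names are hypothetical, not his actual code):

    -- Quirky-update-style concatenation; the TOP (big_number) ... ORDER BY in
    -- the subselect forces the sort to happen before the concatenation step.
    DECLARE @list VARCHAR(MAX);
    SET @list = '';
    SELECT @list = @list + SomeValue + ', '
    FROM (SELECT TOP (2147483647) SomeValue
          FROM dbo.SomeLargeTable
          ORDER BY SomeValue) AS ordered;
    SELECT @list;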
Blackhawk-17 commented:
What would it take to handle the case where the "bunch" is a single entry, as in '26/07/2010 1:00:25', which was not picked up in your result set?
Oleg commented:
@Blackhawk-17 Just remove the HAVING COUNT(*) > 1; this will include it, with the same min and max :)
Blackhawk-17 commented:
@Oleg - now I'm sorry I posted the answer I just did. D'OH!
Scot Hauder commented:
Don't forget to create a clustered index on TestDate and set MAXDOP=1
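On a permanent table, Scot's advice might look like this sketch (dbo.ErrorLog is a hypothetical stand-in for the real log table): the clustered index gives the quirky update a deterministic scan order, and OPTION (MAXDOP 1) keeps it on a single thread.

    -- dbo.ErrorLog is an assumed name, not from the thread
    CREATE TABLE dbo.ErrorLog (TestDate DATETIME NOT NULL, contiguity INT NULL);
    CREATE CLUSTERED INDEX IX_ErrorLog_TestDate ON dbo.ErrorLog (TestDate);

    DECLARE @LastDate DATETIME, @Contiguity INT;
    SELECT @LastDate = '1 Jan 2010', @Contiguity = 0;

    UPDATE dbo.ErrorLog WITH (TABLOCKX)   -- keep other sessions out mid-update
    SET @Contiguity = contiguity =
            CASE WHEN DATEDIFF(second, @LastDate, TestDate) > 1
                 THEN @Contiguity + 1
                 ELSE @Contiguity
            END,
        @LastDate = TestDate
    OPTION (MAXDOP 1);                    -- single thread, as Scot suggests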
Squirrel answered
    where (eventdate - dateadd(day, datediff(day, 0, eventdate), 0) >= '01:00:03'
      and  eventdate - dateadd(day, datediff(day, 0, eventdate), 0) <= '01:00:07')
       or (eventdate - dateadd(day, datediff(day, 0, eventdate), 0) >= '01:50:37'
      and  eventdate - dateadd(day, datediff(day, 0, eventdate), 0) <= '01:50:44')
       or (eventdate - dateadd(day, datediff(day, 0, eventdate), 0) >= '02:20:37'
      and  eventdate - dateadd(day, datediff(day, 0, eventdate), 0) <= '02:20:44')
1 comment

James Hurburgh commented:
Maybe I should have worded the question better. I want to find the values specified rather than having to hard code them into the query.
Cyborg answered
    SELECT MIN(TestDate), MAX(TestDate)
    FROM Dt
    GROUP BY CONVERT(VARCHAR, TestDate, 101), DATEPART(HH, TestDate), DATEPART(MI, TestDate)

Remarks: Dt is the table name and TestDate is the column name.
5 comments

Oleg commented:
@James Hurburgh @Cyborg This only works by accident; however, if your data always looks like this then I guess it is fine. What I mean is: if you have a process which starts at the beginning of the minute, breaks (so you have a hole in the data), and then picks up again at the end of the same minute, the logic as it stands in the answer will hide the problem by selecting second 01 as the minimum and second 59 as the maximum. Additionally, if the process starts at the end of one minute and continues into the next, you will get an extra record in the results, again not matching your desired output. An accurate solution could use a left join between a tally (with records resembling your table, but without the holes) and your table. This would make the islands of missing data visible, and you could go from there. Just my 2 cents.
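A sketch of the tally approach Oleg describes, reusing Phil Factor's @Dt test data from his answer above (the tally built from sys.all_objects is an assumption, not Oleg's code): generate every second between the first and last log entry, left join the log rows onto it, and the unmatched seconds are the holes.

    DECLARE @start DATETIME, @end DATETIME;
    SELECT @start = MIN(TestDate), @end = MAX(TestDate) FROM @Dt;

    WITH Tally AS (
        -- one row per second between @start and @end
        SELECT TOP (DATEDIFF(second, @start, @end) + 1)
               DATEADD(second, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, @start) AS sec
        FROM sys.all_objects a CROSS JOIN sys.all_objects b
    )
    SELECT t.sec          -- seconds with no matching log row: the islands of missing data
    FROM Tally t
    LEFT JOIN @Dt d ON d.TestDate = t.sec
    WHERE d.TestDate IS NULL;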
TimothyAWiseman commented:
I hope you don't mind that I added code tags.
James Hurburgh commented:
Perfect, worked like a charm. Thanks!
Cyborg commented:
@TimothyAWiseman Good work! Everything was coming out on the same line when I posted the answer; that's why I added an extra new line to it.
ThomasRushton commented:
@Cyborg - formatting code problems? You need a blank line between your text and your code, and to indent the code by at least 4 spaces or a tab. It even works in comments. ;-)
Squirrel answered
    select min(datecol), max(datecol)
    from yourtable
    group by convert(smalldatetime, datecol)

(convert(smalldatetime, ...) rounds each value to the nearest minute, so this groups the rows by minute.)
2 comments

Cyborg commented:
+1 @Squirrel, Less Grouping!
Squirrel commented:
Alternatively, you can truncate off the seconds & milliseconds and group by dateadd(minute, datediff(minute, 0, datecol), 0)
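Spelled out as a full query (yourtable and datecol are placeholders, as in the answer above):

    -- dateadd/datediff floors each value to its minute, whereas
    -- convert(smalldatetime, ...) rounds to the nearest minute
    select min(datecol), max(datecol)
    from yourtable
    group by dateadd(minute, datediff(minute, 0, datecol), 0)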
Blackhawk-17 answered
Borrowing heavily from Phil Factor's answer, I have a partial solution, but one that raises my own question. I can get (sort of) the start and end times of the error bunches, and I also get the single entry. I do this without the quirky update (which I thought was a cool and elegant solution). I can "see" where the breaks are because of the sequential numbers, but I just can't get the output the OP requested. How do I take this to the next level?

    declare @Dt table (TestDate DateTime)
    insert into @dt (testdate) values
    (convert(datetime,'26/07/2010 1:00:03',103)), (convert(datetime,'26/07/2010 1:00:04',103)),
    (convert(datetime,'26/07/2010 1:00:05',103)), (convert(datetime,'26/07/2010 1:00:06',103)),
    (convert(datetime,'26/07/2010 1:00:07',103)), (convert(datetime,'26/07/2010 1:00:25',103)),
    (convert(datetime,'26/07/2010 1:00:50',103)), (convert(datetime,'26/07/2010 1:00:51',103)),
    (convert(datetime,'26/07/2010 1:00:52',103)), (convert(datetime,'26/07/2010 1:00:53',103)),
    (convert(datetime,'26/07/2010 1:00:54',103)), (convert(datetime,'26/07/2010 1:00:55',103)),
    (convert(datetime,'26/07/2010 1:00:56',103)), (convert(datetime,'26/07/2010 1:00:57',103)),
    (convert(datetime,'26/07/2010 1:00:58',103)), (convert(datetime,'26/07/2010 1:00:59',103)),
    (convert(datetime,'26/07/2010 1:01:00',103)), (convert(datetime,'26/07/2010 1:01:01',103)),
    (convert(datetime,'26/07/2010 1:01:02',103)), (convert(datetime,'26/07/2010 1:01:03',103)),
    (convert(datetime,'26/07/2010 1:01:04',103)), (convert(datetime,'26/07/2010 1:50:37',103)),
    (convert(datetime,'26/07/2010 1:50:38',103)), (convert(datetime,'26/07/2010 1:50:39',103)),
    (convert(datetime,'26/07/2010 1:50:40',103)), (convert(datetime,'26/07/2010 1:50:41',103)),
    (convert(datetime,'26/07/2010 1:50:42',103)), (convert(datetime,'26/07/2010 1:50:43',103)),
    (convert(datetime,'26/07/2010 1:50:44',103)), (convert(datetime,'27/07/2010 2:20:37',103)),
    (convert(datetime,'27/07/2010 2:20:38',103)), (convert(datetime,'27/07/2010 2:20:39',103)),
    (convert(datetime,'27/07/2010 2:20:40',103)), (convert(datetime,'27/07/2010 2:20:41',103)),
    (convert(datetime,'27/07/2010 2:20:42',103)), (convert(datetime,'27/07/2010 2:20:43',103)),
    (convert(datetime,'27/07/2010 2:20:44',103))

    SELECT MIN(TestDate) AS [Date], 1 AS [Sequence]
    FROM @Dt
    UNION ALL
    SELECT dt1.TestDate, dt1.theOrder
    FROM (SELECT TestDate, ROW_NUMBER() OVER(ORDER BY TestDate) AS theOrder FROM @Dt) AS dt1
    INNER JOIN (SELECT TestDate, ROW_NUMBER() OVER(ORDER BY TestDate) AS theOrder FROM @Dt) AS dt2
        ON dt1.theOrder + 1 = dt2.theOrder
    INNER JOIN (SELECT TestDate, ROW_NUMBER() OVER(ORDER BY TestDate) AS theOrder FROM @Dt) AS dt3
        ON dt1.theOrder - 1 = dt3.theOrder
    WHERE DATEADD(second, -1, dt1.TestDate) <> dt3.TestDate
       OR DATEADD(second, 1, dt1.TestDate) <> dt2.TestDate
    UNION ALL
    SELECT MAX(TestDate), COUNT(*)
    FROM @Dt

The following result set is produced:

    Date                    Sequence
    2010-07-26 01:00:03.000 1
    2010-07-26 01:00:07.000 5
    2010-07-26 01:00:25.000 6
    2010-07-26 01:00:50.000 7
    2010-07-26 01:01:04.000 21
    2010-07-26 01:50:37.000 22
    2010-07-26 01:50:44.000 29
    2010-07-27 02:20:37.000 30
    2010-07-27 02:20:44.000 37

While it is intuitively obvious where the start and end of the bunches are, I can't take it logically forward to the final solution to the OP's question.

Blackhawk-17 answered
OK, here goes a better attempt, without a quirky update... Again using @Phil Factor's test data:

    declare @Dt table (TestDate DateTime)
    insert into @dt (testdate) values
    (convert(datetime,'26/07/2010 1:00:03',103)), (convert(datetime,'26/07/2010 1:00:04',103)),
    (convert(datetime,'26/07/2010 1:00:05',103)), (convert(datetime,'26/07/2010 1:00:06',103)),
    (convert(datetime,'26/07/2010 1:00:07',103)), (convert(datetime,'26/07/2010 1:00:25',103)),
    (convert(datetime,'26/07/2010 1:00:50',103)), (convert(datetime,'26/07/2010 1:00:51',103)),
    (convert(datetime,'26/07/2010 1:00:52',103)), (convert(datetime,'26/07/2010 1:00:53',103)),
    (convert(datetime,'26/07/2010 1:00:54',103)), (convert(datetime,'26/07/2010 1:00:55',103)),
    (convert(datetime,'26/07/2010 1:00:56',103)), (convert(datetime,'26/07/2010 1:00:57',103)),
    (convert(datetime,'26/07/2010 1:00:58',103)), (convert(datetime,'26/07/2010 1:00:59',103)),
    (convert(datetime,'26/07/2010 1:01:00',103)), (convert(datetime,'26/07/2010 1:01:01',103)),
    (convert(datetime,'26/07/2010 1:01:02',103)), (convert(datetime,'26/07/2010 1:01:03',103)),
    (convert(datetime,'26/07/2010 1:01:04',103)), (convert(datetime,'26/07/2010 1:50:37',103)),
    (convert(datetime,'26/07/2010 1:50:38',103)), (convert(datetime,'26/07/2010 1:50:39',103)),
    (convert(datetime,'26/07/2010 1:50:40',103)), (convert(datetime,'26/07/2010 1:50:41',103)),
    (convert(datetime,'26/07/2010 1:50:42',103)), (convert(datetime,'26/07/2010 1:50:43',103)),
    (convert(datetime,'26/07/2010 1:50:44',103)), (convert(datetime,'27/07/2010 2:20:37',103)),
    (convert(datetime,'27/07/2010 2:20:38',103)), (convert(datetime,'27/07/2010 2:20:39',103)),
    (convert(datetime,'27/07/2010 2:20:40',103)), (convert(datetime,'27/07/2010 2:20:41',103)),
    (convert(datetime,'27/07/2010 2:20:42',103)), (convert(datetime,'27/07/2010 2:20:43',103)),
    (convert(datetime,'27/07/2010 2:20:44',103))

    SELECT MIN(TestDate) AS [Start], MAX(TestDate) AS [End]
    FROM (
        SELECT TestDate,
               DATEADD(second, - ROW_NUMBER() OVER(ORDER BY TestDate), TestDate) AS dategroup
        FROM @Dt
    ) AS DR
    GROUP BY dategroup;

This returns:

    Start                   End
    2010-07-26 01:00:03.000 2010-07-26 01:00:07.000
    2010-07-26 01:00:25.000 2010-07-26 01:00:25.000
    2010-07-26 01:00:50.000 2010-07-26 01:01:04.000
    2010-07-26 01:50:37.000 2010-07-26 01:50:44.000
    2010-07-27 02:20:37.000 2010-07-27 02:20:44.000

Just throwing this one in because it was driving me crazy ;)

----------

**EDIT**

This works because the date increment is a consistent 1 (every second), the same increment as the row numbering. Therefore date minus row-number seconds will produce a value that is the same for a set of contiguous dates - hence a grouping key.

    1:00:03 - 1 = 1:00:02
    1:00:04 - 2 = 1:00:02
    1:00:05 - 3 = 1:00:02
    1:00:06 - 4 = 1:00:02
    1:00:07 - 5 = 1:00:02
    1:00:25 - 6 = 1:00:19
    1:00:50 - 7 = 1:00:43
    1:00:51 - 8 = 1:00:43
    1:00:52 - 9 = 1:00:43
    ...

It is evident then that the first five entries are from the 1:00:02 "group", the 6th belongs to the 1:00:19 "group", and so on down the list.
1 comment

Oleg commented:
@Blackhawk-17 +1 This is a very elegant solution! I can see that it is universal enough to also work even if there were multiple records within one second, much like Phil's solution which handles this scenario. For example, if you have records like
26/07/2010 01:00:03.000
26/07/2010 01:00:03.000
26/07/2010 01:00:03.237
26/07/2010 01:00:03.693
You could then drop the milliseconds part, pre-group by date, which would collapse the records like the 4 above into one, and then do your magic with dateadd(second, - row_number()...
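A sketch of that combination against Phil Factor's @Dt test data (the millisecond-stripping expression is an assumption, not Oleg's code):

    SELECT MIN(TestDate) AS [Start], MAX(TestDate) AS [End]
    FROM (
        SELECT TestDate,
               DATEADD(second, -ROW_NUMBER() OVER (ORDER BY TestDate), TestDate) AS dategroup
        FROM (
            -- strip milliseconds, then collapse any duplicates within a second
            SELECT DISTINCT DATEADD(millisecond, -DATEPART(millisecond, TestDate), TestDate) AS TestDate
            FROM @Dt
        ) AS collapsed
    ) AS DR
    GROUP BY dategroup;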
