|
The query is thus; I have a set of logs in the database with a datetime indicating when they were processed. Our service, being not very reliable, seems to error a chunk of tasks over a certain amount of time. Now, a week later, I'd like to quickly determine when the service was erroring so we can try to find the log files for those tasks logged at those times and try to reprocess them. For example if the datetimes were:
I need a query that would return:
So that I can use these datetimes to create a batch script to robocopy the logged files into a local directory to be reprocessed. How hard is this?
(comments are locked)
|
|
I've made the test data a little bit more realistic as it has an error overlapping a second boundary. This solution relies on the data being inserted into the heap in datetime order +1 This is perfect. I commented on Cyborg's solution that it only works by accident. This is a rather impressive quirky update, @Contiguity's "before" is read, and "after" is set to become read "before" on the next step...
Jul 29 '10 at 11:23 AM
Oleg
If you don't like quirky update, you can do it with a number table but it gets kinda complicated. Yes, the original test data was flawed, making the solution far too easy: and wrong! In real life, you want to see where errors/events get bunched and you want to specify what makes a bunch. This solution allows that.
Jul 29 '10 at 11:37 AM
Phil Factor
@Phil Factor I like quirky update, really do. So far, it only failed me (for a moment) one time when I used the technique to concatenate values from a rather large set, but is was still easy to figure out from the execution plan that the failure was caused by the sort operation taking place after, not (desired) before, so adding the "cheesy" top [big_number] ... order by ... in the subselect cured the problem.
Jul 29 '10 at 11:50 AM
Oleg
What would it take to handle where the "bunch" is a single entry as in '26/07/2010 1:00:25' which was not picked up in your result set?
Jul 29 '10 at 12:53 PM
Blackhawk-17
@Blackhawk-17 Just remove the HAVING COUNT(*) > 1, this will include it with same min and max :)
Jul 29 '10 at 01:43 PM
Oleg
(comments are locked)
|
Remarks: Dt Is the Table Name and TestDate is the column Name I hope you don't mind that I added code tags.
Jul 28 '10 at 09:21 PM
TimothyAWiseman
Perfect, worked like a charm. Thanks!
Jul 28 '10 at 09:23 PM
James Hurburgh
@TimothyAWiseman Good work! every thing is coming in same line when i am giving the answer, thats why i given an extra new line with it.
Jul 28 '10 at 09:26 PM
Cyborg
@James Hurburgh @Cyborg This only works by accident; however, if your data is always like this then I guess it is fine. What I mean is that if you have a process which starts at the beginning of the minute, breaks (so you have a hole in the data) then picks up again at the end of the minute, the logic as it is in the answer will hide the problem by selecting 01 as minimum and 59 as maximum. Additionally, if the process starts at the end of the minute and continues to the next minute, you will get an extra record in the results again not matching your desired output. The accurate solution could require a left join of the tally (with records resembling your table without the holes) and your table. This would allow you to create visible islands of the missing data and go from there. Just my 2 cents.
Jul 29 '10 at 08:35 AM
Oleg
@Cyborg - formatting code problems? You need a blank line between your text and your code, and to indent the code by at least 4 spaces or a tab.
Jul 29 '10 at 08:47 AM
ThomasRushton ♦
(comments are locked)
|
|
OK, Here goes a better attempt without a quirky update... Again using @Phil Factor's test data: This returns: Just throwing this one in because it was driving me crazy ;) ---------- EDIT 1:00:03 - 1 = 1:00:02 It is evident then that the first five entries are from the 1:00:02 "group" and the 6th belongs to the 1:00:19 "group" and so on down the list. @Blackhawk-17 +1 This is a very elegant solution! I can see that it is universal enough to also work even if there were multiple records within one second, much like Phil's solution which handles this scenario. For example, if you have records like 26/07/2010 01:00:03.000 26/07/2010 01:00:03.000 26/07/2010 01:00:03.237 26/07/2010 01:00:03.693 You could then drop the milliseconds part, pre-group by date, which would collapse the records like the 4 above into one, and then do your magic with dateadd(second, - row_number()...
Aug 04 '10 at 07:52 AM
Oleg
(comments are locked)
|
|
Borrowing heavily from Phil Factor's answer I have a partial solution but one that raises my own question. I can get (sort of) the start and end times of the error bunches and I also get the single entry. I do this without the quirky update (which I thought was a cool and elegant solution). I can "see" where the breaks are because of the sequential numbers but I just can't get the output the OP requested. How do I take this to the next level? The following result set is produced: While it is intuitively obvious where the start and end of the bunches are I can't take it logically forward for the final solution to the OP's question,
(comments are locked)
|
Maybe I should have worded the question better. I want to find the values specified rather than having to hard code them into the query.
Jul 28 '10 at 07:14 PM
James Hurburgh
(comments are locked)
|
1 2 next page »

