question

skarcrow avatar image
skarcrow asked

Select busiest 23 days

Hope you can help. I have a hire database which basically records the id_no together with a datetime of issue (only really interested in days) e.g 20th Aug 2010 - id 2000, 21st Aug 2010 id 2000, I've been asked to produde a report showing the busiest 23 day period for each ID in the table and the total items for that busiest period eg id 2000, busiest 23 day period started 1st Sept 2000, 290 Items used in that period Hope this makes sense :)
t-sql
6 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
Could you please produce some sample data, just a small subset for a couple of different id_no? Otherwise, it is somewhat difficult to figure out what do you mean by the busiest 23 day period. Does it have to be contiguous? Does the data exists for every day? etc...
4 Likes 4 ·
ThomasRushton avatar image ThomasRushton ♦♦ commented ·
Going by what the OP says, "busiest 23 day period *starting*..." implies contiguous period... ...and thinking about the problem is giving me a headache.
0 Likes 0 ·
Mark avatar image Mark commented ·
I know what you mean Thomas - me too! But the solution would be interesting - how to in effect group each possible 23 day time frame.
0 Likes 0 ·
ThomasRushton avatar image ThomasRushton ♦♦ commented ·
Agreed. I've got a couple of ideas for how to approach this, but none of them are particularly pretty. @skarcrow - it would help if you could just give us the schema for the table in question...
0 Likes 0 ·
Oleg avatar image Oleg commented ·
@skarcrow I think that I was able to get the results you want. Please let me know if my solution works. Thanks.
0 Likes 0 ·
Oleg avatar image Oleg commented ·
@Mark The only solution I could figure on Friday late night is a brutal force cross apply of the subset of table's records (limited to the period's time frame) for each available period. The number of periods is not too big though (343 periods for each year worth of data), so it should be OK.
0 Likes 0 ·

1 Answer

·
Oleg avatar image
Oleg answered
I hope that I understand the requirements correctly. There is a table with 2 columns of interest: **id\_no** int and **issue\_date** datetime. There might be many records for the same **id\_no** and the **issue\_date** values maybe for the same or different dates. The question is to write a query to find the best 23 day period for each **id\_no** based on the number of records within the period. For example, if records with a specific **id\_no** has a single entry for each day in January except 3rd, 15th and 25th and 30th when it has 2 records for each day then the best period for this id_no would be from January 3rd to January 25th as this is the only period which will include 3 of the 4 available spikes. The query in this solution depends on the heap created on the following idea: it has one record for each potential period specifying first and last date of the period. For example, if the table in question has records from January 1st until August 31st then the heap will have 221 records defining periods like this: - first period from 2010-01-01 at midnight to 2010-01-23 23:59:59.997 - second period from 2010-01-02 at midnight to 2010-01-24 23:59:59.997 - etc... - 221st period from 2010-08-09 at midnight to 2010-08-31 23:59:59.997 So each period includes 23 days - 3 milliseconds to avoid errors associated with miscalculating results should the issue\_date of any record in the table in question have 00:00:00.000 as its respective time component. The reason why 3 milliseconds are chosen are explained in one of my previous answers which includes dissection of datetime internals: [ http://ask.sqlservercentral.com/questions/16420/php-with-mssql-strtotime-with-mssql-datetime-column][1] The table is interrogated using brutal force cross apply for each 23 days period. One of the columns returned from this cross apply is row_number() over(partition by id_no order by id_no, item_count desc) What this does is that it assigns the record number of aggregated by id_no data and restarts the numbering the moment this id changes. This allows record with highest items for each id to be numbered as 1, and this is the key to this solution. Here is the final script: declare @start_date datetime; declare @end_date datetime; declare @n int; select @start_date = dateadd(day, datediff(day, 0, min(issue_date)), -1), @n = datediff(day, min(issue_date), max(issue_date)) - 21 from dbo.SampleTable; ;with records (first_date, last_date, id_no, item_count, num) as ( select period.first_date, period.last_date, t.id_no, t.item_count, row_number() over(partition by id_no order by id_no, item_count desc ) num from ( select top (@n) dateadd(day, row_number() over (order by [object_id]), @start_date) first_date, dateadd(millisecond, -3, dateadd(day, row_number() over (order by [object_id]) + 23, @start_date)) last_date from sys.all_columns ) period cross apply ( select id_no, count(1) item_count from dbo.SampleTable where issue_date between period.first_date and period.last_date group by id_no ) t ) select first_date, last_date, id_no, item_count from records where num = 1 order by id_no; Please change the table name and column names as needed. Since I don't have any sample data for this solution, please let me know whether the query performance is acceptable. Oleg P.S. This solution will work only if you have SQL Server 2005 or better with compat 90 or better. [1]: http://ask.sqlservercentral.com/questions/16420/php-with-mssql-strtotime-with-mssql-datetime-column
8 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@skarcrow You are welcome, I am glad that it works as expected. The best thing about this website is that usually you don't have to wait too long to get an answer. For me ASK is the best learning resource regarless whether I am asking a question, trying to answer one myself or reading existing answers.
2 Likes 2 ·
WilliamD avatar image WilliamD commented ·
@Oleg - +1. I love how you explain your solution, this is your hallmark on ASK - constantly good answers with brilliant background detail.
1 Like 1 ·
skarcrow avatar image skarcrow commented ·
@Oleg - thanks for the detailed reply - I've been away for the weekend :) - You have hit the nail on the head. I'm using SQL server 2005 :) I'm still a novice writing SQL , although been programming for 20 Yrs +. I wasn't expecting a reply so quickly - thanks again :)
0 Likes 0 ·
Mark avatar image Mark commented ·
Fantastic work there Oleg! (+1)
0 Likes 0 ·
Mark avatar image Mark commented ·
I would like to use this, except with calculating continuous -business- days. Hmm... Maybe that should be a separate question though.
0 Likes 0 ·
Show more comments

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.