question

andredai avatar image
andredai asked

find ranges in groups which are identified by a flag

Hello, I have the following table: declare @table table (dates int, is_missing tinyint, group_id numeric(18)) insert into @table(dates, is_missing, group_id) select 20110719,0,1 union all select 20110720,0,1 union all select 20110721,0,1 union all select 20110722,1,1 union all select 20110723,0,1 union all select 20110724,0,1 union all select 20110725,0,1 union all select 20110726,1,1 union all select 20110727,0,1 union all select 20110728,1,1 union all select 20110728,1,3 union all select 20110729,1,3 union all select 20110730,0,3 union all select 20110731,1,3 union all select 20110801,0,3 union all select 20110802,0,3 union all select 20110803,0,3 select * from @table order by group_id, dates What I am trying to do is to return ranges of dates for each group which are identified by the missing day flag. To make this more clear the results of the query will have to look like this: group_id start_date end_date days_count 1 20110719 20110721 3 1 20110723 20110725 3 1 20110727 20110727 1 3 20110730 20110730 1 3 20110801 20110803 3 The is_missing flag basicaly separates the ranges per group. It actually says that a date is missing and therefore all the other dates located between is_missing flags are the groups I am trying to find their start and end dates as well as their days numbers count. Is there a simple way to do this? Thanks a lot.
sql-server-2008t-sqlsql-server
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

·
WilliamD avatar image
WilliamD answered
You can go about it using windowing functions, using the idea of islands and gaps. You identify the groups of consecutive dates inside of a `group_id`, then find the minimum, maximum and count for this "pseudo-group". Translated into T-SQL: WITH dataset AS (SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY dates) FROM @table ), datasetext AS (SELECT dates, is_missing, group_id, grp = group_id + (rn - ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY dates)) -- identify the consecutive dates inside a group_id FROM dataset WHERE is_missing = 0), grouper AS (SELECT dates, is_missing, group_id, grp, rn = ROW_NUMBER() OVER (PARTITION BY grp ORDER BY dates) -- create a new grouping id FROM datasetext) /* display the results */ SELECT group_id, MIN(dates) as start_date, MAX(dates) as end_date, MAX(rn) as days_count FROM grouper GROUP BY grp, group_id That should do what you want.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.