question

R2d2 avatar image
R2d2 asked

tsql flattern/merge non-overlapping periods

[link text][1] Dear Reader, since over 2 days now i try figuring out a way to merge flattern the Information down so if i try print a schedule with all These periods they will be merged into as less lines as possible. See on pic below: ![alt text][2] i thought about writing a procedre with pointer and so on but wouldn't it be easier and hopefully faster to not using pointers, just tsql cte's? ` I am trying to merge into one line (creating a virtual lineNo.) as many dateranges as possible as Long as they do not overlap. currently i got stuck at follwing lines: --############################################################################################ --SORT ALGORITHM SET @Counter_i=(SELECT max(ID_Grp) FROM @tbl_sort_promo ) SET @i=1 -- Grp Loop WHILE (1=1) BEGIN SET @k=1 WHILE (1=1) BEGIN SET @Counter_k=(SELECT Count(SID_Key) FROM @tbl_sort_promo WHERE ID_Grp=@i ) INSERT INTO @tbl_sort_promo_final (SID_Key, ID_Grp, LfdNo, ds_Grp, ds_ID, dt_Start, dt_End) SELECT a.SID_Key, @i as ID_Grp, @k as LfdNo,a.ds_Grp,a.ds_ID,a.dt_Start,a.dt_End FROM @tbl_sort_promo a LEFT JOIN @tbl_sort_promo c ON a.ID_Grp=c.ID_Grp AND a.LfdNo=c.LfdNo-1 WHERE a.ID_Grp=@i AND a.dt_Start< coalesce( c.dt_Start, dateadd(dd,1,a.dt_Start)) AND a.dt_End < coalesce( c.dt_Start, dateadd(dd,1,a.dt_End)) order by a.dt_Start, a.dt_End DELETE FROM @tbl_sort_promo where SID_Key IN (SELECT SID_Key FROM @tbl_sort_promo_final) SET @k=@k+1 IF coalesce(@Counter_k,0)=0 break; IF coalesce(@Counter_i,0)=0 break; END IF @Counter_i= @i break; IF coalesce(@Counter_i,0)=0 break; SET @i=@i+1 END Somehow the algorithm does not work properly. Can somebody give me a helping Hand? in the attachment there is a csv with sample data. in whole i Need to process from some 4000 rows up to 140000 rows as fast as possible. (under 1-2 seconds) Thanks for any help. Regards. R2 [1]: /storage/temp/2702-pic-example.csv [2]: /storage/temp/2705-example.png
sqltsqldatetimeproceduresorting
example.png (264.8 KiB)
pic-example.csv (493 B)
8 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·
The data in the csv bears no resemblance to the image - can you provide the data that is in the example?
0 Likes 0 ·
R2d2 avatar image R2d2 commented ·
k, just uploaded the example data Fitting to the Image. Rgrds R2
0 Likes 0 ·
erlokeshsharma08 avatar image erlokeshsharma08 commented ·
How do you want groupline_no to be calculated, i mean what is the brief logic behind it? Thanks er
0 Likes 0 ·
R2d2 avatar image R2d2 commented ·
just check for all Dates in the current Group e.g. Grp "D" for all periods / timeranges not overlapping and assign them a lineNo, for the next timerange overlapping open a new lineNo and put again all lasting and not overlapping periods / timeranges of the current Group in this lineNo and so on. could i explain it in an understandable way? Regards R2
0 Likes 0 ·
Kev Riley avatar image Kev Riley ♦♦ commented ·
Why doesn't Opera (12/8/2015 - 31/12/2105) belong in GrpLine = 1 for Grp D? It doesn't overlap the other dates in GrpLine = 1?
0 Likes 0 ·
Show more comments
Kev Riley avatar image
Kev Riley answered
Here's a solution that uses a technique from Itzik Ben-Gan for finding overlapping time intervals (see http://sqlmag.com/blog/solutions-packing-date-and-time-intervals-puzzle for more info) It also requires something unique for each row, so in this example I added an identity to the table where the source data is, but you might already have something unique in your data. declare @YourTable table ( ID int identity(1,1), Grp char(1), Happening varchar(50), dt_start date, dt_end date ) insert into @YourTable select 'A','Concert','22 feb 2015','24 may 2015' insert into @YourTable select 'A','Festival','26 may 2015','20 july 2015' insert into @YourTable select 'B','openAir','20 may 2015','25 june 2015' insert into @YourTable select 'C','Concert','12 april 2015','10 may 2015' insert into @YourTable select 'D','Opera','12 august 2015','31 december 2015' insert into @YourTable select 'D','Concert','01 april 2015','30 june 2015' insert into @YourTable select 'D','Festival','20 april 2015','28 june 2015' insert into @YourTable select 'D','openAir','22 april 2015','24 may 2015' insert into @YourTable select 'D','Concert','26 may 2015','16 june 2015' insert into @YourTable select 'D','Festival','18 june 2015','22 june 2015' insert into @YourTable select 'D','Opera','01 july 2015','05 july 2015' insert into @YourTable select 'E','Concert','01 august 2015','15 august 2015' insert into @YourTable select 'E','Festival','07 august 2015','12 august 2015' insert into @YourTable select 'F','Concert','01 january 2015','12 feb 2015' ;WITH C1 AS ( SELECT ID,Grp, dt_start AS ts, +1 AS type, NULL AS e, ROW_NUMBER() OVER(PARTITION BY Grp ORDER BY dt_start) AS s FROM @YourTable UNION ALL SELECT ID,Grp, dt_end AS ts, -1 AS type, ROW_NUMBER() OVER(PARTITION BY Grp ORDER BY dt_end) AS e, NULL AS s FROM @YourTable ), C2 AS ( SELECT C1.*, ROW_NUMBER() OVER(PARTITION BY Grp ORDER BY ts, type DESC) AS se FROM C1 ), C3 AS ( SELECT Grp, ts, FLOOR((ROW_NUMBER() OVER(PARTITION BY Grp ORDER BY ID) - 1) / 2 + 1) AS grpnum, COALESCE(s - (se - s) - 1, (se - e) - e) + 1 as GrpLine_No FROM C2 ) ,C4 as ( SELECT Grp,min(GrpLine_No) as GrpLine_No, MIN(ts) AS starttime, max(ts) AS endtime FROM C3 GROUP BY Grp, grpnum ) select dense_rank()over(order by YT.Grp) as Grp_no, YT.Grp, C4.GrpLine_No, row_number()over(partition by YT.Grp, C4.GrpLine_No order by YT.dt_start) as HappeningNoInLine, YT.Happening, YT.dt_start, YT.dt_end from c4 join @YourTable YT on YT.Grp = C4.Grp and YT.dt_start = C4.starttime and YT.dt_end = C4.endtime order by YT.Grp, C4.GrpLine_No, YT.dt_start gives you Grp_no Grp GrpLine_No HappeningNoInLine Happening dt_start dt_end -------- ---- ----------- -------------------- ------------- ---------- ---------- 1 A 1 1 Concert 2015-02-22 2015-05-24 1 A 1 2 Festival 2015-05-26 2015-07-20 2 B 1 1 openAir 2015-05-20 2015-06-25 3 C 1 1 Concert 2015-04-12 2015-05-10 4 D 1 1 Concert 2015-04-01 2015-06-30 4 D 1 2 Opera 2015-07-01 2015-07-05 4 D 1 3 Opera 2015-08-12 2015-12-31 4 D 2 1 Festival 2015-04-20 2015-06-28 4 D 3 1 openAir 2015-04-22 2015-05-24 4 D 3 2 Concert 2015-05-26 2015-06-16 4 D 3 3 Festival 2015-06-18 2015-06-22 5 E 1 1 Concert 2015-08-01 2015-08-15 5 E 2 1 Festival 2015-08-07 2015-08-12 6 F 1 1 Concert 2015-01-01 2015-02-12 (14 row(s) affected) ----- **Edit :** to handle exactly overlapping events, I've had to rework the query .... but it actually makes it smaller.... ;WITH C1 AS ( SELECT ID,Grp, dt_start AS ts, +1 AS type, NULL AS e, ROW_NUMBER() OVER(PARTITION BY Grp ORDER BY dt_start,ID) AS s FROM @YourTable UNION ALL SELECT ID,Grp, dt_end AS ts, -1 AS type, ROW_NUMBER() OVER(PARTITION BY Grp ORDER BY dt_end, ID desc) AS e, NULL AS s FROM @YourTable ), C2 AS ( SELECT C1.*, ROW_NUMBER() OVER(PARTITION BY Grp ORDER BY ts, type DESC) AS se FROM C1 ), C3 AS ( SELECT ID, Grp, ts, COALESCE(s - (se - s) - 1, (se - e) - e) + 1 as GrpLine_No FROM C2 ) select dense_rank()over(order by YT.Grp) as Grp_no, YT.Grp, c4.GrpLine_No, row_number()over(partition by YT.Grp, C4.GrpLine_No order by YT.dt_start) as HappeningNoInLine, YT.Happening, YT.dt_start, YT.dt_end from @YourTable YT cross apply (select max(C3.GrpLine_No) GrpLine_No from c3 where C3.ID = YT.ID) c4
4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

R2d2 avatar image R2d2 commented ·
Hi Kev, thank you very much, i already visited Ben-Gan's page but it didn`t work out the first time i tried to adapt his solution, this time i checked again the uniqueKeys, and look there... it was a lack of data quality, which in the first run i didn't check properly. Thank you very much again :) Best Regards R2d2
0 Likes 0 ·
R2d2 avatar image R2d2 commented ·
Hi Kev, i just recognized one case where the routine does not work. This is the Case if you have two Happenings in the same Group with same start and end Dates. insert into @YourTable select 'D','Opera','20 april 2015','28 june 2015' In this case the dates overlap but don't get a new GrpLineNo. Regards R2d2
0 Likes 0 ·
Kev Riley avatar image Kev Riley ♦♦ commented ·
Good spot - I've amended my answer to cope with that scenario - please test :)
0 Likes 0 ·
R2d2 avatar image R2d2 commented ·
Tested, it seems now its working as it is supposed to, thank you very much again. Regards R2d2
0 Likes 0 ·
erlokeshsharma08 avatar image
erlokeshsharma08 answered
I could come up with following logic :- ;with cte (grp_no, grp, grpline_no, happening_no, happening, dt_start, dt_end) as ( SELECT dense_RANK() over ( order by grp asc) as grp_no , [GRP], 1 as grpline_no, row_number() over (PARTITION by grp order by grp asc) as happening_no ,[happening] ,[dt_start] ,[dt_end] FROM [dbo].[orig_table] ) select * into #after_table from cte select * from #after_table drop table #after_table Working on the line_logic now
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

R2d2 avatar image R2d2 commented ·
Looks good so far, the most challanging question is, how i can process this set in the most efficient way to merge the Information to as less lines per Group as possible and as fast as possible. Thank you so far.
0 Likes 0 ·
erlokeshsharma08 avatar image erlokeshsharma08 R2d2 commented ·
From sql's perspective I think you can go for a temporary table (get rid of cte and declare/populate temp table instead). That can prove to be a quick process. Struggling with the line_no logic though :-(
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.