x

A rather odd join conundrum

A rather odd join conundrum

Either i have not had enough coffee this morning, or i have ran into a bit of strange join scenario.

I have a CTE that's parsing millions of rows into a timesheet for a given weekending.

This timesheet is going to be rendered in web pages, emails, binaries etc, and rather than writing rendering code for all the various interfaces we'de like to fetch the data from SQL in semi-timesheet esque form (i.e. a branch/mon/tue/wed columns etc):

Branch Mon_Name Mon_Time Tue_Name Tue_Time 001 Bob 09-12 Bob 09-10 001 NULL NULL Jim 10-12 002 James 08-13 Terry 08-13 002 Mary 03-12 NULL NULL

The problem here is that every day may have a varibale number of bookings (Note the NULL's), i.e. on tueday, branch 1 might have 2 bookings (2 rows) but only 1 booking on the monday.

a stripped down version of my CTE is as follows:

 WITH    WeeksBookings ( DET_NUMBERA, FirstName, LastName, ReasonName, ReasonColour, BranchID, BookingDate, DayNumber, StartTime, EndTime )
       AS ( SELECT   B.DET_NUMBERA ,
                     RTRIM(C.DET_G1_NAME1A) ,
                     RTRIM(C.DET_SURNAMEA) ,
                     R.ReasonName ,
                     R.ReasonColour ,
                     B.BranchID ,
                     B.BookingDate ,
                     DATEPART(WEEKDAY, B.BookingDate) AS WeekDay ,
                     B.StartTime ,
                     B.EndTime
            FROM     tbl_Booking B
                     LEFT OUTER JOIN dbo.tbl_BookingReason R ON B.ReasonID = R.ReasonID
                     LEFT OUTER JOIN dbo.tbl_ChrisCache C ON B.DET_NUMBERA = C.DET_NUMBERA
            WHERE    CONVERT(VARCHAR(10), dbo.udf_WeekEndDate(BookingDate), 103) = '18/12/2011'
          ),
     BranchList ( BranchID )
       AS ( SELECT   BranchID
            FROM     WeeksBookings
            GROUP BY BranchID
          ),
     Bookings_Mon ( Name, ReasonName, ReasonColour, BranchID, StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 1
          ),
     Bookings_Tue ( Name, ReasonName, ReasonColour, BranchID, StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 2
          )
 SELECT  BranchList.BranchID ,
         Bookings_Mon.Name AS Mon_Name,
         Bookings_Mon.ReasonName AS Mon_Reason,
         Bookings_Mon.ReasonColour AS Mon_ReasonCol,
         Bookings_Mon.StartTime AS Mon_Start,
         Bookings_Mon.EndTime AS Mon_End,            
         Bookings_Tue.Name AS Tue_Name,
         Bookings_Tue.ReasonName AS Tue_Reason,
         Bookings_Tue.ReasonColour AS Tue_ReasonCol,
         Bookings_Tue.StartTime AS Tue_Start,
         Bookings_Tue.EndTime AS Tue_End
 FROM    BranchList
         LEFT OUTER JOIN Bookings_Mon ON BranchList.BranchID = Bookings_Mon.BranchID
         LEFT OUTER JOIN Bookings_Tue ON BranchList.BranchID = Bookings_Tue.BranchID
 ORDER BY BranchList.BranchID

In the above CTE, BranchList contains 2 rows "001" & "002", BookingsMon 60 rows, BookingsTue 57 rows. By joining just BranchList & Bookings_Mon i get a 60 row result set with the branchID column correctly duplicated to reflect the greater number of matching rows on the join. However, as soon as i join in tuesday, things go mental as its obviously joining the 57 rows in tuesday onto the now 60 rows resulting in 3420 rows...

Is there any way to do this kinda of side-by-side join in set based SQL, or am i going to have to go recursive? (temp tables, loops yada yada)

Thoughts welcome.

more ▼

asked Dec 16, 2011 at 05:05 AM in Default

avatar image

HeavenCore
89 4 7 9

can you provide table DDL and some sample data please?

Dec 16, 2011 at 05:11 AM Fatherjack ♦♦

@Fatherjack Sure can, might be a bit chunky, what is the best way to post it? (You’ve advised me not to post external links it the past)

Dec 16, 2011 at 05:28 AM HeavenCore

we would only need a few rows. enough to show how you want things summarised. as for the DDL, thats a bit trickier if the objects are big

Dec 16, 2011 at 07:39 AM Fatherjack ♦♦

@Fatherjack Ive created a DDL & sample data dump, you can grab it here: http://www.heavencore.co.uk/filehub/uploaded/SchemaAndSampleData.sql apologies for the link (its my website) but the sample data is really needed to demonstrate the problem. I'm still looking for a good solution, as Usmans solution below results in massive amounts of empty cells.

Dec 16, 2011 at 08:19 AM HeavenCore
(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

If I were you, I would use CASE statement to get the desired output. No need of joins. If you would have provided the sample data and DDLs, I could have been more certain. You should do something like this

 WITH    WeeksBookings ( DET_NUMBERA, FirstName, LastName, ReasonName, ReasonColour, BranchID, BookingDate, DayNumber, StartTime, EndTime )
       AS ( SELECT   B.DET_NUMBERA ,
                     RTRIM(C.DET_G1_NAME1A) ,
                     RTRIM(C.DET_SURNAMEA) ,
                     R.ReasonName ,
                     R.ReasonColour ,
                     B.BranchID ,
                     B.BookingDate ,
                     DATEPART(WEEKDAY, B.BookingDate) AS WeekDay ,
                     B.StartTime ,
                     B.EndTime
            FROM     tbl_Booking B
                     LEFT OUTER JOIN dbo.tbl_BookingReason R ON B.ReasonID = R.ReasonID
                     LEFT OUTER JOIN dbo.tbl_ChrisCache C ON B.DET_NUMBERA = C.DET_NUMBERA
            WHERE    CONVERT(VARCHAR(10), dbo.udf_WeekEndDate(BookingDate), 103) = '18/12/2011'
          ),
     BranchList ( BranchID )
       AS ( SELECT   BranchID
            FROM     WeeksBookings
            GROUP BY BranchID
          )
 SELECT  BranchList.BranchID ,
         CASE WHEN DayNumber = 1 THEN LEFT(FirstName,1) + '.' + LastName ELSE '' END AS Mon_Name,
         CASE WHEN DayNumber = 1 THEN ReasonName ELSE '' END AS Mon_Reason,
         CASE WHEN DayNumber = 1 THEN ReasonColour  ELSE '' END AS Mon_ReasonCol,
         CASE WHEN DayNumber = 1 THEN StartTime ELSE NULL END AS Mon_Start,
         CASE WHEN DayNumber = 1 THEN EndTime ELSE NULL END AS Mon_End,            
         CASE WHEN DayNumber = 2 THEN LEFT(FirstName,1) + '.' + LastName ELSE '' END AS Tue_Name,
         CASE WHEN DayNumber = 2 THEN ReasonName ELSE '' END AS Tue_Reason,
         CASE WHEN DayNumber = 2 THEN ReasonColour  ELSE '' END AS Tue_ReasonCol,
         CASE WHEN DayNumber = 2 THEN StartTime ELSE NULL END AS Tue_Start,
         CASE WHEN DayNumber = 2 THEN EndTime ELSE NULL END AS Tue_End
 FROM    BranchList
         LEFT OUTER JOIN WeeksBookings ON BranchList.BranchID = WeeksBookings.BranchID
 ORDER BY BranchList.BranchID

Hope it helps.

more ▼

answered Dec 16, 2011 at 05:51 AM

avatar image

Usman Butt
13.9k 6 13 21

I was working on some sample data (I’m working on a dev copy of live data, so had to do some serious anonymizing (sp?) and stripping of sensitive data, your solution beat me to it! But as for your solution, it works perfectly! Well done!

Dec 16, 2011 at 05:57 AM HeavenCore
(comments are locked)
10|1200 characters needed characters left

Here's my attempt. I used PIVOT

 WITH Branch AS (SELECT BranchID FROM tbl_Booking GROUP BY BranchID)
 , Customer AS (SELECT DET_NUMBERA CustomerID, DET_G1_NAME1A CustFNm, DET_SURNAMEA CustLNm FROM tbl_ChrisCache)
 , Booking AS 
     (
         SELECT BranchID, WeekID, BookingID, c.CustomerID, c.CustFNm, DayID, DayName, StartTime, RowPos
         ,    MAX(RowPos) OVER(PARTITION BY BranchID, WeekID, DayID)  LastRowPos
         ,    MIN(RowPos) OVER(PARTITION BY BranchID, WeekID, DayID)  FirstRowPos
         FROM
         (
             SELECT BranchID, WeekID, BookingID, CustomerID, BookingDate, DayID, DayName, StartTime
             ,    ROW_NUMBER() OVER(PARTITION BY BranchID, WeekID, DayID ORDER BY CustomerID, StartTime) RowPos
             FROM
             (
             SELECT BranchID, BookingID, DET_NUMBERA CustomerID, BookingDate, 
                 DATEPART(WEEKDAY, BookingDate) DayID, LEFT(DATENAME(WEEKDAY, BookingDate), 3) DayName
             ,    StartTime, DATEPART(WEEK, BookingDate) WeekID
             FROM tbl_Booking b
             ) base
         ) aggs
          JOIN Customer c ON aggs.CustomerID = c.CustomerID
     )
 , BookingGridFirstPos AS
     (
         SELECT DISTINCT BranchID, WeekID, 1 as RowPos FROM Booking
     )
 , BookingGridLastPos AS
     (
         SELECT BranchID, WeekID, MAX(RowPos) LastRowPos FROM Booking
         GROUP BY BranchID, WeekID
     )    
 , BookingGrid AS
     (
         SELECT DISTINCT i.BranchID, i.WeekID, i.RowPos
         FROM Booking i JOIN BookingGridLastPos il on i.BranchID = il.BranchID and i.WeekID = il.WeekID and i.LastRowPos = il.LastRowPos
     )

 SELECT BranchID, WeekID, RowPos, 
         MAX(Mon_Name) Mon_Name, MAX(Mon_Time) Mon_Time
     ,    MAX(Tue_Name) Tue_Name, MAX(Tue_Time) Tue_Time
     ,    MAX(Wed_Name) Tue_Name, MAX(Wed_Time) Wed_Time
     ,    MAX(Thu_Name) Tue_Name, MAX(Thu_Time) Thu_Time
     ,    MAX(Fri_Name) Tue_Name, MAX(Fri_Time) Fri_Time
     ,    MAX(Sat_Name) Tue_Name, MAX(Sat_Time) Sat_Time
     ,    MAX(Sun_Name) Tue_Name, MAX(Sun_Time) Sun_Time
 FROM 
     (
         SELECT bg.BranchID, bg.WeekID, bg.RowPos, b.CustomerID, b.CustFNm, b.DayName + '_Name' as BookedName, b.StartTime, b.DayName + '_Time' as BookedTime
         FROM BookingGrid bg
         LEFT JOIN Booking b ON bg.BranchID = b.BranchID AND bg.WeekID = b.WeekID AND bg.RowPos = b.RowPos
     ) AS Source
     PIVOT (MIN(CustFNm) FOR BookedName IN(Mon_Name, Tue_Name, Wed_Name, Thu_Name, Fri_Name, Sat_Name, Sun_Name)) AS PivotBookingID
     PIVOT (MIN(StartTime) FOR BookedTime IN(Mon_Time, Tue_Time, Wed_Time, Thu_Time, Fri_Time, Sat_Time, Sun_Time)) AS PivotStartTime
 GROUP BY BranchID, WeekID, RowPos
 ORDER BY BranchID, WeekID, RowPos
more ▼

answered Dec 16, 2011 at 12:10 PM

avatar image

Alendar
40 1 2

Good luck on the job search, looks like you have the drive and capacity to learn/think on your feet. I would add SharePoint 2010 dev to your list. Skip the LINQ2SQL, master the T-SQL dev and use stored procs instead. LINQ2Enities is useful. I don't have an opinion on Entity Framework, I just avoid it. I've done a lot of web development in the past and it is too tedious to be enjoyable for me but I do like Silverlight. Written several Silverlight Line-of-Business apps that are exposed in SharePoint. Peace

Dec 16, 2011 at 03:57 PM Scot Hauder

Thanks, Scot! That's good advice. It's all new to me after SQL Server 2000, so I appreciate the direction. I hadn't even thought about Silverlight.

Dec 16, 2011 at 06:44 PM Alendar

Unless you are an awesome web developer, which will include strong photoshop and flash skills--just focus on the basics. In your case I say know basic ASP.NET and some silverlight but master the database side. Anyone can write some crappy, inefficient asp.net or c# and its performance will still be acceptable. The real strength comes in knowing set-based solutions instead of cursors or loops. And in the case of the solution above, CTE's are performance killers and, while easier to read, they are more performant expressed as derived/sub queries.

Dec 16, 2011 at 10:18 PM Scot Hauder

It seems like it would create a lot of duplicate code, and I'm not sure SQL Server would come up with a different plan. Temp tables would be my next guess if performance was an issue.

Dec 17, 2011 at 12:27 PM Alendar
(comments are locked)
10|1200 characters needed characters left

Right, i think ive found the best way to do this!

Ive created a new primary key to join on by putting the following in the WeeksBookings expression:

 --Start Snippet - Unique number per branch and weekday
 ROW_NUMBER() OVER ( PARTITION BY B.BranchID, DATEPART(WEEKDAY, B.BookingDate) ORDER BY B.BranchID ) AS JoinID ,
 --End Snippet

This gives me a unique number per weeknumber & branch, i then revised the BranchList expression as follows:

 BranchList ( BranchID, JoinID )
 AS ( SELECT   BranchID, JoinID
      FROM     WeeksBookings
      GROUP BY BranchID, JoinID
 )

I then joined the weeks on the weeknumber AND this new JoinID, this works perfectly:

 SET DATEFIRST 1 ;
 DECLARE @POS_NUMNBERS VARCHAR(1024) ;
 DECLARE @BranchID VARCHAR(3)
 SET @POS_NUMNBERS = 'RM,IT' ;

 WITH    WeeksBookings ( DET_NUMBERA, FirstName, LastName, ReasonName, ReasonColour, BranchID, JoinID, BookingDate, DayNumber, StartTime, EndTime )
       AS ( SELECT   B.DET_NUMBERA ,
                     RTRIM(C.DET_G1_NAME1A) ,
                     RTRIM(C.DET_SURNAMEA) ,
                     R.ReasonName ,
                     R.ReasonColour ,
                     B.BranchID ,
                     ROW_NUMBER() OVER ( PARTITION BY B.BranchID, DATEPART(WEEKDAY, B.BookingDate) ORDER BY B.BranchID ) AS JoinID ,
                     B.BookingDate ,
                     DATEPART(WEEKDAY, B.BookingDate) AS WeekDay ,
                     B.StartTime ,
                     B.EndTime
            FROM     tbl_Booking B
                     LEFT OUTER JOIN dbo.tbl_BookingReason R ON B.ReasonID = R.ReasonID
                     LEFT OUTER JOIN dbo.tbl_ChrisCache C ON B.DET_NUMBERA = C.DET_NUMBERA
            WHERE    CONVERT(VARCHAR(10), dbo.udf_WeekEndDate(BookingDate), 103) = '18/12/2011'
                     AND C.POS_NUMBERA IN ( SELECT   Data
                                            FROM     dbo.fnc_Split(@POS_NUMNBERS, ',') )
          ),
     BranchList ( BranchID, JoinID )
       AS ( SELECT   BranchID, JoinID
            FROM     WeeksBookings
            GROUP BY BranchID, JoinID
          ),
     Bookings_Mon ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     JoinID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 1
          ),
     Bookings_Tue ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     JoinID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 2
          ),
     Bookings_Wed ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     JoinID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 3
          ),
     Bookings_Thu ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     JoinID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 4
          ),
     Bookings_Fri ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     JoinID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 5
          ),
     Bookings_Sat ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     JoinID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 6
          ),
     Bookings_Sun ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
       AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                     ReasonName ,
                     ReasonColour ,
                     BranchID ,
                     JoinID ,
                     StartTime ,
                     EndTime
            FROM     WeeksBookings
            WHERE    DayNumber = 7
          )
 SELECT  BranchList.BranchID ,
         BranchList.JoinID ,
         Bookings_Mon.Name AS Mon_Name,
         Bookings_Mon.ReasonName AS Mon_Reason,
         Bookings_Mon.ReasonColour AS Mon_ReasonCol,
         Bookings_Mon.StartTime AS Mon_Start,
         Bookings_Mon.EndTime AS Mon_End,
         
         Bookings_Tue.Name AS Tue_Name,
         Bookings_Tue.ReasonName AS Tue_Reason,
         Bookings_Tue.ReasonColour AS Tue_ReasonCol,
         Bookings_Tue.StartTime AS Tue_Start,
         Bookings_Tue.EndTime AS Tue_End,
         
         Bookings_Wed.Name AS Wed_Name,
         Bookings_Wed.ReasonName AS Wed_Reason,
         Bookings_Wed.ReasonColour AS Wed_ReasonCol,
         Bookings_Wed.StartTime AS Wed_Start,
         Bookings_Wed.EndTime AS Wed_End,
         
         Bookings_Thu.Name AS Thu_Name,
         Bookings_Thu.ReasonName AS Thu_Reason,
         Bookings_Thu.ReasonColour AS Thu_ReasonCol,
         Bookings_Thu.StartTime AS Thu_Start,
         Bookings_Thu.EndTime AS Thu_End,
         
         Bookings_Fri.Name AS Fri_Name,
         Bookings_Fri.ReasonName AS Fri_Reason,
         Bookings_Fri.ReasonColour AS Fri_ReasonCol,
         Bookings_Fri.StartTime AS Fri_Start,
         Bookings_Fri.EndTime AS Fri_End,
         
         Bookings_Sat.Name AS Sat_Name,
         Bookings_Sat.ReasonName AS Sat_Reason,
         Bookings_Sat.ReasonColour AS Sat_ReasonCol,
         Bookings_Sat.StartTime AS Sat_Start,
         Bookings_Sat.EndTime AS Sat_End,
         
         Bookings_Sun.Name AS Sun_Name,
         Bookings_Sun.ReasonName AS Sun_Reason,
         Bookings_Sun.ReasonColour AS Sun_ReasonCol,
         Bookings_Sun.StartTime AS Sun_Start,
         Bookings_Sun.EndTime AS Sun_End
 FROM    BranchList
         LEFT OUTER JOIN Bookings_Mon ON BranchList.BranchID = Bookings_Mon.BranchID AND BranchList.JoinID = Bookings_Mon.JoinID
         LEFT OUTER JOIN Bookings_Tue ON BranchList.BranchID = Bookings_Tue.BranchID AND BranchList.JoinID = Bookings_Tue.JoinID
         LEFT OUTER JOIN Bookings_Wed ON BranchList.BranchID = Bookings_Wed.BranchID AND BranchList.JoinID = Bookings_Wed.JoinID
         LEFT OUTER JOIN Bookings_Thu ON BranchList.BranchID = Bookings_Thu.BranchID AND BranchList.JoinID = Bookings_Thu.JoinID
         LEFT OUTER JOIN Bookings_Fri ON BranchList.BranchID = Bookings_Fri.BranchID AND BranchList.JoinID = Bookings_Fri.JoinID
         LEFT OUTER JOIN Bookings_Sat ON BranchList.BranchID = Bookings_Sat.BranchID AND BranchList.JoinID = Bookings_Sat.JoinID
         LEFT OUTER JOIN Bookings_Sun ON BranchList.BranchID = Bookings_Sun.BranchID AND BranchList.JoinID = Bookings_Sun.JoinID
 ORDER BY BranchList.BranchID
more ▼

answered Dec 16, 2011 at 10:01 AM

avatar image

HeavenCore
89 4 7 9

@HeavenCore I do not think this is the best solution. I cannot find any good reason to make several joins when a single join can do the job. Anyways, at the end it is your decision which matters. :)

Dec 18, 2011 at 10:38 PM Usman Butt

@Usman Butt Hello there, whilst your CASE method offers a slightly faster execution, it also results in massive amounts of rows, for instance, say there was 20 rows per day in the WeeksBookings expression, your method would get a 140 row result with all but one set of the days being NULL. Whereas my method returns a 20 row result set. I agree mine is slower, but won’t require a second phase to group up the data. The DDL I provided in my opening question + the revised CTE in this answer vs your case solution demonstrates this issue. I hope that makes sense :)

Dec 19, 2011 at 01:44 AM HeavenCore

@HeavenCore Sorry, but I cannot follow the external link for some reasons. If you can put some dummy data (which would return 20 rows as you stated for your code), my gut feeling is that can still be achieved with two joins or may be in one query.

Dec 20, 2011 at 01:29 AM Usman Butt
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1069

asked: Dec 16, 2011 at 05:05 AM

Seen: 1472 times

Last Updated: Dec 16, 2011 at 05:12 AM

Copyright 2016 Redgate Software. Privacy Policy