x

A rather odd join conundrum

A rather odd join conundrum

Either i have not had enough coffee this morning, or i have ran into a bit of strange join scenario.

I have a CTE that's parsing millions of rows into a timesheet for a given weekending.

This timesheet is going to be rendered in web pages, emails, binaries etc, and rather than writing rendering code for all the various interfaces we'de like to fetch the data from SQL in semi-timesheet esque form (i.e. a branch/mon/tue/wed columns etc):

Branch | Mon_Name | Mon_Time | Tue_Name | Tue_Time
--------------------------------------------------
001      Bob        09-12      Bob        09-10
001      NULL       NULL       Jim        10-12
002      James      08-13      Terry      08-13
002      Mary       03-12      NULL       NULL

The problem here is that every day may have a varibale number of bookings (Note the NULL's), i.e. on tueday, branch 1 might have 2 bookings (2 rows) but only 1 booking on the monday.

a stripped down version of my CTE is as follows:

WITH    WeeksBookings ( DET_NUMBERA, FirstName, LastName, ReasonName, ReasonColour, BranchID, BookingDate, DayNumber, StartTime, EndTime )
      AS ( SELECT   B.DET_NUMBERA ,
                    RTRIM(C.DET_G1_NAME1A) ,
                    RTRIM(C.DET_SURNAMEA) ,
                    R.ReasonName ,
                    R.ReasonColour ,
                    B.BranchID ,
                    B.BookingDate ,
                    DATEPART(WEEKDAY, B.BookingDate) AS WeekDay ,
                    B.StartTime ,
                    B.EndTime
           FROM     tbl_Booking B
                    LEFT OUTER JOIN dbo.tbl_BookingReason R ON B.ReasonID = R.ReasonID
                    LEFT OUTER JOIN dbo.tbl_ChrisCache C ON B.DET_NUMBERA = C.DET_NUMBERA
           WHERE    CONVERT(VARCHAR(10), dbo.udf_WeekEndDate(BookingDate), 103) = '18/12/2011'
         ),
    BranchList ( BranchID )
      AS ( SELECT   BranchID
           FROM     WeeksBookings
           GROUP BY BranchID
         ),
    Bookings_Mon ( Name, ReasonName, ReasonColour, BranchID, StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 1
         ),
    Bookings_Tue ( Name, ReasonName, ReasonColour, BranchID, StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 2
         )
SELECT  BranchList.BranchID ,
        Bookings_Mon.Name AS Mon_Name,
        Bookings_Mon.ReasonName AS Mon_Reason,
        Bookings_Mon.ReasonColour AS Mon_ReasonCol,
        Bookings_Mon.StartTime AS Mon_Start,
        Bookings_Mon.EndTime AS Mon_End,            
        Bookings_Tue.Name AS Tue_Name,
        Bookings_Tue.ReasonName AS Tue_Reason,
        Bookings_Tue.ReasonColour AS Tue_ReasonCol,
        Bookings_Tue.StartTime AS Tue_Start,
        Bookings_Tue.EndTime AS Tue_End
FROM    BranchList
        LEFT OUTER JOIN Bookings_Mon ON BranchList.BranchID = Bookings_Mon.BranchID
        LEFT OUTER JOIN Bookings_Tue ON BranchList.BranchID = Bookings_Tue.BranchID
ORDER BY BranchList.BranchID

In the above CTE, BranchList contains 2 rows "001" & "002", BookingsMon 60 rows, BookingsTue 57 rows. By joining just BranchList & Bookings_Mon i get a 60 row result set with the branchID column correctly duplicated to reflect the greater number of matching rows on the join. However, as soon as i join in tuesday, things go mental as its obviously joining the 57 rows in tuesday onto the now 60 rows resulting in 3420 rows...

Is there any way to do this kinda of side-by-side join in set based SQL, or am i going to have to go recursive? (temp tables, loops yada yada)

Thoughts welcome.
more ▼

asked Dec 16, 2011 at 05:05 AM in Default

HeavenCore gravatar image

HeavenCore
69 4 4 6

can you provide table DDL and some sample data please?
Dec 16, 2011 at 05:11 AM Fatherjack ♦♦
@Fatherjack Sure can, might be a bit chunky, what is the best way to post it? (You’ve advised me not to post external links it the past)
Dec 16, 2011 at 05:28 AM HeavenCore
we would only need a few rows. enough to show how you want things summarised. as for the DDL, thats a bit trickier if the objects are big
Dec 16, 2011 at 07:39 AM Fatherjack ♦♦
@Fatherjack Ive created a DDL & sample data dump, you can grab it here: http://www.heavencore.co.uk/filehub/uploaded/SchemaAndSampleData.sql apologies for the link (its my website) but the sample data is really needed to demonstrate the problem. I'm still looking for a good solution, as Usmans solution below results in massive amounts of empty cells.
Dec 16, 2011 at 08:19 AM HeavenCore
(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

If I were you, I would use CASE statement to get the desired output. No need of joins. If you would have provided the sample data and DDLs, I could have been more certain. You should do something like this

WITH    WeeksBookings ( DET_NUMBERA, FirstName, LastName, ReasonName, ReasonColour, BranchID, BookingDate, DayNumber, StartTime, EndTime )
      AS ( SELECT   B.DET_NUMBERA ,
                    RTRIM(C.DET_G1_NAME1A) ,
                    RTRIM(C.DET_SURNAMEA) ,
                    R.ReasonName ,
                    R.ReasonColour ,
                    B.BranchID ,
                    B.BookingDate ,
                    DATEPART(WEEKDAY, B.BookingDate) AS WeekDay ,
                    B.StartTime ,
                    B.EndTime
           FROM     tbl_Booking B
                    LEFT OUTER JOIN dbo.tbl_BookingReason R ON B.ReasonID = R.ReasonID
                    LEFT OUTER JOIN dbo.tbl_ChrisCache C ON B.DET_NUMBERA = C.DET_NUMBERA
           WHERE    CONVERT(VARCHAR(10), dbo.udf_WeekEndDate(BookingDate), 103) = '18/12/2011'
         ),
    BranchList ( BranchID )
      AS ( SELECT   BranchID
           FROM     WeeksBookings
           GROUP BY BranchID
         )
SELECT  BranchList.BranchID ,
        CASE WHEN DayNumber = 1 THEN LEFT(FirstName,1) + '.' + LastName ELSE '' END AS Mon_Name,
        CASE WHEN DayNumber = 1 THEN ReasonName ELSE '' END AS Mon_Reason,
        CASE WHEN DayNumber = 1 THEN ReasonColour  ELSE '' END AS Mon_ReasonCol,
        CASE WHEN DayNumber = 1 THEN StartTime ELSE NULL END AS Mon_Start,
        CASE WHEN DayNumber = 1 THEN EndTime ELSE NULL END AS Mon_End,            
        CASE WHEN DayNumber = 2 THEN LEFT(FirstName,1) + '.' + LastName ELSE '' END AS Tue_Name,
        CASE WHEN DayNumber = 2 THEN ReasonName ELSE '' END AS Tue_Reason,
        CASE WHEN DayNumber = 2 THEN ReasonColour  ELSE '' END AS Tue_ReasonCol,
        CASE WHEN DayNumber = 2 THEN StartTime ELSE NULL END AS Tue_Start,
        CASE WHEN DayNumber = 2 THEN EndTime ELSE NULL END AS Tue_End
FROM    BranchList
        LEFT OUTER JOIN WeeksBookings ON BranchList.BranchID = WeeksBookings.BranchID
ORDER BY BranchList.BranchID
Hope it helps.
more ▼

answered Dec 16, 2011 at 05:51 AM

Usman Butt gravatar image

Usman Butt
13.9k 6 8 14

I was working on some sample data (I’m working on a dev copy of live data, so had to do some serious anonymizing (sp?) and stripping of sensitive data, your solution beat me to it! But as for your solution, it works perfectly! Well done!
Dec 16, 2011 at 05:57 AM HeavenCore
(comments are locked)
10|1200 characters needed characters left

Here's my attempt. I used PIVOT

WITH Branch AS (SELECT BranchID FROM tbl_Booking GROUP BY BranchID)
, Customer AS (SELECT DET_NUMBERA CustomerID, DET_G1_NAME1A CustFNm, DET_SURNAMEA CustLNm FROM tbl_ChrisCache)
, Booking AS 
    (
       SELECT BranchID, WeekID, BookingID, c.CustomerID, c.CustFNm, DayID, DayName, StartTime, RowPos
       , MAX(RowPos) OVER(PARTITION BY BranchID, WeekID, DayID)  LastRowPos
       , MIN(RowPos) OVER(PARTITION BY BranchID, WeekID, DayID)  FirstRowPos
       FROM
       (
         SELECT BranchID, WeekID, BookingID, CustomerID, BookingDate, DayID, DayName, StartTime
         ,    ROW_NUMBER() OVER(PARTITION BY BranchID, WeekID, DayID ORDER BY CustomerID, StartTime) RowPos
         FROM
         (
         SELECT BranchID, BookingID, DET_NUMBERA CustomerID, BookingDate, 
          DATEPART(WEEKDAY, BookingDate) DayID, LEFT(DATENAME(WEEKDAY, BookingDate), 3) DayName
         ,    StartTime, DATEPART(WEEK, BookingDate) WeekID
         FROM tbl_Booking b
         ) base
       ) aggs
        JOIN Customer c ON aggs.CustomerID = c.CustomerID
    )
, BookingGridFirstPos AS
    (
       SELECT DISTINCT BranchID, WeekID, 1 as RowPos FROM Booking
    )
, BookingGridLastPos AS
    (
       SELECT BranchID, WeekID, MAX(RowPos) LastRowPos FROM Booking
       GROUP BY BranchID, WeekID
    )  
, BookingGrid AS
    (
       SELECT DISTINCT i.BranchID, i.WeekID, i.RowPos
       FROM Booking i JOIN BookingGridLastPos il on i.BranchID = il.BranchID and i.WeekID = il.WeekID and i.LastRowPos = il.LastRowPos
    )

SELECT BranchID, WeekID, RowPos, 
       MAX(Mon_Name) Mon_Name, MAX(Mon_Time) Mon_Time
    ,  MAX(Tue_Name) Tue_Name, MAX(Tue_Time) Tue_Time
    ,  MAX(Wed_Name) Tue_Name, MAX(Wed_Time) Wed_Time
    ,  MAX(Thu_Name) Tue_Name, MAX(Thu_Time) Thu_Time
    ,  MAX(Fri_Name) Tue_Name, MAX(Fri_Time) Fri_Time
    ,  MAX(Sat_Name) Tue_Name, MAX(Sat_Time) Sat_Time
    ,  MAX(Sun_Name) Tue_Name, MAX(Sun_Time) Sun_Time
FROM 
    (
       SELECT bg.BranchID, bg.WeekID, bg.RowPos, b.CustomerID, b.CustFNm, b.DayName + '_Name' as BookedName, b.StartTime, b.DayName + '_Time' as BookedTime
       FROM BookingGrid bg
       LEFT JOIN Booking b ON bg.BranchID = b.BranchID AND bg.WeekID = b.WeekID AND bg.RowPos = b.RowPos
    ) AS Source
    PIVOT (MIN(CustFNm) FOR BookedName IN(Mon_Name, Tue_Name, Wed_Name, Thu_Name, Fri_Name, Sat_Name, Sun_Name)) AS PivotBookingID
    PIVOT (MIN(StartTime) FOR BookedTime IN(Mon_Time, Tue_Time, Wed_Time, Thu_Time, Fri_Time, Sat_Time, Sun_Time)) AS PivotStartTime
GROUP BY BranchID, WeekID, RowPos
ORDER BY BranchID, WeekID, RowPos
more ▼

answered Dec 16, 2011 at 12:10 PM

Alendar gravatar image

Alendar
40 2

Good luck on the job search, looks like you have the drive and capacity to learn/think on your feet. I would add SharePoint 2010 dev to your list. Skip the LINQ2SQL, master the T-SQL dev and use stored procs instead. LINQ2Enities is useful. I don't have an opinion on Entity Framework, I just avoid it. I've done a lot of web development in the past and it is too tedious to be enjoyable for me but I do like Silverlight. Written several Silverlight Line-of-Business apps that are exposed in SharePoint. Peace
Dec 16, 2011 at 03:57 PM Scot Hauder
Thanks, Scot! That's good advice. It's all new to me after SQL Server 2000, so I appreciate the direction. I hadn't even thought about Silverlight.
Dec 16, 2011 at 06:44 PM Alendar
Unless you are an awesome web developer, which will include strong photoshop and flash skills--just focus on the basics. In your case I say know basic ASP.NET and some silverlight but master the database side. Anyone can write some crappy, inefficient asp.net or c# and its performance will still be acceptable. The real strength comes in knowing set-based solutions instead of cursors or loops. And in the case of the solution above, CTE's are performance killers and, while easier to read, they are more performant expressed as derived/sub queries.
Dec 16, 2011 at 10:18 PM Scot Hauder
It seems like it would create a lot of duplicate code, and I'm not sure SQL Server would come up with a different plan. Temp tables would be my next guess if performance was an issue.
Dec 17, 2011 at 12:27 PM Alendar
(comments are locked)
10|1200 characters needed characters left

Right, i think ive found the best way to do this!

Ive created a new primary key to join on by putting the following in the WeeksBookings expression:

--Start Snippet - Unique number per branch and weekday
ROW_NUMBER() OVER ( PARTITION BY B.BranchID, DATEPART(WEEKDAY, B.BookingDate) ORDER BY B.BranchID ) AS JoinID ,
--End Snippet

This gives me a unique number per weeknumber & branch, i then revised the BranchList expression as follows:

BranchList ( BranchID, JoinID )
AS ( SELECT   BranchID, JoinID
     FROM     WeeksBookings
     GROUP BY BranchID, JoinID
)

I then joined the weeks on the weeknumber AND this new JoinID, this works perfectly:

SET DATEFIRST 1 ;
DECLARE @POS_NUMNBERS VARCHAR(1024) ;
DECLARE @BranchID VARCHAR(3)
SET @POS_NUMNBERS = 'RM,IT' ;

WITH    WeeksBookings ( DET_NUMBERA, FirstName, LastName, ReasonName, ReasonColour, BranchID, JoinID, BookingDate, DayNumber, StartTime, EndTime )
      AS ( SELECT   B.DET_NUMBERA ,
                    RTRIM(C.DET_G1_NAME1A) ,
                    RTRIM(C.DET_SURNAMEA) ,
                    R.ReasonName ,
                    R.ReasonColour ,
                    B.BranchID ,
                    ROW_NUMBER() OVER ( PARTITION BY B.BranchID, DATEPART(WEEKDAY, B.BookingDate) ORDER BY B.BranchID ) AS JoinID ,
                    B.BookingDate ,
                    DATEPART(WEEKDAY, B.BookingDate) AS WeekDay ,
                    B.StartTime ,
                    B.EndTime
           FROM     tbl_Booking B
                    LEFT OUTER JOIN dbo.tbl_BookingReason R ON B.ReasonID = R.ReasonID
                    LEFT OUTER JOIN dbo.tbl_ChrisCache C ON B.DET_NUMBERA = C.DET_NUMBERA
           WHERE    CONVERT(VARCHAR(10), dbo.udf_WeekEndDate(BookingDate), 103) = '18/12/2011'
                    AND C.POS_NUMBERA IN ( SELECT   Data
                                           FROM     dbo.fnc_Split(@POS_NUMNBERS, ',') )
         ),
    BranchList ( BranchID, JoinID )
      AS ( SELECT   BranchID, JoinID
           FROM     WeeksBookings
           GROUP BY BranchID, JoinID
         ),
    Bookings_Mon ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    JoinID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 1
         ),
    Bookings_Tue ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    JoinID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 2
         ),
    Bookings_Wed ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    JoinID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 3
         ),
    Bookings_Thu ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    JoinID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 4
         ),
    Bookings_Fri ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    JoinID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 5
         ),
    Bookings_Sat ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    JoinID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 6
         ),
    Bookings_Sun ( Name, ReasonName, ReasonColour, BranchID, JoinID ,StartTime, EndTime )
      AS ( SELECT   LEFT(FirstName,1) + '.' + LastName AS Name ,
                    ReasonName ,
                    ReasonColour ,
                    BranchID ,
                    JoinID ,
                    StartTime ,
                    EndTime
           FROM     WeeksBookings
           WHERE    DayNumber = 7
         )
SELECT  BranchList.BranchID ,
       BranchList.JoinID ,
        Bookings_Mon.Name AS Mon_Name,
        Bookings_Mon.ReasonName AS Mon_Reason,
        Bookings_Mon.ReasonColour AS Mon_ReasonCol,
        Bookings_Mon.StartTime AS Mon_Start,
        Bookings_Mon.EndTime AS Mon_End,

        Bookings_Tue.Name AS Tue_Name,
        Bookings_Tue.ReasonName AS Tue_Reason,
        Bookings_Tue.ReasonColour AS Tue_ReasonCol,
        Bookings_Tue.StartTime AS Tue_Start,
        Bookings_Tue.EndTime AS Tue_End,

        Bookings_Wed.Name AS Wed_Name,
        Bookings_Wed.ReasonName AS Wed_Reason,
        Bookings_Wed.ReasonColour AS Wed_ReasonCol,
        Bookings_Wed.StartTime AS Wed_Start,
        Bookings_Wed.EndTime AS Wed_End,

        Bookings_Thu.Name AS Thu_Name,
        Bookings_Thu.ReasonName AS Thu_Reason,
        Bookings_Thu.ReasonColour AS Thu_ReasonCol,
        Bookings_Thu.StartTime AS Thu_Start,
        Bookings_Thu.EndTime AS Thu_End,

        Bookings_Fri.Name AS Fri_Name,
        Bookings_Fri.ReasonName AS Fri_Reason,
        Bookings_Fri.ReasonColour AS Fri_ReasonCol,
        Bookings_Fri.StartTime AS Fri_Start,
        Bookings_Fri.EndTime AS Fri_End,

        Bookings_Sat.Name AS Sat_Name,
        Bookings_Sat.ReasonName AS Sat_Reason,
        Bookings_Sat.ReasonColour AS Sat_ReasonCol,
        Bookings_Sat.StartTime AS Sat_Start,
        Bookings_Sat.EndTime AS Sat_End,

        Bookings_Sun.Name AS Sun_Name,
        Bookings_Sun.ReasonName AS Sun_Reason,
        Bookings_Sun.ReasonColour AS Sun_ReasonCol,
        Bookings_Sun.StartTime AS Sun_Start,
        Bookings_Sun.EndTime AS Sun_End
FROM    BranchList
        LEFT OUTER JOIN Bookings_Mon ON BranchList.BranchID = Bookings_Mon.BranchID AND BranchList.JoinID = Bookings_Mon.JoinID
        LEFT OUTER JOIN Bookings_Tue ON BranchList.BranchID = Bookings_Tue.BranchID AND BranchList.JoinID = Bookings_Tue.JoinID
        LEFT OUTER JOIN Bookings_Wed ON BranchList.BranchID = Bookings_Wed.BranchID AND BranchList.JoinID = Bookings_Wed.JoinID
        LEFT OUTER JOIN Bookings_Thu ON BranchList.BranchID = Bookings_Thu.BranchID AND BranchList.JoinID = Bookings_Thu.JoinID
        LEFT OUTER JOIN Bookings_Fri ON BranchList.BranchID = Bookings_Fri.BranchID AND BranchList.JoinID = Bookings_Fri.JoinID
        LEFT OUTER JOIN Bookings_Sat ON BranchList.BranchID = Bookings_Sat.BranchID AND BranchList.JoinID = Bookings_Sat.JoinID
        LEFT OUTER JOIN Bookings_Sun ON BranchList.BranchID = Bookings_Sun.BranchID AND BranchList.JoinID = Bookings_Sun.JoinID
ORDER BY BranchList.BranchID
more ▼

answered Dec 16, 2011 at 10:01 AM

HeavenCore gravatar image

HeavenCore
69 4 4 6

@HeavenCore I do not think this is the best solution. I cannot find any good reason to make several joins when a single join can do the job. Anyways, at the end it is your decision which matters. :)
Dec 18, 2011 at 10:38 PM Usman Butt
@Usman Butt Hello there, whilst your CASE method offers a slightly faster execution, it also results in massive amounts of rows, for instance, say there was 20 rows per day in the WeeksBookings expression, your method would get a 140 row result with all but one set of the days being NULL. Whereas my method returns a 20 row result set. I agree mine is slower, but won’t require a second phase to group up the data. The DDL I provided in my opening question + the revised CTE in this answer vs your case solution demonstrates this issue. I hope that makes sense :)
Dec 19, 2011 at 01:44 AM HeavenCore
@HeavenCore Sorry, but I cannot follow the external link for some reasons. If you can put some dummy data (which would return 20 rows as you stated for your code), my gut feeling is that can still be achieved with two joins or may be in one query.
Dec 20, 2011 at 01:29 AM Usman Butt
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x987

asked: Dec 16, 2011 at 05:05 AM

Seen: 1256 times

Last Updated: Dec 16, 2011 at 05:12 AM