question

terrancefisher avatar image
terrancefisher asked

Complex join

How do I perform a join on the following table? Table Name: Accounts Number of Rows: 20,000,000 (20 Million and growing) SELECT s.StoreNum, a.sum(YTD_Sales) as 'PLSales' FROM Accounts a LEFT JOIN Store s on s.StoreNum = a.StoreNum **WHERE a.AcctKey >= 80000 and a.AcctKey <= 85000** AND THIS ONE SELECT s.StoreNum, sum(a.YTD_Sales) as 'PLShrink' FROM Accounts a LEFT JOIN Store s on s.StoreNum = a.StoreNum **WHERE a.AcctKey >= 95000 and a.AcctKey <= 99999** The only difference in the two queries is the WHERE clause. I need to return: StoreNum PLSales PLShrink %of Sales 123 -895.25 1245.36 PLSales/PLShrink 567 -75.25 855.36 PLSales/PLShrink What I've tried: Select s.StoreNum, sum(a.YTD_Sales) as 'PLSales', sum(b.YTD_Sales) as 'PLShrink' from Store s LEFT JOIN Accounts a on a.storenum=s.storenum LEFT JOIN Accounts b on b.storenum=s.storenum I also tried a temp table: Where I put all the store numbers that match accounts with the same storenum. create table #storeNumsThatMatch ( StoreNum VARCHAR(1000), ) INSERT INTO #storeNumsThatMatch(StoreNum) SELECT a.storeNum FROM Accounts a WHERE (**a.AcctKey >= 80000 and a.AcctKey <= 85000**) OR (**a.AcctKey >= 95000 and a.AcctKey <= 99999**) Query takes forever to run. Also how do I divide the two columns (sales/shrink)?
sql-server-2008
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

JohnM avatar image JohnM commented ·
So my first thought, what indexes are on the table? Do you have an index for the AcctKey? My second thought is can you post the execution plan so we can see what the optimizer is doing?
0 Likes 0 ·
busynovadad avatar image
busynovadad answered
Here's one way to do it: IF OBJECT_ID('myaccounts') IS NOT NULL DROP TABLE myaccounts; IF OBJECT_ID('mystores') IS NOT NULL DROP TABLE mystores; SELECT TOP 2000000 IDENTITY( INT,1,1 ) AS myRowID , (SELECT ABS(CAST(NEWID() AS binary(6)) %1000) + 1 ) as StoreNum , (SELECT ABS(CAST(NEWID() AS binary(6)) %100000) + 1 ) as YTD_Sales , (SELECT ABS(CAST(NEWID() AS binary(6)) %100000) + 1 ) as AcctKey into myaccounts FROM Master.dbo.SysColumns sc1 , Master.dbo.SysColumns sc2 CREATE INDEX ix_myaccounts_storenum ON myaccounts(storenum); select distinct StoreNum into mystores from myaccounts; CREATE INDEX ix_mystores_storenum ON mystores(storenum); -- First, data check to make sure this script will actually SOLVE the problem --select top 1000 * from myaccounts; --select count(1) as NumOfStores from mystores; --select count(1) as NumOfSalesAccts from myaccounts where AcctKey between 80000 and 85000; --select count(1) as NumOfShrinkAccts from myaccounts where AcctKey between 95000 and 99999; -- Now, weave the two together select j.storenum , j.plsales , k.plshrink , cast(j.plsales as float)/ cast(k.plshrink as float) as 'Math' from ( SELECT s.StoreNum, sum(a.YTD_Sales) as 'PLSales' FROM mystores as s left join myaccounts as a on s.StoreNum = a.StoreNum WHERE a.AcctKey >= 80000 and a.AcctKey <= 85000 group by s.StoreNum ) j inner join ( SELECT s.StoreNum, sum(b.YTD_Sales) as 'PLShrink' FROM mystores as s left JOIN myaccounts as b on s.storenum = b.storenum WHERE b.AcctKey >= 95000 and b.AcctKey <= 99999 group by s.StoreNum ) k on j.storenum = k.storenum Which will return results like > storenum plsales plshrink Math > 13 4248444 4966953 0.855342098062937 > 14 5486413 5222513 1.05053122893136 > 15 4893860 6183803 0.791399726026201 > 16 5362690 5134079 1.04452814224323 > 30 5797214 4729980 1.22563182085337 > 31 5377488 4542752 1.18375117109629 > 32 4731900 5933874 0.797438570485319 > 33 4106106 4938954 0.831371581917953
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

terrancefisher avatar image
terrancefisher answered
Implementing it now... Will post results. With your data, it works like a charm, Thanks. My data, on the other hand, is a different story. Apparently, there may or may not be matching stores for Sales or Shrink. Based on the AcctKey, there may or may not be values for a store either in Sales or Shrink. Store ID Sales Shrink 123 -45.45 null 456 -15.05 4.15 785 null 482.25 Would an outer join work to bring back all rows? Do I need a 'case' to check for blank/null values to avoid 'divide by zero'?
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

busynovadad avatar image
busynovadad answered
Ok, the only issue I see with the results below is knowing how you want to handle the situation where a store potentially has shrinkage, but no sales. This, by definition, isn't expected to be possible, as, if the store is stocking the shelves, they are expected make at least one sale, which would produce a non-zero value in PLSales. (See row 1860 in the included result set, below) Without further ado, to get results like this set: > storenum PLSales PLShrink Math > 1857 0 0 0 > 1858 18277176 32258112 0.566591621977132 > 1859 17222680 22926976 0.751197192337969 > 1860 0 16696240 0 1861 27690824 0 0 > 1862 18616950 17516664 1.06281367274043 > 1863 237744 7766096 0.0306130647882797 > 1864 21891400 7721400 2.83515942704691 > 1865 13095260 12857204 1.01851537861576 > 1866 33942744 11993850 2.83001238134544 > 1867 0 19102116 0 1868 428869 0 0 > 1869 0 12052242 0 Here's the SQL code, freshly revised: --Make and fill some sample tables IF OBJECT_ID('myaccounts') IS NOT NULL DROP TABLE myaccounts; IF OBJECT_ID('mystores') IS NOT NULL DROP TABLE mystores; SELECT TOP 2000000 IDENTITY( INT,1,1 ) AS myRowID , (SELECT ABS(CAST(NEWID() AS binary(6)) %100000) + 1 ) as StoreNum , (SELECT ABS(CAST(NEWID() AS binary(6)) %100000) + 1 ) as YTD_Sales , (SELECT ABS(CAST(NEWID() AS binary(6)) %100000) + 1 ) as AcctKey into myaccounts FROM MASTER.DBO.SYSCOLUMNS SC1 , MASTER.DBO.SYSCOLUMNS SC2 CREATE INDEX ix_myaccounts_storenum ON myaccounts(storenum); -- I **GUARANTEE** this will produce a non-matching set -- SELECT TOP 20,000,000 using a range of 100,000 SELECT TOP 20000000 IDENTITY( INT,1,1 ) AS myRowID , (SELECT ABS(CAST(NEWID() AS binary(6)) %100000) + 1 ) as StoreNum into mystores FROM MASTER.DBO.SYSCOLUMNS SC1 , MASTER.DBO.SYSCOLUMNS SC2; CREATE INDEX ix_mystores_storenum ON mystores(storenum); -- First, data check to make sure this script will actually SOLVE the problem --select top 1000 * from myaccounts; --select top 1000 * from mystores; --select count(1) as Accts_Counted from myaccounts; --select count(1) as Stores_Counted from mystores; -- What's the intersection of the two? -- That is, do we have stores that don't exist? --select count(1) as Found_Stores from ( --select storenum from myaccounts --intersect --select storenum from mystores --) j --select count(1) as Missing_Stores from ( --select storenum from mystores --except --select storenum from myaccounts --) j --select count(1) as NumOfStores from mystores; --select count(1) as NumOfSalesAccts from myaccounts where AcctKey between 80000 and 85000; --select count(1) as NumOfShrinkAccts from myaccounts where AcctKey between 95000 and 99999; -- Having confirmed we can, indeed solve the problem, let's reproduce the OP's error --SELECT s.StoreNum, sum(YTD_Sales) as 'PLSales' --FROM myaccounts a --LEFT JOIN mystores as s on s.StoreNum = a.StoreNum --WHERE a.AcctKey >= 80000 and a.AcctKey <= 85000 --group by s.StoreNum; --SELECT s.StoreNum, sum(a.YTD_Sales) as 'PLShrink' --FROM myaccounts a --LEFT JOIN mystores as s on s.StoreNum = a.StoreNum --WHERE a.AcctKey >= 95000 and a.AcctKey <= 99999; -- Now, weave the two together select distinct i.storenum , isnull(j.plsales,0.00) as PLSales , isnull(k.plshrink, 0.00) as PLShrink , isnull(cast(j.plsales as float)/ cast(k.plshrink as float), 0.00) as 'Math' from mystores as i left outer join ( SELECT s.StoreNum, sum(a.YTD_Sales) as 'PLSales' FROM mystores as s left join myaccounts as a on s.StoreNum = a.StoreNum WHERE a.AcctKey >= 80000 and a.AcctKey <= 85000 group by s.StoreNum ) j on i.storenum = j.storenum left outer join ( SELECT s.StoreNum, sum(b.YTD_Sales) as 'PLShrink' FROM mystores as s left JOIN myaccounts as b on s.storenum = b.storenum WHERE b.AcctKey >= 95000 and b.AcctKey <= 99999 group by s.StoreNum ) k on i.storenum = k.storenum --on j.storenum = k.storenum order by storenum ;
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

terrancefisher avatar image
terrancefisher answered
All I can say is AWESOME Stuff!!! Off the topic question for you: How did you get that good with SQL? Trial and error? Books? Videos? By trade, I'm a .Net developer with some background in SQL, but nowhere near the expertise you have. Normally, there's a DBA that handles all the SQL stuff, but we don't have one here at this place of business. THANKS SO MUCH!!!
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.