Hello, I am trying to remove duplicate records from a query and would like to remove them within a LEFT JOIN. I have two tables in the JOIN, one with unique records (table A) and I am joining it to table B using the primary key of table A where that key may exist in table B on more than one record. All records in Table B a unique record created date and I need to link to table B on the record that has the MAX created date. From here I have to join table C to table B using the primary key of table B (and the record of table B that had the MAX created date) as table C has data I need in my result set. I have tried may different methods (except a temp table as I am trying to do this without creating one), all with no luck. Does anyone have any suggestions?
You didn't mention the version of sql server. As of 2005, you can use windowing functions to achieve what you want. You would assign a [ROW_NUMBER()] partitioning by the primary key, ordering by the date column descending. This will give the newest record a rownumber of 1. You put all this into a CTE/sub-query and join to table c on the PK and rownumber 1. I have kept this theoretical, as judging by your username and question, I am guessing this is a homework question. If you want some more heelp, supply us with what you have in t-sql so far and we can take it from there. :
WilliamD probably has the best answer, but I often find it useful to consider more than one solution even when confident you have the best one. Another avenue would be select in the join only where there does not exist another matching record with a greater date. IN a very general sense it would look like select [colList] from TableA as a left join TableB as b on
b.fk and not exists (select * from tableB b2 where
b.fk and b2.date > b.date) left join TableC as c on
c.fk Of course, this assumes that the combination of fk and date in tableB is unique, otherwise you may still get multiple rows, but from your description this sounds like a safe assumption to make
William's answer is clearly the best way to accomplish the question objective, but just for the sake of examining all available options, you can consider another way of getting only one record per Key in TableA. This is not as elegant as the method with using the row_number() windowing funcition though: select a.KeyOfTableA, b.KeyOfTableB, b.CreatedDate, c.SomeColumn from TableA a outer apply ( select top 1 KeyOfTableB from TableB where KeyOfTableA = a.KeyOfTableA order by CreatedDate desc ) b left join TableC c on b.KeyOfTableB = c.KeyOfTableB; Due to the usage of the top with order by in the ITVF, it is very possible that this method emits a very strong and offensive odor as far as the query performance is concerned, but the reality is that there is no single best way to write T-SQL queries, and sometimes it does take to explore different possibilities and test, test, test in order to find the best one. Oleg