question

sqlLearner 1 avatar image
sqlLearner 1 asked

Remove Duplicates

I am trying to remove duplicates from a table. Not all the students have duplicates though. I will provide 2 scenarios below: Any sugestions? I am using this query but it just removes the duplicate ids. I need more logic based on status and year. delete FROM ##temp_test WHERE ##temp_test.%%physloc%% NOT IN (SELECT MIN(##temp_test.%%physloc%%) FROM ##temp_test GROUP BY id) The first scenario I would want the record that has STATUS = Final AND the most recent YEAR. (This is the 3rd record of id 1) The second scenario since there IS NO Final I just will take the record with the most recent YEAR (This is the second record of id 2) > Scenario 1 > ID NAME CLASS Status YEAR > 1 Joe Math Final 2010 > 1 Joe Math Initial 2010 > 1 Joe Math Final 2011 > Scenario 2 > 2 Bob Eng Initial 2011 > 2 Bob Eng Initial 2010
tsqlduplicates
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·
Just to clarify, what about the scenario 3 Kev Physics Initial 2010 3 Kev Physics Final 2010 3 Kev Physics Initial 2011 should that give you the 2nd record (as it is a Final) or the 3rd record (as that is from the latest year)?
0 Likes 0 ·
sqlLearner 1 avatar image sqlLearner 1 commented ·
Initial 2011..Year takes precedence over status
0 Likes 0 ·
Kev Riley avatar image
Kev Riley answered
This example shows how to use `row_number()` to rank based on various criteria. I partition each id, and then order descending by year first (as you stated it has precedence) and then order descending by a calculated value that is either 1 if the status is 'Final' or 0 otherwise. I then choose the row number 1's from that data set and I've eliminated the dupes. declare @temp_test table ( id int, name varchar(10), Class varchar(10), status varchar(10), [Year] int) insert into @temp_test select 1,'Joe','Math','Final',2010 insert into @temp_test select 1,'Joe','Math','Initial',2010 insert into @temp_test select 1,'Joe','Math','Final',2011 insert into @temp_test select 2,'Bob','Eng','Initial',2011 insert into @temp_test select 2,'Bob','Eng','Initial',2010 insert into @temp_test select 3,'Kev','Physics','Initial',2010 insert into @temp_test select 3,'Kev','Physics','Final',2010 insert into @temp_test select 3,'Kev','Physics','Initial',2011 select * from ( select *, rn=row_number()over(partition by id order by year desc, case when [status] = 'Final' then 1 else 0 end desc) from @temp_test )a where a.rn = 1
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Håkan Winther avatar image
Håkan Winther answered
You can also use this solution to delete the duplicates: ;WITH cte AS( SELECT *, ROW_NUMBER() OVER(PARTITION BY col1, col2 ORDER BY colN) AS rowcnt FROM yourtable ) DELETE FROM cte /* --remove this comment mark if you want the deleted records to end up in an archive table OUTPUT DELETED.col1, DELETED.col2, DELETED.colN INTO myArchive(col1,col2,colN ) */ WHERE rowcnt >1 Read more at this question: [ http://ask.sqlservercentral.com/questions/48287/identify-and-remove-duplicates-help.html][1] [1]: http://ask.sqlservercentral.com/questions/48287/identify-and-remove-duplicates-help.html
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.