Remove Duplicates

I am trying to remove duplicates from a table. Not all the students have duplicates though. I will provide 2 scenarios below: Any sugestions? I am using this query but it just removes the duplicate ids. I need more logic based on status and year.

         FROM  ##temp_test
         WHERE ##temp_test.%%physloc%%
         NOT IN (SELECT MIN(##temp_test.%%physloc%%)
          FROM   ##temp_test 
          GROUP BY id)

The first scenario I would want the record that has STATUS = Final AND the most recent YEAR. (This is the 3rd record of id 1)

The second scenario since there IS NO Final I just will take the record with the most recent YEAR (This is the second record of id 2)

Scenario 1
1   Joe   Math  Final   2010
1   Joe   Math  Initial 2010
1   Joe   Math  Final   2011
Scenario 2
2   Bob   Eng   Initial 2011
2   Bob   Eng   Initial  2010
more ▼

asked May 04, 2012 at 05:12 PM in Default

sqlLearner 1 gravatar image

sqlLearner 1
802 36 42 46

Just to clarify, what about the scenario

3   Kev   Physics   Initial 2010
3   Kev   Physics   Final   2010
3   Kev   Physics   Initial 2011
should that give you the 2nd record (as it is a Final) or the 3rd record (as that is from the latest year)?
May 04, 2012 at 05:22 PM Kev Riley ♦♦
Initial 2011..Year takes precedence over status
May 04, 2012 at 05:59 PM sqlLearner 1
(comments are locked)
10|1200 characters needed characters left

2 answers: sort voted first

This example shows how to use row_number() to rank based on various criteria. I partition each id, and then order descending by year first (as you stated it has precedence) and then order descending by a calculated value that is either 1 if the status is 'Final' or 0 otherwise.

I then choose the row number 1's from that data set and I've eliminated the dupes.

declare @temp_test table (
    id int,
    name varchar(10),
    Class varchar(10),
    status varchar(10),
    [Year] int)

insert into @temp_test select 1,'Joe','Math','Final',2010
insert into @temp_test select 1,'Joe','Math','Initial',2010
insert into @temp_test select 1,'Joe','Math','Final',2011
insert into @temp_test select 2,'Bob','Eng','Initial',2011
insert into @temp_test select 2,'Bob','Eng','Initial',2010
insert into @temp_test select 3,'Kev','Physics','Initial',2010
insert into @temp_test select 3,'Kev','Physics','Final',2010
insert into @temp_test select 3,'Kev','Physics','Initial',2011

select *
from (
     rn=row_number()over(partition by id order by year desc, 
                   case when [status] = 'Final' then 1 else 0 end desc)
    from @temp_test
where a.rn = 1
more ▼

answered May 04, 2012 at 08:10 PM

Kev Riley gravatar image

Kev Riley ♦♦
53.8k 47 49 76

(comments are locked)
10|1200 characters needed characters left

You can also use this solution to delete the duplicates:

;WITH cte
SELECT *, ROW_NUMBER() OVER(PARTITION BY col1, col2 ORDER BY colN) AS rowcnt FROM yourtable
--remove this comment mark if you want the deleted records to end up in an archive table
INTO myArchive(col1,col2,colN )
WHERE rowcnt >1

Read more at this question:


more ▼

answered Oct 10, 2012 at 07:19 AM

Håkan Winther gravatar image

Håkan Winther
15.6k 35 37 48

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: May 04, 2012 at 05:12 PM

Seen: 806 times

Last Updated: Oct 10, 2012 at 07:19 AM