|
Can anyone help me find duplicates in a table across every column?
|
|
This is a very vague question that could get a really complicated solution. If you simply want to locate a duplicate row then you will need to use something like:
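A minimal sketch of that kind of query, assuming a hypothetical temp table `#demo` with two columns `col1` and `col2` (the names are invented for illustration only): group on every column and keep the groups that occur more than once.

```sql
-- Hypothetical demo table; table and column names are assumptions.
CREATE TABLE #demo (col1 int, col2 int);

INSERT INTO #demo (col1, col2)
VALUES (1, 1), (1, 1), (2, 5), (3, 1);

-- Group on every column; any group with more than one row is a duplicate.
SELECT col1, col2, COUNT(*) AS occurrences
FROM #demo
GROUP BY col1, col2
HAVING COUNT(*) > 1;

DROP TABLE #demo;
```

For a real table, every column that defines "the same row" would go in both the SELECT list and the GROUP BY.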
Resolving the duplicates will be a whole new piece of work.
+1: Stellar effort, considering...
May 18 '10 at 08:06 AM
Kev Riley ♦♦
|
|
There are a number of ways to solve this using T-SQL. The best these days seem to revolve around using ROW_NUMBER(). The key is to understand the basic concept: you need a method to uniquely identify the row, then a way to mark duplicate values for that unique identifier, and then a mechanism to remove those duplicates. While this sounds like three steps, you should be able to do all of this in a single query.
Grant, have you got any examples of the ROW_NUMBER() option please? I started off thinking that way but then decided I would want to see all rows, to pick which row I wanted to call the duplicate, i.e. rows where the ROW_NUMBER values are 1-n, not 2-n ... that led me to the nested query solution. :)
May 18 '10 at 09:38 AM
Fatherjack ♦♦
This is from Simple-Talk...
May 18 '10 at 09:45 AM
Grant Fritchey ♦♦
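For reference, a minimal sketch of the ROW_NUMBER() approach described in this answer. This is not the Simple-Talk code; the temp table `#demo`, its columns `col1` and `col2`, and the alias `nr` are assumptions made for illustration.

```sql
-- Hypothetical demo table; names are assumptions for illustration.
CREATE TABLE #demo (col1 int, col2 int);

INSERT INTO #demo (col1, col2)
VALUES (1, 1), (1, 1), (1, 1), (2, 5);

-- Number the rows within each group of identical values.
-- nr = 1 is the copy to keep; nr > 1 marks the extras.
WITH numbered AS (
    SELECT col1, col2,
           ROW_NUMBER() OVER (PARTITION BY col1, col2
                              ORDER BY (SELECT NULL)) AS nr
    FROM #demo
)
SELECT col1, col2, nr
FROM numbered
WHERE nr > 1;

DROP TABLE #demo;
```

Dropping the WHERE clause lists every row with its number (1-n rather than 2-n), which is useful when you want to inspect all candidates before deciding which one to keep.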
Right, I see. I was thinking that the values in other columns might justify the row where nr=3 as the one to keep, so rows 1, 2 and 4 get deleted via the application... Thanks.
May 18 '10 at 10:05 AM
Fatherjack ♦♦
If you have to get into making judgement calls, there's really no way to automate it. Generally you have to define a mechanism for identifying what is a duplicate and then eliminate the extras. If that mechanism is "let me look at it" then...
May 18 '10 at 10:25 AM
Grant Fritchey ♦♦
|
|
I hope that I understand the question correctly. The task is to find the duplicate records across all columns in the table. I will also provide a sample of how to quickly delete all such duplicates. Let's create a heap table and insert some records in it (including some duplicates):
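A sketch of such a setup, matching the row counts described in this answer. The table name `#t` comes from the answer; the column names `col1` and `col2` and the `int` types are assumptions.

```sql
-- Heap table (no clustered index); column names and types are assumed.
CREATE TABLE #t (col1 int NOT NULL, col2 int NOT NULL);

INSERT INTO #t (col1, col2)
VALUES (1, 1), (1, 1), (1, 1), (1, 1),  -- four copies
       (2, 5), (2, 5),                  -- two copies
       (3, 1),                          -- unique
       (4, 6), (4, 6);                  -- two copies
```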
Now we have 4 occurrences of (1, 1); 2 occurrences of (2, 5); (3, 1) does not have any duplicates; and we also have 2 occurrences of (4, 6). Here is the script to quickly identify all the duplicates:
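One way such a script might look, assuming `#t` has exactly two columns, here called `col1` and `col2` (hypothetical names), and holds the rows listed in this answer:

```sql
-- Assumes #t(col1 int, col2 int) exists and holds the rows described above.
-- Group on every column and report the groups that occur more than once.
SELECT col1, col2, COUNT(*) AS occurrences
FROM #t
GROUP BY col1, col2
HAVING COUNT(*) > 1
ORDER BY col1, col2;

-- With the rows described above this returns three groups:
--   (1, 1) x 4, (2, 5) x 2, (4, 6) x 2
-- (3, 1) is absent because it occurs only once.
```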
Here is the result of the query above:
Suppose we want to get rid of all duplicates while preserving all unique rows. In other words, the end result is expected to have #t with one (1, 1) record, one (2, 5) record, one (3, 1) record, and one (4, 6) record. The statement to do this can be like this:
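A sketch of one such statement, using ROW_NUMBER() inside a CTE and deleting through the CTE. The column names `col1` and `col2` and the alias `nr` are assumptions.

```sql
-- Assumes #t(col1 int, col2 int) exists and holds the rows described above.
-- Number the copies within each group of identical rows; the ordering inside
-- a group is arbitrary because the copies are indistinguishable.
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY col1, col2
                              ORDER BY (SELECT NULL)) AS nr
    FROM #t
)
-- Deleting from the CTE removes the underlying rows of #t.
DELETE FROM numbered
WHERE nr > 1;
```

Because the CTE reads from a single table, SQL Server applies the delete to the rows of `#t` itself, so no join back on a key is needed.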
The above will delete all the duplicates, leaving exactly one copy of each distinct row.
|


Can you provide a sample of your data?
Can you provide the name of the column you want to check for duplicates?