question

nirakar_jena avatar image
nirakar_jena asked

Data De-duplication with multiple scenario - SQL Server

I have a requirement as below.

I have a table and want to identify the duplicate records and find out the winner records and loser records based on the below criteria and insert the loser records into a different table with parentid .

Criteria as follows :

  1. SSN Match AND (Phonetic Last Name Match or Phonetic First Name Match) In this condition, the SSN should match and there is a phonetic match between either First Name or Last Name. Then max cust_no will be the owner
  2. DOB Match AND (Phonetic Last Name Match or Phonetic First Name Match) AND SSN Match (If not NULL) In this condition, there is a phonetic match between First Name or Last Name and Date of Birth must match and if SSN is not null, then SSN should also match. Then max cust_no will be the owner
  3. DOB Match AND (Phonetic Last Name Match and Phonetic First Name Match) AND SSN is NULL In this condition, there is a phonetic match on First Name and Last Name, Date of Birth must match and SSN must be null. Then max cust_no will be the owner

sample records

Scenario 1 First Name Last Name SSN DOB CUST Id Comments Record 1 John Houy 123 null 10 Duplicates Record 2 Jon H 123 1/1/2001 5

Record 1 John Houy 123456789 null Not Duplicates Record 2 Anna Murrey 123456789 null

Scenario 2 First Name Last Name SSN DOB Comments Record 1 Anna Murrey null 1/1/2001 Duplicates Record 2 Ana Murey null 1/1/2001

Record 1 Anna Murrey 123456789 1/1/2001 Not Duplicates Record 2 Ana Murey 987654321 1/1/2001

Record 1 Anna Murrey 123456789 1/1/2001 Duplicates Record 2 Ana Murey null 1/1/2001

Record 1 Anna Murrey null 1/1/2001 Duplicates Record 2 Ana Mary null 1/1/2001

Scenario 3 First Name Last Name SSN DOB Comments Record 1 Jon Houy null 1/1/1969 Duplicates Record 2 John Hoy null 1/1/1969

Record 1 John Trovolta null 1/1/1969 Not Duplicates Record 2 Jon To null 1/1/1969

I'm planning to use a T-SQL (stored procedure) or SSIS Can anyone please help me to achieve it.

Thank You

sqlssissql-serverstored procedures
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

0 Answers

· Write an Answer

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.