question

Ronakshah112 avatar image
Ronakshah112 asked

Assign Unique group ID Based on matching all rows

Need to assign a Unique group Id based on the combination of multiple columns values from same table. Values are sometime Unique. Any form of SQL, SSIS, DQS will work for below request. Tried DQS but does not reach to destination. Table has about 2+ millions rows. For example first_name last_name & Email_address values match but PhoneNumber is different. Everything match but customerId might be different. Below are sample data. Final output in last column called finalgroupid. customerid EmailAddress First_Name Middle_Name LastName Address Phone Final Group ID 100 j@A.cooom JNE PIN 1 Wills PL Australia NJ 90210 1111111111 A123 206 JNE PIN 1 Wills PL Australia NJ 90210 A123 206 j@A.cooom JNE PIN 1 Wills PL Australia NJ 90210 1111111111 A123 206 j@A.cooom JNE PIN 1 Wills PLACE Australia NJ 90210 1111111111 A123 206 j@A.cooom JNE PIN 22 OX ST EW #A HURAT MA 30000 1111111111 A123 206 j@A.cooom JNE PIN 22 OX ST EW UNIT A HURAT MA 30000 1111111111 A123 206 j@A.cooom JNE PIN 22 OX ST EW UNIT A HURAT MA 30000 2222222222 A123 206 j@A.cooom JNE PIN C/O I AM THE BEST 35 FABULOUS OUTBACK CT 12345 1111111111 A123 206 j@A.cooom JNE PIN C/O I AM THE BEST 35 FABULOUS OUTBACK CT 12345 2222222222 A123 712 j@A.cooom And PIN 1 MOUNT Everest Frace GA 54358 2222222222 A123 712 j@A.cooom JNE PIN 1 Wills PL Australia NJ 90210 1111111111 A123 791 MA@nose.cooom M H JSN 63 AL RD VALLEY NM 10001 8888888888 A124 791 MA@nose.cooom MAR JSN 21 SUPERHIT CIR BOMBAY HI 11231 8888888888 A124 791 MA@nose.cooom MAR H JSN 63 AL RD VALLEY NM 10001 8888888888 A124 799 MA@hair.cooom M H JSN 63 AL RD VALLEY NM 10001 8888888888 A124 799 MA@hair.cooom MAR H JSN 63 AL RD VALLEY NM 10001 8888888888 A124 799 MA@hair.cooom SEA JSN 21 SUPERHIT CIR BOMBAY HI 11231 8888888888 A124 805 MA@nose.cooom M H JSN 27 W WOOD Mtn VALLEY NM 10000 8888888888 A124 805 MA@nose.cooom MAR H JSN 27 W WOOD Mtn VALLEY NM 10000 8888888888 A124 805 MA@nose.cooom V P C H 21 SUPERHIT CIR BOMBAY HI 11231 8888888888 A124 805 MA@nose.cooom VICTOR HKS 21 SUPERHIT CIR BOMBAY HI 11231 8888888888 A124 805 MA@nose.cooom VICTOR P CAS 21 SUPERHIT CIR BOMBAY HI 11231 8888888888 A124 809 MA@nose.cooom M H JSN 27 W WOOD Mtn VALLEY NM 10000 8888888888 A124 809 MA@nose.cooom MAR STK 21 SUPERHIT CIR BOMBAY HI 11231 8888888888 A124 809 MA@nose.cooom MAR H JSN 27 W WOOD Mtn VALLEY NM 10000 8888888888 A124 815 k@Y.cooom M H JSN 27 W WOOD Mtn VALLEY NM 10000 8888888888 A124 815 k@Y.cooom MAR H JSN 27 W WOOD Mtn VALLEY NM 10000 8888888888 A124 815 k@Y.cooom MAV JSN 21 SUPERHIT CIR BOMBAY HI 11231 8888888888 A124 815 k@Y.cooom MAV JSN 27 W WOOD Mtn VALLEY NM 10000 8888888888 A124 Please help. Thanks in advance.
sqlssismds
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

·
Tom Staab avatar image
Tom Staab answered
I believe the T-SQL DENSE_RANK window function is your best bet for this. You can even include multiple rankings in the same result set. For example, you could use something like this if you want to group on the combination of EmailAddress, Address, and Phone and also on the combination of CustomerId and Phone: SELECT EAPRank = CONCAT('EAP', DENSE_RANK() OVER (ORDER BY EmailAddress, Address, Phone)) , CPRank = CONCAT('CP', DENSE_RANK() OVER (ORDER BY CustomerId, Phone)) , EmailAddress, Address, Phone FROM MyTable ; You can also rank within a sub-group of the data by adding the PARTITION BY clause before the ORDER BY. Use it like you would a traditional GROUP BY.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.