sujafiza asked

Generating groupID in T-SQL

I have the following table in SQL Server 2008, and I would like to assign a group to every row such that no two rows in the same group share an Id or a Name. If either the Id or the Name repeats, the row should go into the next group; if both are different, the rows can stay in the same group. (Note that `Group` is a reserved word and needs to be bracketed.)

    create table a(Id int, Name varchar(10), [Group] int not null default 0)
    insert into a (Id, Name)
    values (10,'a'),(10,'b'),(20,'a'),(20,'b'),(30,'d'),(40,'e'),(10,'a'),(50,'a'),(20,'f')

Can anybody suggest an optimal way of doing this? The expected output is something like the table below. Since Id 10 is already present in group 1, the next instance of 10 should fall under group 2; the same rule applies to the Name column.

    Id    Name    Group
    10    a       1
    30    d       1
    40    e       1
    20    f       1
    10    b       2
    20    a       2
    20    b       3
    10    a       3
    50    a       4

Thanks,
Susan
sqlquery
4 comments

Kev Riley commented:
If you could post what you expect as the output, that would really help! Also, the way you have asked the question ("name repeats") suggests some order to the data. Is there some assumed or implied order?
Kev Riley commented:
This seems random - why doesn't '20 b' go in to group 1 instead of '20 f'?
Kev Riley commented:
Also, you've said "the next instance of 10". Again this implies some order. What order?
sujafiza commented:
There is no order to follow; even '20 b' could go into group 1. It's just that within a group the same Id or the same Name should not exist.

1 Answer

Kev Riley answered

This has had me scratching my head for a few days.

First I thought I could do this with a variation on 'Gaps and Islands'; then I thought it was a variation on 'Bin Packing'. In the end I've resorted to a row-by-row approach. I'd be very interested to see a set-based solution to this.

To deal with the data one row at a time, I had to introduce something that made each row unique, hence the CTE with a UniqueId. Once you've got that, it's a simple loop until every row has had a group assigned.

    if object_id('a','U') is not null drop table a
    create table a(Id int, Name varchar(10), [Group] int not null default 0)
    insert into a (Id, Name)
    values (10,'a'),(10,'b'),(20,'a'),(20,'b'),(30,'d'),(40,'e'),(10,'a'),(50,'a'),(20,'f')

    -- Loop until every row has been assigned a group
    while exists (select top 1 Id from a where [Group] = 0)
    begin
        -- The CTE adds a UniqueId so we can address one row at a time
        with r as (select *, row_number() over (order by (select null)) as UniqueId from a)
        update a
        set [Group] =
            case
                -- Lowest already-used group that contains neither this Id nor this Name
                when (select min([Group]) from r a2
                      where [Group] > 0
                        and not (a2.Id = a.Id or a2.Name = a.Name)
                        and [Group] not in (select [Group] from r a2
                                            where [Group] > 0
                                              and (a2.Id = a.Id or a2.Name = a.Name))) is not null
                then (select min([Group]) from r a2
                      where [Group] > 0
                        and not (a2.Id = a.Id or a2.Name = a.Name)
                        and [Group] not in (select [Group] from r a2
                                            where [Group] > 0
                                              and (a2.Id = a.Id or a2.Name = a.Name)))
                -- Otherwise open a new group
                else (select max([Group]) from r) + 1
            end
        from r a
        where UniqueId in (select top 1 UniqueId from r where [Group] = 0 order by UniqueId)
    end

    select * from a

returns

    Id          Name       Group
    ----------- ---------- -----------
    10          a          1
    10          b          2
    20          a          2
    20          b          1
    30          d          1
    40          e          1
    10          a          3
    50          a          4
    20          f          3
    
    (9 row(s) affected)
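[Editor's note: the row-by-row T-SQL above is a greedy assignment: each ungrouped row takes the lowest existing group that contains neither its Id nor its Name, otherwise a new group is opened. As an illustration only (not part of the original answer), here is that same logic sketched in Python; processing the rows in insertion order reproduces the grouping shown above.]

```python
# Greedy grouping: lowest group with no Id/Name conflict, else a new group.
rows = [(10, 'a'), (10, 'b'), (20, 'a'), (20, 'b'),
        (30, 'd'), (40, 'e'), (10, 'a'), (50, 'a'), (20, 'f')]

groups = []  # groups[g] = (set of Ids, set of Names) already in group g+1
result = []  # (Id, Name, Group) triples

for rid, name in rows:
    # Find the first existing group that conflicts with neither Id nor Name
    for g, (ids, names) in enumerate(groups):
        if rid not in ids and name not in names:
            break
    else:
        # Every existing group conflicts: open a new one
        groups.append((set(), set()))
        g = len(groups) - 1
    groups[g][0].add(rid)
    groups[g][1].add(name)
    result.append((rid, name, g + 1))

for r in result:
    print(r)
# Groups come out as 1,2,2,1,1,1,3,4,3 -- matching the T-SQL output
```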
1 comment

sujafiza commented:
Thanks, Kev, for the solution. I was also thinking there might be a loop-free solution to this problem.
