question

YoungZ avatar image
YoungZ asked

T-SQL: Update group of rows based on rowid

Without any type of looping mechanism, I'm trying to update the groupid column of a table with the rowid of the first matched '\' character in the text column. I want to update this table: rowid | groupid | text 1 | NULL | H 2 | NULL | I 3 | NULL | \ 4 | NULL | M 5 | NULL | Y 6 | NULL | \ 7 | NULL | N 8 | NULL | A 9 | NULL | M 10 | NULL | E 11 | NULL | \ To look like this: rowid | groupid | text 1 | 3 | H 2 | 3 | I 3 | 3 | \ 4 | 6 | M 5 | 6 | Y 6 | 6 | \ 7 | 11 | N 8 | 11 | A 9 | 11 | M 10 | 11 | E 11 | 11 | \
sql-server-2005t-sqldeveloper
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image
Oleg answered
This can be done with the quirky update with one minor tweak. Quirky update works only when the table is already clustered in the desired order. Since it is unknown how your actual table is clustered, it would be safe to create a temp table based on the data in the actual data but clustered by the reverse order of the records in that table. Lets assume that the name of the actual table is **dbo.test**, it has 3 columns named **rowid**, **groupid** and, sadly, a column named after keyword **text**. It has records just like the ones shown in the question post. Here is the solution consisting of creating temp table, updating its records with quirky update and finally, updating the actual table from its join with temp table: -- create temp table clustered by the column named N create table #temp_data ( N int not null primary key clustered, rowid int not null, groupid int null, [text] char(1) not null ); go -- insert data into it in the reverse order of -- the data in the original table named dbo.test insert into #temp_data select row_number() over (order by rowid desc) N, * from dbo.test; -- use quirky update to update the records in the temp table. -- this is possible because it is clustered in the desired order declare @groupid int; declare @anchor int; set @groupid = -1; -- this update is applied to properly clustered temp table update #temp_data set @groupid = groupid = case when [text] = '\' then rowid else @groupid end, @anchor = N option (maxdop 1); -- now the actual table can be updated from its join with temp table update t set groupid = temp.groupid from dbo.test t inner join #temp_data temp on t.rowid = temp.rowid; -- check the solution select * from dbo.test; -- results: rowid groupid text ----------- ----------- ---- 1 3 H 2 3 I 3 3 \ 4 6 M 5 6 Y 6 6 \ 7 11 N 8 11 A 9 11 M 10 11 E 11 11 \ Hope this helps, Oleg
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Pavel Pawlowski avatar image
Pavel Pawlowski answered
Here is a solution without any temp table and without quirky updates. Only simple selects with CTE :-) DECLARE @tbl TABLE ( rowid int, groupid int, [text] char(1) ) INSERT INTO @tbl(rowid, groupid, [text]) SELECT 1 , NULL ,'H' UNION ALL SELECT 2 , NULL ,'I' UNION ALL SELECT 3 , NULL ,'\' UNION ALL SELECT 4 , NULL ,'M' UNION ALL SELECT 5 , NULL ,'Y' UNION ALL SELECT 6 , NULL ,'\' UNION ALL SELECT 7 , NULL ,'N' UNION ALL SELECT 8 , NULL ,'A' UNION ALL SELECT 9 , NULL ,'M' UNION ALL SELECT 10 , NULL ,'E' UNION ALL SELECT 11 , NULL ,'\'; WITH Delims AS ( SELECT rowid as DelimRowID FROM @tbl WHERE [text] = '\' ), Data AS ( SELECT *, ROW_NUMBER() OVER(PARTITION BY t.rowID ORDER BY d.DelimRowID) AS RowNo FROM @tbl t LEFT JOIN Delims d ON t.rowid ]
7 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

+1 This is a very good solution using what @Jeff Moden calls a triangular join (earlier records have many matches while later records get fewer and fewer of them until the latest group which only gets 1 match per record, thus the "triangular join" name). It actually looks like you can probably replace LEFT JOIN Delims d ON t.rowid BETWEEN t.rowid AND d.DelimRowID with INNER JOIN Delims d ON t.rowid
0 Likes 0 ·
@Oleg, you are right about the JOIN.. Going to update it. :-) as it's the same :-)
0 Likes 0 ·
Probably the fastest and also secure solution would be a writing a CLR scalar function for ranking the rows.
0 Likes 0 ·
@Pavel Pawlowski Nope, not really. I just tested your solution versus quirky update using a relatively small number of records (30,000). After I inserted these records (preserving the original order) I tested triangular join where I replaced the bottom select with this:
update Data 
set groupid = DelimRowID
where RowNo = 1
This took 3 minutes and 10 seconds. The problem with triangular join is the number of records to consider. True that the where clause excludes unwanted records in the end but still it considers them inside of the CTE, the one which has partitioned row\_number. The number of records to consider for just 30,000 records in the table happens to be over 122 million! (122,742,273 to be exact). This is the downside of the triangular join solutions. When I tested the quirky update solution, it took less than a blink of an eye for the whole thing, including creation of temp table, populating it with 30,000 records, updating it and finally updating the table from its join with that temp table. I will post the data creation part in the next comment due to lack of space.
0 Likes 0 ·
Here is the test data I used:
use AdventureWorks;
go

create table dbo.Test
(
    rowid int not null constraint PK_test primary key clustered, 
    groupid int null, 
    [text] char(1) not null
);
go

insert into dbo.Test (rowid, [text])
select top 30000 
    row_number() over (order by 
        a.[object_id], a.column_id, t.rowid) N, t.[text]
    from 
    (
        select 1 rowid, 'H' [text] union all select 2, 'I'
        union all select 3, '\' union all select 4, 'M'
        union all select 5, 'Y' union all select 6, '\'
        union all select 7, 'N' union all select 8, 'A'
        union all select 9, 'M' union all select 10, 'E'
        union all select 11, '\'
    ) t  
    cross join sys.all_columns a
go
This creates records like this:
rowid groupid text
----- ------- ----
1     NULL    H
2     NULL    I
3     NULL    \
4     NULL    M
5     NULL    Y
6     NULL    \
7     NULL    N
8     NULL    A
9     NULL    M
10    NULL    E
11    NULL    \
12    NULL    H
13    NULL    I
14    NULL    \
15    NULL    M
16    NULL    Y
17    NULL    \
18    NULL    N
19    NULL    A
20    NULL    M

-- etc
0 Likes 0 ·
Show more comments

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.