# Updating an index with -1 or 9999, what is faster ?

 0 Scenario: A 100M record table A nonclustered index orver an integer column ordered forward. ( but could be a clustered index for any purpose). Action: Set this record as inactive by either updating its value to -1 or 9999 For test purposes lets say 50% of the data will be updated. I know that a huge update like 50% of the data will cause an index fragmentation. The table ( when clustered ) and the index ( nonclustered ) stores data "ordered", lets say, from 0 to 10. So, does updating it to -1 will cause all remaining values to be moved on the index and be slower then updating it to 9999 ( or max int size )? What is better/faster update it to -1 or 9999 ? or it doen't matter ? AND if we consider the defragmentation of the index as a required step ? Edit: A news question: Wont the update to the maximum value ( or 9999 ) affect further work with the table, for example, if 50% of the data is on the last pages of the table, any include/update will be in the middle of the data, so any pages after that could be affected... Edit2: I performed the following tests: ``````set nocount on go drop table temp select convert(int,id) as id into temp from BIG_TABLE create index IDX_id on temp(id ) GO dbcc dropcleanbuffers dbcc freeproccache set statistics io on set statistics time on update temp set id = -1 where id > 500000 and id < 999999999 select @@rowcount set statistics io off set statistics time off declare @c int = 0 while @c < 100000 begin insert temp(id) values ( @c ) select @c = @c + 1 end select @c set statistics io on set statistics time on ALTER INDEX IDX_id ON temp REORGANIZE set statistics io off set statistics time off GO drop table temp select convert(int,registration_id) as id into temp from BIG_REGISTRATIONs create index IDX_id on temp(id ) GO dbcc dropcleanbuffers dbcc freeproccache set statistics io on set statistics time on update temp set id = 999999999 where id > -1 and id <= 500000 select @@rowcount set statistics io off set statistics time off declare @c int = 500000 while @c < 600000 begin insert temp(id) values ( @c ) select @c = @c + 1 end select @c - 500000 set statistics io on set statistics time on ALTER INDEX IDX_id ON temp REORGANIZE set statistics io off set statistics time off GO `````` But i got inconclusive results, the difference was in miliseconds on both update and reorganize index even when a lot of insertions appeared in the middle of the pages.I am starting to guess there is no difference between minimum value ( or -1 ) and maximum value ( 9999 ) when updating an indexed column, What do you think ? more ▼ asked Oct 22, 2009 at 12:07 PM in Default Gustavo 592 ● 4 ● 4 ● 7 Kev Riley ♦♦ 53.8k ● 47 ● 49 ● 76 add new comment (comments are locked) 10|1200 characters needed characters left ▼ Viewable by all users

 0 Have you considered partitioned views? Partitioned tables? more ▼ answered Oct 22, 2009 at 04:36 PM Peso 1.6k ● 5 ● 6 ● 8 I will perform some tests with partitioned tables and post here the results. Thanks. Oct 27, 2009 at 08:46 AM Gustavo Partitioned tables worked fine. Unfortunatelly our production server is not 2008, this way Matt's idea of filtered index would be better. Anyway, just to report, it seems there is no difference between what number you update, since both options have its drawbacks. Nov 04, 2009 at 02:17 PM Gustavo Filtered indexes are a SQL Server 2008 feature. Partiotioned tables works in SQL Server 2005 as well as SQL Server 2008. However, you need the Enterprise Edition. Nov 20, 2009 at 10:12 AM Peso add new comment (comments are locked) 10|1200 characters needed characters left ▼ Viewable by all users
 0 If the index is nonclustered, then I recommend using a filter to exclude inactive records. That way, your index won't even include those values and thus will perform faster for active records. If you don't filter the index, then I think a very high number (beyond reach of valid values) would be better because it might help with certain queries that perform an index scan. I have no real knowledge or testing experience, but that is my gut feeling. Edit: different approach Is there a reason you need this inactive data? If so, is there a reason it can't be moved to a different table? I know this can cause a problem with referential integrity and foreign keys, but I suggestion that might work even if that is the case. This may or may not be worthwhile given your situation. stub table for maintaining foreign keys active records table with foreign key to stub table inactive records table with foreign key to stub table view with stub table and outer joins to include active and inactive records I think most queries could remain unchanged if the current table name is used for the active records. For those queries that need to check just inactive records, change to that table. For queries that need to check both, use the view. Obviously, if you need to check both a lot, this is not a good solution because it will mean using an outer join view a lot.We did something like this on a project a few years ago, and it worked quite well. (I just hope I'm not forgetting any details we discovered along the way.) more ▼ answered Oct 22, 2009 at 03:33 PM Tom Staab 5.8k ● 6 ● 8 ● 10 Thanks Tom, i cant used filtered index because my production environment is SQL server 2005, but this is a good advice anyway. But doesn't a very big number affect the current pages if more and more data will be added and the index pages of the deleted values would need to be reoganizaed frequently ? Oct 22, 2009 at 03:52 PM Gustavo You would need to consider the expected growth rate of the inactive records. I edited my answer to include a new potential solution. Please let me know what you think. Oct 22, 2009 at 04:13 PM Tom Staab I am studing partitioning this table right now, i will perform some tests but as far as i could see the query must indicate what partition to query so it will not be as soft as your view aproach but requires a less structures. Oct 27, 2009 at 08:38 AM Gustavo Nov 20, 2009 at 01:12 PM Peso add new comment (comments are locked) 10|1200 characters needed characters left ▼ Viewable by all users
 0 If you are on SQL Server 2008 then you can create a filtered index - which would ignore values that didn't match the filter. Therefore when you updated the rows, they would simply be removed from the index, which, while leaving spaces, wouldn't fragment it any further. This is, of course, assuming that you don't need the inactive rows to be accessed (i.e. you don't have a query that says 'select the inactive rows').You would then find that the cost of re-organising that index afterwards was significantly reduced. more ▼ answered Oct 22, 2009 at 03:25 PM Matt Whitfield ♦♦ 29.5k ● 61 ● 65 ● 87 Unfortunatelly our production server is 2005, but i agree that it would really be the best solution if "disabled" data is not acesses frequently... Oct 22, 2009 at 03:30 PM Gustavo add new comment (comments are locked) 10|1200 characters needed characters left ▼ Viewable by all users
 0 I'm staring at Kalen Delany's book on the SQL 2005 storage engine, but I don't recall anything in it precisely on point. I suspect that it would be slightly faster to set to 9999 because as rows were moved to new pages, the new pages could be populated without having to update pointers to the first page starting with 0 in your subject column. In other words, after the first page of -1s is populated, it would point to the first page containing 0's as the next page. When a second page for -1s is created, the first page of -1s might have to be altered to point to the second page of -1s rather than the starting page for 0s. The new page of -1s would point to the first page of 0s. However, I'm not sure whether or not such an update would have to happen for each new page, or just for the first page of 9999s. Also, I would expect both pages involved to be in ram at the time of the update, so the performance difference should be miniscule. Fragmentation should be a wash, because I wouldn't expect page splits to occur on the new pages with either value, and the old pages would be fragmented identically. However this is all speculation.My strong suggestion is to create a sample table with a million rows or so rows and conduct some time trials. Nothing beats testing to get a definitive answer. more ▼ answered Oct 22, 2009 at 01:19 PM Bob Hovious 1.6k ● 5 ● 6 ● 9 I performed the tests and edited the main post to include them but unfortunatelly i got no conclusive results. Oct 27, 2009 at 08:40 AM Gustavo add new comment (comments are locked) 10|1200 characters needed characters left ▼ Viewable by all users

### New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

By Email:

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1951
x990
x251

asked: Oct 22, 2009 at 12:07 PM

Seen: 2167 times

Last Updated: Nov 20, 2009 at 09:03 AM