UPDATE (with a join) performance... can I improve this?


I'm trying to update the columns of one table to contain values copied from a table in a different database.

The update statement I'm using is:

    UPDATE loadbannerImpressions
       SET isp = ip.ipISP,
    	   region = ip.ipRegion,
    	   city = ip.ipCity,
    	   countryCode = ip.countryShort,
    	   country = ip.countryLong
  FROM loadbannerImpressions 
  INNER JOIN ip2location.dbo.ip2locationData ip 
        ON clientIPno BETWEEN ip.ipFrom AND ip.ipTo
     WHERE memberNo = 0

The ip2location.dbo.ip2locationData is a table the contains approx 7 million records with regional data based on IP addresses.
Sample Data:

ipFROM                 ipTO                   countrySHORT countryLONG      
---------------------- ---------------------- ------------ -----------------
0                      33554431               -            -                
33554432               33685503               UK           UNITED KINGDOM   
33685504               50331647               UK           UNITED KINGDOM   
50331648               67108863               US           UNITED STATES    
67108864               67109119               US           UNITED STATES

I've created non-clustered indexes on the ipFrom and ipTo columns as follows: create nonclustered index idx_ipFrom on ip2locationdata (ipFrom) ON FG_INDEXES create nonclustered index idx_ipTo on ip2locationdata (ipTo) ON FG_INDEXES

The loadBannerImpressions table contains approx. 500,000 rows.

The update statement tends to take a long time to execute, and I'm wondering if I'm going about this the wrong way. 500,000 rows, joining with a table of 7,000,000 rows is going to eat some CPU time, but I'm wondering if there's a better way I could be doing this?



more ▼

asked Feb 04, 2010 at 02:24 PM in Default

Bill gravatar image

105 6 6 7

(comments are locked)
10|1200 characters needed characters left

2 answers: sort voted first

Have you examined the execution plans to see which of the indexes are being used? In all likelihood you will probably need to create a compound index using both the IP columns. SQL Server can do index intersection, but it's not that common for it to do so. You may also need to include the columns from the select list to avoid the key lookup. But again, without the execution plan, it's hard to know.

What's the clustered index on ip2locationData?

more ▼

answered Feb 04, 2010 at 02:28 PM

Grant Fritchey gravatar image

Grant Fritchey ♦♦
101k 19 21 74

That's an interesting point. I remembered creating a compound index, which, as it happens, is the clustered index:

CREATE UNIQUE CLUSTERED INDEX [IDX_IP2LocationData] ON [dbo].[IP2LocationData]([ipFROM], [ipTO]) ON [PRIMARY]

The estimated execution plan indicates a table scan on the loadBannerClicks table, and a clustered index seek on the IP2LocationData table. Seems to make sense, based on my limited understanding of these details.
Feb 04, 2010 at 03:12 PM Bill
So, probably you don't need those other indexes at all. What's the index look like on the loadBannerClicks table?
Feb 04, 2010 at 03:39 PM Grant Fritchey ♦♦
There are no indexes on the table loadBannerClicks (or loadBannerImpressions.. sorry, I do the same task for both tables. I was referring to the loadBannerImpressions table in the first post). I didn't expect they'd make much difference, since I'm processing most of the records. I guess I could index the memberNo and/or the clientIPno columns, though. Might that help in this scenario?
Feb 04, 2010 at 03:56 PM Bill
I suspect the clientIPno could speed things up.
Feb 04, 2010 at 04:59 PM Grant Fritchey ♦♦

didn't change the estimated execution plan, but I'll try it out anyway and get back to you.


Feb 04, 2010 at 05:40 PM Bill
(comments are locked)
10|1200 characters needed characters left

In SQL Server 2008 and later it's best to avoid using "UPDATE with join". Use MERGE instead. MERGE is frequently more efficieant and is standard SQL whereas UPDATE...JOIN is proprietary. UPDATE...JOIN also gives unpredictable results if the join criteria is non-unique.

more ▼

answered Feb 04, 2010 at 03:46 PM

David 1 gravatar image

David 1
1.8k 1 3

Good tip. This is SQL Server 2005, however. I'm assuming I should continue as I am...

Feb 04, 2010 at 04:04 PM Bill
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Feb 04, 2010 at 02:24 PM

Seen: 3791 times

Last Updated: Feb 05, 2010 at 10:01 AM