x

merge perfomance improvement

Hi All,

I am using a merge query to do an update(Please find the query below, There are around 30 columns in each table but i have just provided three here)

There are about 30 million records in the source table and 300,000 records in the destination table. There is a index in the joining column in both table. However this query takes forever to complete.Please provide suggestions.

 MERGE DestTable AS Dest
     USING 
         ( SELECT    Col1 ,
                     Col2 ,
                     Col3 ,
                     CHECKSUM(Col1, Col2, Col3) AS [CheckSUM]
           FROM      SourceTable
         ) AS SRC
     ON SRC.Col1 = Dest.Col1
     WHEN MATCHED and SRC.[CheckSum] <> Dest.[CheckSum]
         THEN 
 UPDATE    SET
         Dest.Col2 = SRC.Col2 ,
         Dest.Col3 = SRC.Col3 ,
         Dest.CheckSum = SRC.Checksum;
more ▼

asked Jun 13, 2013 at 11:59 AM in Default

avatar image

aRookieBIdev
2.8k 56 65 71

Any chance that you can provide the execution plan? Also, do you have to use a MERGE statement? I don't see any other action other than update so why can't you use a simple UPDATE statement with a join?

Jun 13, 2013 at 12:57 PM JohnM

hard to say for sure without the execution plan. However, the operation is less likely to get good index use and could lead to scans. It would be better if you could define absolute positives or negatives so that the indexes can be used appropriately.

Jun 13, 2013 at 02:19 PM Grant Fritchey ♦♦

will using a inner join help? i thought the merge does a join in the background?

Jun 13, 2013 at 04:29 PM aRookieBIdev

Again, no execution plan, I'm just guessing. Sorry.

Jun 14, 2013 at 07:13 AM Grant Fritchey ♦♦

Thanks. But the best thing would be to post the .sqlplan file. You can get that by right clicking on the plan and choosing Save as.

Jun 14, 2013 at 08:51 AM Grant Fritchey ♦♦
show all comments (comments are locked)
10|1200 characters needed characters left

2 answers: sort voted first

As @Grant points out, there are several functions that are affecting this query. These are:

  • [dbo].Humana_OrgHub_ValidateDUNS

  • [dbo].Humana_OrgHub_ValidatePhoneNum

  • CHECKSUM

  • SUBSTRING

  • REPLACE

I'm guessing that the ValidateDUNS function is a lookup; the ValidatePhoneNum is probably a stringslicer.

What might be worth doing is something along these lines:

  1. Create a temp table that contains the fields in the base table from the SRC part of the MERGE statement

  2. Populating that table with records that match the DEST table, but without doing the function stuff above

  3. using the temp table as the SRC of your MERGE, which would remain largely unchanged.

The problem with optimising queries that use inline functions is that the query / exectuion plan doesn't show the impact of those individual functions.

more ▼

answered Jun 14, 2013 at 05:32 PM

avatar image

ThomasRushton ♦♦
42k 20 51 53

I've just looked at this in SQL Sentry Plan Explorer. Boy, was I wrong... string slicing all the way.

Jun 14, 2013 at 05:48 PM ThomasRushton ♦♦

thanks , I have used a temp table .. stored the records that will be updated there , along with the calculated fields.The temp table also had the indexes on columns on which the join shall be made. Using this temp table i did an update on the final table.

Guess what... from 7 hours to 2 minutes now !!!!

Thanks a lot every one for helping me out.

Jun 16, 2013 at 11:12 AM aRookieBIdev
(comments are locked)
10|1200 characters needed characters left

The estimated number of rows is 50546400000. Either you're updating all the data in the table, or, your statistics are WAY out of date. Either way, it's going to be difficult to tune a query that is accessing that much information. The table [OrgHub].[dbo].[BusinessRecords] is a heap, meaning it doesn't have a clustered index, and there's no indication of any filtering on that table at all. You're joining to [OrgHub].[stg].[DnBOrgStg].[PK_DnBOrgStg_1] [Dest], but it doesn't have JOIN criteria, which you can see as a warning in the Loops operator in your plan. I would suggest modifying the query to include some type of JOIN information. Also, this operation:

[Expr1004] = Scalar Operator([OrgHub].[dbo].[Humana_OrgHub_ValidateDUNS]([OrgHub].[dbo].[BusinessRecords].[DUNS]))

Is absolutely preventing any sort of index use.

Basically, you need to get some indexes on the BusinessRecords table and you need to more clearly define the JOIN criteria between those tables so that there is some filtering done at the index level. Otherwise, it's trying to update everything.

more ▼

answered Jun 14, 2013 at 10:54 AM

avatar image

Grant Fritchey ♦♦
137k 20 44 81

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x2188
x1149
x310
x4

asked: Jun 13, 2013 at 11:59 AM

Seen: 842 times

Last Updated: Jun 17, 2013 at 04:08 AM

Copyright 2017 Redgate Software. Privacy Policy