question

Fatherjack avatar image
Fatherjack asked

Mirroring significantly degrades ETL process performance

I have an accounting system which we recently mirrored to our DR centre. Before this the users had an ETL process that extracts key data for analysis and takes about 3 or 4 minutes to complete. Since mirroring has been implemented this has increased to more than 150 minutes. Nothing seems to be stressed - the network is fine, the SQL Servers are fine, CPU doesnt peak, disk access (read/write/transfers per sec) on both servers is active but not at all high. I know the ETL process uses cursors. When running there is a peak of activity (Send Rate 1800KB/s New Trans 1800KB/s)) on the Primary followed immediately by the corresponding activity on the mirror (Restore Rate 1800KB/s as the process starts and then the send rate drops to ~ 50KB/s and New Trans rate of ~ 50KB/s and the mirror has a Restore Rate of the same ~50KB/s. If I suspend mirroring before the process starts the process takes the normal 3 or 4 minutes and then I resume mirroring and it catches up in about 6 minutes. The Send Rate rises to 7-800KB/s and the Restore Rate matches is at 800KB/s. Mirroring is Sync as it is on SQL Server 2005 Std Edition. I have noticed that Log Flushes and Log Waits rise when mirroring is active and the ETL process is executing. No other metrics seem to rise noticeably. Anyone have any ideas why and how I might be able to resolve this please? I am almost at the point of creating a method for the users to suspend and then resume the mirroring themselves.
sql-server-2005performancecursormirroring
8 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·
suspending the mirror in live seems dangerous - you are losing your resilience whilst the ETL runs
0 Likes 0 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
yes but needing to refresh 6 times a day is only possible when it doesnt take over 2 hours. The vulnerability is for less than 10 minutes and the risk is in the ETL target which is wipe + write as far as I understand it so a failure just means we re-execute the job. After we get over the disaster.
0 Likes 0 ·
WilliamD avatar image WilliamD commented ·
Is the ETL Process also 3rd party? Sounds like that could get pushed off onto another box or re-written into a proper ETL process.
0 Likes 0 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
yes, it's all 3rd party code baked into the application. I am describing it as ETL as that is how I believe it works, actual purists may have alternative term for it.
0 Likes 0 ·
Kev Riley avatar image Kev Riley ♦♦ commented ·
Is the ETL the only thing that runs on this db? If so, it doesn't seem that mirroring is the right approach for DR. If not, are you going to prevent 'other' transactions from running whilst the ETL takes place?
0 Likes 0 ·
Show more comments
Blackhawk-17 avatar image
Blackhawk-17 answered
Where I generally see this scenario is when the transactions do processing in the same database; work tables and the like. If this can be offloaded to another dB and only the final results applied you will enhance performance. The issue is 2PC and that is why the throughput is reduced... the primary is waiting on acknowledgements of many small transactions and this time adds up. Check for ETL prep work going on in the Primary. **EDIT** ---------- Your other option is to move to Log Shipping rather than the suspend/resume route. It requires manual failover but the contention is probably removed and you don't have to worry about others forgetting steps that could compromise your recovery objectives. Either way you are in a tricky spot.
5 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

sp_lock avatar image sp_lock commented ·
Have you checked the perfmon counters for mirroring?
1 Like 1 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
bah, 3rd party app - I get no access to configure internals. I figured it was lots of chatter but was hoping for confirmation so thanks for that. off to build some sort of UI to allow users to suspend/resume mirror - they want to do the ETL 5 or 6 times a day.
0 Likes 0 ·
Usman Butt avatar image Usman Butt commented ·
+1. This kind of technique is used by us anyways especially with either BULK-LOGGED OR SIMPLE recovery model staging databases.
0 Likes 0 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
@sp_lock - not actually looked at the perfmon ones, will take a look next time it runs. Thanks
0 Likes 0 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
Log shipping is a possibility but wouldn't provide any better DR protection than a suspend/resume of the mirroring! Yes, it's a PITA!
0 Likes 0 ·
Usman Butt avatar image
Usman Butt answered
Too bad it is the 3rd party process. But as Kev Riley rightly said, this could be dangerous. The matter could get more serious, if a GUI is to be made to suspend and resume (could pile a heck of log records/ bigger send queue with the possibilities like if someone forgot to resume it in time). So to me, the suspend and resume should be starting and ending points of the ETL process respectively. Moreover, one cannot be sure that if this process is taking 3/4 minutes now, then the execution time will remain the same especially when CURSORS are used. Believe me, I had a terrible time while dealing with a job having multiple cursors. So the ten minutes cushion may rise to 30 minutes in a bad case and the data loss could be bigger in magnitude. Which may not be allowed according to the DR policy. I do not know whether it is possible or not, but the best would be to involve all the stakeholders and if a change in 3rd party code can do a better job, then it should be the course of action. I know this is an awkward situation, but suspending/resuming process should be the last resort IMHO. And if is implemented, it should be monitored carefully and documented fully in DR policy to inform all the stakeholder about the risks involved. BTW, I would be very interested to know how it is going to be documented. If you can please share some basic info here OR on my email, I would be very thankful to you.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

geoched avatar image
geoched answered
Use High Performance Mirroring, i.e switch the Db to Async Mirroring mode.
4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·
OP is using standard edition - you need Enterprise for Async
2 Likes 2 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
Yeah, switching to Async involves switching to license from Standard to Enterprise
0 Likes 0 ·
geoched avatar image geoched commented ·
Then try code wrapped in Begin.Trans Commit. This should stop having to go across network for each commit .
0 Likes 0 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
Its a 3rd party app. I have no access to the code. the question is moot now anyway, I solved the issue months ago. Thanks for your input though
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.