Problem with Distributed Transactions after a Cluster Failover

I have 4 servers in two active/passive clusters: Cluster A with servers A1 and A2, and Cluster B with servers B1 and B2. All servers run SQL Server 2005 Enterprise Edition SP2 on Windows 2003. My problem is occurring on Cluster B. When I fail over to B2, Distributed Transactions fail for about 20 minutes for all linked server queries from Cluster A to B and from Cluster B to A. Linked server queries without Begin Distributed Transaction work fine right after the failover. After about 20 minutes, without any changes being made, Distributed Transactions start working again. When I fail back to B1, Distributed Transactions work immediately.

The MSDTC settings (in Component Services) are the same on both B1 and B2. The error messages are usually any of the standard messages received from a linked server. See the link below.

http://support.microsoft.com/kb/306212

Any help would be greatly appreciated.


asked Oct 14 '09 at 11:24 AM in Default


Sylvester

Sylvester, if this is the problem, please come back and select this answer.
Oct 15 '09 at 09:57 AM graz ♦

3 answers

The order in which the groups move shouldn't cause a problem. When SQL Server starts up, are there any DTC-related errors in the logs? How about after DTC starts working?

While the distributed transactions are failing, is the failed transactions counter going up on Cluster B?

Which check boxes do you have checked in the DTC Security area on the two clusters?


answered Oct 19 '09 at 02:08 AM


mrdenny

This weekend during our maintenance window, I was able to test again and I still ran into the same issues. I tried turning on DBCC TRACEON (3604, 7300) and didn't receive any additional information. Error message below.

OLE DB provider "SQLNCLI" for linked server "linked server name" returned message "No transaction is active.".
Msg 7391, Level 16, State 2, Line 2
The operation could not be performed because OLE DB provider "SQLNCLI" for linked server "linked server name" was unable to begin a distributed transaction.
Oct 21 '09 at 06:42 PM Sylvester

When SQL Server starts, there are no errors related to DTC. If I weren't running a query that uses Distributed Transactions, I would not be able to tell whether it's working. Nothing exists in any logs except the start and stop commands. The only other time I see a reference to DTC is when I stop and start the service while troubleshooting.

I'm assuming the failed transactions counter you are referring to is in Transaction Statistics? If so, I can't be 100% sure that the counter was increasing. I wasn't monitoring it.
Oct 21 '09 at 06:44 PM Sylvester
My settings are as follows:
Network DTC Access - checked, Allow Remote Clients - checked
Allow Inbound - checked
Allow Outbound - checked
Incoming Caller Authentication Required - selected
Enable Transaction Internet Protocol Transactions - checked
Enable XA Transactions - checked (This option wasn't originally enabled. I just checked it recently while trying to resolve the issue.)
DTC Logon Account - NT Authority\Networkservice
Oct 21 '09 at 06:45 PM Sylvester
Is there any additional logging I can turn on to help identify the problem? The next time I fail over, I will keep an eye on the failed transactions counter if you can confirm where I should look for it. I also want to confirm that the DTC names and IPs can see each other while I'm experiencing the problem. The last time I checked, they could see each other, but the problem no longer occurred when I tried running the Distributed Transaction. Should the DTC logon account be a domain account instead of a system account? If there is anything else I should look out for, please let me know.
Oct 21 '09 at 06:45 PM Sylvester
There's no extra logging you can turn on. On the Security tab, disable the authentication. Authentication isn't supported on a cluster. In Component Services, navigate down to Transaction Statistics (it might be named something else, I'm looking at 2008). Do you see the failed transactions counter going up? Do you have a firewall between the SQL cluster and the other machines?
Oct 22 '09 at 02:50 AM mrdenny
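
The connectivity questions raised in the comments above (whether the DTC network names can reach each other, and whether a firewall sits between the clusters) could be checked with a small script run from each node during the failure window. The following is only a minimal sketch in Python: the host names in the usage note are hypothetical placeholders, and port 135 is assumed because it is the RPC endpoint mapper that MSDTC relies on. A successful connection does not prove that distributed transactions will work, since MSDTC also uses dynamically assigned RPC ports.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Name-resolution failures, refusals and timeouts all count as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # socket.gaierror and socket.timeout are both subclasses of OSError.
        return False


def check_dtc_hosts(hosts, port=135, timeout=3.0):
    """Map each host name to whether the given port is reachable.

    Port 135 (RPC endpoint mapper) is an assumption for this sketch.
    """
    return {host: can_connect(host, port, timeout) for host in hosts}
```

For example, `check_dtc_hosts(["CLUSTERA-DTC", "CLUSTERB-DTC"])` (hypothetical DTC network names) returns a dictionary of name to True/False. Running it right after a failover and again once transactions recover would show whether basic name resolution and RPC reachability change during the 20-minute window.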

It sounds like MSDTC isn't starting immediately on the failover or is waiting for other services to get established before starting.


answered Oct 15 '09 at 06:01 AM


RickD

When the Distributed Transactions were failing, I double checked Cluster Administrator to make sure the DTC was online and it was. I verified that the Distributed Transaction Coordinator was running and it was. I then stopped and started the DTC service and it still did not resolve the issue.

I was wondering whether the order in which the groups are moved over to the new node matters. For example, I generally move the SQL Server group over first and then the DTC group. Is that causing a problem?
Oct 15 '09 at 02:35 PM Sylvester

I had exactly the same issue, SQL could not start a distributed transaction after a successful cluster node failover, even though the MSDTC appeared to be up-and-running. I found a hint to the issue here: link text

I added the SQL Service account to the local Administrators group on the server, and then it all worked fine. You may have to uninstall and reinstall MSDTC on the cluster nodes to get everything into a working state first.

answered Nov 03 '11 at 06:37 PM


Paul Bell


asked: Oct 14 '09 at 11:24 AM

Seen: 3530 times

Last Updated: May 17 '13 at 02:06 AM