x

Troubleshooting failover cluster problem in W2K8 / SQL05

I have an active/passive W2K8 (64) cluster pair, running SQL05 Standard. Shared storage is on a HP EVA SAN (FC).

I recently expanded the filesystem on the active node for a database, adding a drive designation. The shared storage drives are designated as F:, I:, J:, L: and X:, with SQL filesystems on the first 4 and X: used for a backup destination.

Last night, as part of a validation process (the passive node had been offline for maintenance), I moved the SQL instance to the other cluster node. The database in question immediately moved to Suspect status.

Review of the system logs showed that the database would not load because the file "K:\SQLDATA\whatever.ndf" could not be found. (Note that we do not have a K: drive designation.)

A review of the J: storage drive showed zero contents -- nothing -- this is where "whatever.ndf" should have been.

Hmm, I thought. Problem with the server. I'll just move SQL back to the other server and figure out what's wrong..

Still no database. Suspect. Uh-oh. "Whatever.ndf" had gone into the bit bucket.

I finally decided to just restore from the backup (which had been taken immediately before the validation test), so nothing was lost but a few hours of sleep.

The question: (1) Why did the passive node think the whatever.ndf files were supposed to go to drive "K:", when this drive didn't exist as a resource on the active node?

(2) How can I get the cluster nodes "re-syncd" so that failover can be accomplished?

I don't know that there wasn't a "K:" drive as a cluster resource at some time in the past, but I do know that this drive did not exist on the original cluster at the time of resource move.

more ▼

asked Jan 20, 2010 at 03:44 PM in Default

user-981 (google) gravatar image

user-981 (google)
11 1 1 1

(comments are locked)
10|1200 characters needed characters left

2 answers: sort voted first

In a cluster you can only have data/log files on cluster drives, so it is not possible to place files on network drives or local drives.

And the info about where to find the files of a database is saved in master. Is it possible that someone changed it manually? OR it may be something about driveletter/mountpoints in the cluster perhaps?

more ▼

answered Jan 21, 2010 at 05:39 AM

Steinar gravatar image

Steinar
1.7k 3 4 6

(comments are locked)
10|1200 characters needed characters left

(1) Any chance that K: could have been a mapped network drive? If it was mapped on node1 then it would have been local to that node only; failing over to node2 where there was no such mapping would then cause the issue. If node1 was then restarted but the network drive hadn't been mapped permanently, then K: would have disappeared forever

Just a random thought....

(2) not sure what you mean by 're-sync' the nodes? The nodes just run the SQL engine, the data should be shared on cluster resources in your SAN

more ▼

answered Jan 20, 2010 at 06:12 PM

Kev Riley gravatar image

Kev Riley ♦♦
53k 47 49 76

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1948
x92
x17

asked: Jan 20, 2010 at 03:44 PM

Seen: 2925 times

Last Updated: May 17, 2013 at 02:35 AM