
Ghost process

For the last couple of weeks, what appears to be a ghost reindex job has been trying to kick off on this server, even though no reindex job currently exists on it. We usually know when it kicks off because we get a failure alert through email:

DESCRIPTION:[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure COMMENT: The Optimization Job has failed, Review the Job History and logs to determine the error and specific remediation JOB RUN: (None)
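As a first sanity check (a sketch, not part of the original post), you could confirm from msdb whether any Agent job actually executed around the alert time; adjust the date filter to the day in question:

```sql
-- List Agent job outcomes for a given day, run against msdb.
SELECT j.name       AS job_name,
       h.run_date,                  -- int, yyyymmdd
       h.run_time,                  -- int, hhmmss
       h.run_status                 -- 0 = failed, 1 = succeeded
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs       AS j
  ON j.job_id = h.job_id
WHERE h.step_id  = 0                -- job-outcome rows only
  AND h.run_date = 20100701
ORDER BY h.run_time;
```

If this returns no rows for the alert window, the failure message is coming from somewhere other than a scheduled job.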

When I look at the SQL Server log, I see no job process attempting to run, nor any manually generated DBCC reindex process. But I do see the following:

07/01/2010 10:23:33,spid2s,Unknown,SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [E:\SQLLogs\Commerce_log.ldf] in database Commerce. The OS file handle is 0x00000938. The offset of the latest long I/O is: 0x000000004e4800
07/01/2010 10:23:33,spid2s,Unknown,SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [E:\SQLLogs\SmartPay_log.ldf] in database SmartPay. The OS file handle is 0x00000914. The offset of the latest long I/O is: 0x000000079c4c00
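Those 15-second I/O warnings point at storage stalls. One way to quantify them (a hedged sketch, not from the original thread) is to look at the cumulative I/O stall counters SQL Server keeps per database file:

```sql
-- Cumulative I/O stall times per database file since instance start.
-- High io_stall_write_ms on the log files would line up with the
-- long-I/O warnings in the error log.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.io_stall_read_ms,
       vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
ORDER BY vfs.io_stall_write_ms DESC;
```

Snapshotting this before and after an incident shows how much stall time accumulated during the 5-10 second window.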

I looked at the Application Event Log for the same time frame and I see the following:

07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] OnlineThread: QP is not online.,Failover,1073760843,,HLSQLSRV02N1
07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 2746; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1

I am not sure what this means, but it almost appears as though the server wants to fail over (although it never actually does). I do not know whether this is a result of the ghost process or the cause of it. I speculate that a process is stuck in cache somewhere, and that if I were to fail the cluster over it might release that process.
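Before resorting to a failover, it may be worth seeing what is actually executing at the moment the alert fires. A minimal sketch (my suggestion, not from the original post) using the execution DMVs:

```sql
-- Show currently executing requests with their statement text,
-- to identify any "ghost" process before failing the cluster over.
SELECT r.session_id,
       r.status,
       r.command,
       r.wait_type,
       r.wait_time,
       DB_NAME(r.database_id) AS database_name,
       t.text                 AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id > 50;   -- skip most system sessions
```

If something is genuinely stuck, it should show up here with a long wait_time rather than only surfacing as cluster health-check failures.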

If anybody has experienced this or can add anything to my thought process, I would appreciate it.

asked Jul 01, 2010 at 09:10 AM in Default

Dave Myers

How much of the environment is virtualised? What's the storage - local disk, NAS, SAN? Is it a regular occurrence? What else is happening when this happens?
Jul 01, 2010 at 10:12 AM ThomasRushton ♦
No virtual servers. Storage is on a SAN. This just started happening about 2 weeks ago, and it lasts about 5-10 seconds.
Jul 01, 2010 at 10:46 AM Dave Myers
I've seen problems where backup systems cause I/O freezes for a few seconds - and the longer these freezes get, the more, erm, interesting the error messages become.
Jul 01, 2010 at 01:09 PM ThomasRushton ♦
I must be missing something - but how do you know this has anything to do with DBCC DBREINDEX at all? And what server type / service pack are you on?
Jul 01, 2010 at 01:51 PM Matt Whitfield ♦♦
Matt, we speculate that it is the reindex process, based on the alert message received (see above). This particular server is running Windows Server 2003 SP2 and SQL Server Standard Edition SP2 in a clustered environment.
Jul 01, 2010 at 02:13 PM Dave Myers

1 answer
Just chucking out ideas here, but: is auto-shrink on for this database? You could also look in the SSMS disk usage report for grow or shrink events that correspond with the times you've had the error. If it's a big database that autogrows by X% or auto-shrinks, that can be quite a long operation.
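The checks above can be scripted. This is a sketch under the assumption the databases in question are Commerce and SmartPay (the ones named in the error log); the default trace records autogrow/autoshrink as event classes 92-95:

```sql
-- 1) Is auto-shrink enabled on the affected databases?
SELECT name, is_auto_shrink_on
FROM sys.databases
WHERE name IN ('Commerce', 'SmartPay');

-- 2) Autogrow / autoshrink events captured in the default trace.
--    92/93 = data/log file auto grow, 94/95 = data/log file auto shrink.
DECLARE @trace_path NVARCHAR(260);
SELECT @trace_path = path
FROM sys.traces
WHERE is_default = 1;

SELECT te.name            AS event_name,
       t.DatabaseName,
       t.FileName,
       t.StartTime,
       t.Duration / 1000  AS duration_ms
FROM fn_trace_gettable(@trace_path, DEFAULT) AS t
JOIN sys.trace_events AS te
  ON te.trace_event_id = t.EventClass
WHERE t.EventClass IN (92, 93, 94, 95)
ORDER BY t.StartTime DESC;
```

A grow or shrink event whose StartTime lines up with the 10:23-10:24 window would support this theory.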

answered Jul 01, 2010 at 02:28 PM

David Wimbush



Seen: 1946 times

Last Updated: Jul 01, 2010 at 09:10 AM