Dave Myers asked

Ghost process

For the last couple of weeks we have had what appears to be a ghost reindex job trying to kick off on this server, even though there is currently no reindex job on this server. We usually know when it kicks off because we get a failure alert through email:

> DESCRIPTION:[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure
> COMMENT: The Optimization Job has failed, Review the Job History and logs to determine the error and specific remediation
> JOB RUN: (None)

When I look at the SQL Server log, I see no job attempting to run and no manually started DBCC reindex process. But I do see the following:

> 07/01/2010 10:23:33,spid2s,Unknown,SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [E:\SQLLogs\Commerce_log.ldf] in database [Commerce] (9). The OS file handle is 0x00000938. The offset of the latest long I/O is: 0x000000004e4800
> 07/01/2010 10:23:33,spid2s,Unknown,SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [E:\SQLLogs\SmartPay_log.ldf] in database [SmartPay] (5). The OS file handle is 0x00000914. The offset of the latest long I/O is: 0x000000079c4c00

I looked at the Application Event Log for the same time frame and I see the following:

> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] OnlineThread: QP is not online.,Failover,1073760843,,HLSQLSRV02N1
> 07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] printODBCError: sqlstate = 08S01; native error = 2746; message = [Microsoft][SQL Native Client]Communication link failure,Failover,1073760843,,HLSQLSRV02N1

I am not sure what this means, but it almost looks as though the server wants to fail over (although it never actually does). I do not know whether this is a result of the ghost process or the cause of it. I speculate that a process is stuck in cache somewhere and that failing the cluster over might release it. If anybody has experienced this or can add anything to my thought process, I would appreciate it.
optimization failover process
5 comments

How much of the environment is virtualised? What's the storage - local disk, NAS, SAN? Is it a regular occurrence? What else is happening when this happens?
No virtual servers. Storage is on a SAN. This just started happening about 2 weeks ago, and each occurrence lasts about 5-10 seconds.
I've seen problems where backup systems cause I/O freezes for a few seconds - and the longer these freezes get, the more, erm, interesting the error messages become.
I must be missing something - but how do you know this has anything to do with DBCC DBREINDEX at all? And what server type / service pack are you on?
Matt, we speculate that it is that process based on the alert message received (see the first quoted block). This particular server is running Windows Server 2003 SP2 and SQL Server Standard Edition SP2 in a clustered environment.

1 Answer

David Wimbush answered
Just chucking out ideas here but: Is auto-shrink on for this database? You could also look in the SSMS disk space usage report for grow or shrink events that correspond with the times you've had the error. If it's a big database and it autogrows by X% or autoshrinks, it can be quite a long operation.
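If it helps with matching grow or shrink events against the error times, here is a rough Python sketch of that correlation step. It is only an illustration under assumptions: the helper names `parse_log_time` and `events_near_errors` are made up, the timestamp format is guessed from the log lines quoted in the question (month/day versus day/month is not clear from the post), the 120-second window is arbitrary, and the grow/shrink times are hypothetical values you would copy in by hand from the disk usage report.

```python
from datetime import datetime, timedelta

# Assumed format of the first field in the quoted log lines, e.g. "07/01/2010 10:23:33".
LOG_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_log_time(line):
    """Return the timestamp held in the first comma-separated field of a log line."""
    return datetime.strptime(line.split(",", 1)[0].strip(), LOG_TIME_FORMAT)


def events_near_errors(error_lines, event_times, window_seconds=120):
    """Map each distinct error timestamp to the events within the window around it."""
    window = timedelta(seconds=window_seconds)
    error_times = sorted({parse_log_time(line) for line in error_lines})
    return {
        err: [ev for ev in event_times if abs(ev - err) <= window]
        for err in error_times
    }


if __name__ == "__main__":
    # Log lines in the shape quoted in the question (message text shortened here).
    error_lines = [
        "07/01/2010 10:23:33,spid2s,Unknown,SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds",
        "07/01/2010 10:24:13,MSSQLSERVER,Error,[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed",
    ]
    # Hypothetical grow/shrink times, not taken from the original post.
    event_times = [
        datetime(2010, 7, 1, 10, 23, 10),
        datetime(2010, 7, 1, 3, 0, 0),
    ]
    for err, nearby in events_near_errors(error_lines, event_times).items():
        print(err, "->", nearby)
```

An error time with an empty list next to it had no grow or shrink event nearby, which would point away from autogrow/autoshrink and back towards something like the SAN or backup freezes mentioned in the comments.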