We have a merge replication environment set up as follows:
Publisher: Server1, where a single 125 GB database is replicated. The Distributor is configured on the same server, which is hosted in a Hyper-V environment.
Subscribers: approximately 280 servers, configured as push subscriptions.
NOTE: Our environment sees a lot of DML changes during the day, so we run a daily maintenance and backup job on every server. The job has these steps:
Step 1: Check integrity of the complete database
Step 2: Rebuild/reorganise indexes
Step 3: Backup/verify
Step 4: Old backup jobs clean-up
Step 5: Old job history clean-up
This job normally completes in 4-5 hours, although the duration varies from server to server.
The Snapshot Agent runs daily at 00:05 AM.
PROBLEM: Since Sunday night, the backup and maintenance job has been running until morning, and on most servers it is stuck on step 1, dbcc checkdb ('my database'). Clients are also complaining that the application is crashing, that they can't log in, and that it is very slow.
Upon checking, the backup job's spid is waiting on the OLEDB wait type. We do not have linked server connections configured on the Publisher or the Subscribers, except for a few head office servers that we use for major imports, small updates, and pushing out database changes.
When I run the query below, I can see the percentage moving for my dbcc spid, but it doesn't tell me why the job is stuck on this step:
select session_id, percent_complete from sys.dm_exec_requests where percent_complete > 0
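A broader version of that check would also show the current wait type and any blocking session for the request. The following is only a minimal sketch: the filter on the command column and the choice of columns are assumptions for illustration, not output from the affected servers.

```sql
-- Sketch: show wait and blocking details for running DBCC/BACKUP requests.
-- Assumption: the requests of interest report a command starting with
-- 'DBCC' or 'BACKUP' in sys.dm_exec_requests.
SELECT
    r.session_id,
    r.command,
    r.status,
    r.percent_complete,
    r.wait_type,               -- current wait, NULL if not waiting
    r.wait_time,               -- milliseconds spent in the current wait
    r.last_wait_type,
    r.blocking_session_id,     -- 0 when the request is not blocked
    r.start_time,
    t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.command LIKE 'DBCC%'
   OR r.command LIKE 'BACKUP%';
```

Running it a few times in a row should make it possible to see whether wait_time keeps growing on the same wait_type or whether the request is alternating between waits while percent_complete advances.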
It doesn't matter at what time I run the job, or even the single step (dbcc checkdb) on its own: it does not complete.
I also checked with our Windows team: no updates were rolled out over the weekend, and no changes have been made on the application side.
Any suggestions as to what the problem could be?