SQL Server - connection timeout

We have some trouble with SQL server. In last two days sql sometimes timeout connection, becasuse there was some querry lock and waiting for finish. Out customer use full erp/crm system simplicor. I think, the problem is with speed of disk on server, actually I don´t have detail specificition of HW on this server.

There is locked query:

 (@refId_Project_parCurrentUserId uniqueidentifier,@lookupProjectCategoryResourceList_parCurrentUserResourceId uniqueidentifier,@ResourceList_parCurrentUserResourceId uniqueidentifier,@WorkQueueEntry_parWorkQueueId uniqueidentifier,@lookupOrganizationMappingList_parCurrentUserOrganizationId uniqueidentifier)
SELECT TOP 1 COUNT([WorkQueueEntry].[Id]) AS [Count] FROM [WorkQueueEntry] AS [WorkQueueEntry] 
WHERE EXISTS (SELECT FROM [Project] AS [refId_Project] 
WHERE [WorkQueueEntry].[refId] = [refId_Project].[Id] 
AND ( (EXISTS ((SELECT [lookupOrganizationMappingList].[refOrganizationId] AS [refOrganizationId] FROM [OrganizationMappingList] AS [lookupOrganizationMappingList] 
WHERE ( (([lookupOrganizationMappingList].[refOrganizationId] = [refId_Project].[refOrganizationId])) AND (([lookupOrganizationMappingList].[refParentOrganizationId] = @lookupOrganizationMappingList_parCurrentUserOrganizationId)))))) AND (( (([refId_Project].[refKeyAccountBusinessPartnerId] = @refId_Project_parCurrentUserId)) 
OR EXISTS ((SELECT [lookupProjectCategoryResourceList].[refMemberResourceId] AS [refMemberResourceId] FROM [ProjectCategoryResourceList] AS [lookupProjectCategoryResourceList] 
WHERE ( (([lookupProjectCategoryResourceList].[refMemberResourceId] = @lookupProjectCategoryResourceList_parCurrentUserResourceId)) 
AND ([lookupProjectCategoryResourceList].[refProjectCategoryAccessTypeId] IN ('c3517c5f-4b12-4610-a851-7c029e957063', '17764463-ef41-41e7-8f8a-797e5da10d52')) AND (([lookupProjectCategoryResourceList].[refProjectCategoryId] = [refId_Project].[refProjectCategoryId]))))) OR EXISTS ((SELECT [lookupProjectResource].[Id] AS [Id] FROM [ProjectResource] AS [lookupProjectResource] WHERE EXISTS (SELECT FROM [ResourceList] AS [ResourceList] 
WHERE [lookupProjectResource].[refResourceId] = [ResourceList].[refGroupResourceId] AND ( (([ResourceList].[refMemberResourceId] = @ResourceList_parCurrentUserResourceId)))) AND ( (([lookupProjectResource].[IsAccessible] = 1)) AND (([lookupProjectResource].[refProjectId] = [refId_Project].[Id]))))) OR (([refId_Project].[refProjectManagerBusinessPartnerId] = @refId_Project_parCurrentUserId)))))) AND ( (([WorkQueueEntry].[refWorkQueueId] = @WorkQueueEntry_parWorkQueueId)))

Any ideas?

It seems to be a case of an orphaned (or stalled) transaction on the SQL Server. By my knowledge this happens in the following cases.

Application Crash

The application crashed and the connections were not closed correctly and SQL Server did not close them by itself either.

Hardware Problem

Possible causes can also be:

  • Low memory
  • I/O issues caused by slow disk drives, arrays
  • Could also be a dying hard drive

How to Recover

When there is an orphaned connection (in your case it was Session ID 73) you can do the following:

  1. Kill the process – Right-click on the connection and select Kill command. That will release the connection and the waiting processes.
  2. Restart the SQL Server – this will obviously kill all the connections
  3. Restart the application server – this will also kill any connections to the SQL Server including the orphaned one

Long Time Solution

The database is like a car. When it is new it somehow runs by itself. But it needs some care after a while. Like a car needs fresh oil or changing worn parts, the database might need the same kind of service.

Diagnostics

You need to check the database health periodically (please check the following links for examples). In particular you need to

  • Monitor hard drive space and health
  • Monitor the overall server performance (RAM, CPU, I/O) – especially I/O is critical for the database
  • Run MS SQL Best Practices Analyzer (Microsoft Baseline Configuration Analyzer 2.0)
  • Monitor other processes that could interfere with the database operations (running antivirus scans, server backups, other applications)

Maintenance

Theses are the typical maintenance tasks that should be done on the database server by the database admin (DBA):

  • Change old/slow hard drives
  • Create/remove indexes – check the queries run by the users or use MS SQL Tuning Advisor
  • Increase RAM
  • Set up correctly other applications (e.g. other SQL Server instances or MS Exchange Server) so they use RAM correctly and don’t fight for it

Links

Diagnostics in SQL Server help detect stalled and stuck I/O operations
How to troubleshoot orphaned connections in SQL Server

What are you thinking about change types of concurrency control to optimistic? Default value in MS SQL server is pessimistic https://technet.microsoft.com/en-us/library/ms189132(v=sql.105).aspx.

I set logging of blocked process in database. A lot of problem is with lock of type Shared https://technet.microsoft.com/en-us/library/ms175519(v=sql.105).aspx

I was used to work with MySQL INNODB previously, which had REPEATABLE READ isolation level with SNAPSHOT - http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html#isolevel_repeatable-read.

On the other hand, MSSQL has a default isolation level READ COMMITED (which is lower isolation level), but without snapshots.

I think shapshots can solve a lot of lock-wait and deadlock issues, becouse all read-only transaction works with their own version of database. To enable snaphoting increases memory consumption, but decreases a lock wait or dead-lock possibility.

When I was working on my first project on MSSQL with a large scale data, I was astonished how badly it worked wih default database settings. I had to use database hints very frequently.

Nice explanation of difference between transaction isolation levels:

I would personally try to use the same isolation level as MySQL. I think it shouldn’t be a risk, becouse of Origam update concurrency exception check (which is made after each update).

How to swich and what to watch - this artickle seems to be usefull: