Manage database concurrency

  • 10/9/2017

Skill 3.3: Optimize concurrency and locking behavior

SQL Server uses locks to control the effect of concurrent transactions on one another. Part of your job as an administrator is to improve concurrency by properly managing locking behavior. That means you need to understand how to uncover performance problems related to locks and lock escalations. Additionally, you must know how to use the tools available to you for identifying when and why deadlocks happen and the possible steps you can take to prevent deadlocks from arising.

Troubleshoot locking issues

Before you can troubleshoot locking issues, you must understand how SQL Server uses locks, which we describe in detail in Skill 3.1. As part of the troubleshooting process, you need to determine which resources are locked, why they are locked, and the lock type in effect.

You can use the following dynamic management views (DMVs) to view information about locks:

  • sys.dm_tran_locks Use this DMV to view all current locks, the lock resources, lock mode, and other related information.

  • sys.dm_os_waiting_tasks Use this DMV to see which tasks are waiting for a resource.

  • sys.dm_os_wait_stats Use this DMV to see how often processes are waiting while locks are taken.

Before we look at these DMVs in detail, let’s set up our environment as shown in Listing 3-7 so that we can establish some context for locking behavior.

LISTING 3-7 Create a test environment for testing locking behavior

CREATE TABLE Examples.LockingA
(
    RowId  int NOT NULL
        CONSTRAINT PKLockingARowId PRIMARY KEY,
    ColumnText  varchar(100) NOT NULL
);

INSERT INTO Examples.LockingA(RowId, ColumnText)
VALUES (1, 'Row 1'), (2, 'Row 2'), (3, 'Row 3'), (4, 'Row 4');
CREATE TABLE Examples.LockingB
(
    RowId  int NOT NULL
        CONSTRAINT PKLockingBRowId PRIMARY KEY,
    ColumnText  varchar(100) NOT NULL
);

INSERT INTO Examples.LockingB(RowId, ColumnText)
VALUES (1, 'Row 1'), (2, 'Row 2'), (3, 'Row 3'), (4, 'Row 4');

sys.dm_tran_locks

The sys.dm_tran_locks DMV provides you with information about existing locks and locks that have been requested but not yet granted in addition to details about the resource for which the lock is requested. You can use this DMV only to view information at the current point in time. It does not provide access to historical information about locks. Table 3-2 describes each column in sys.dm_tran_locks.

TABLE 3-2 sys.dm_tran_locks

COLUMN

DESCRIPTION

resource_type

One of the following types of resources: DATABASE, FILE, OBJECT, PAGE, KEY, EXTENT, RID, APPLICATION, METADATA, HOBT, or ALLOCATION_UNIT.

resource_subtype

If a resource has a subtype, this column displays it.

resource_database_id

The ID of the database containing the resource.

resource_description

Additional information, if available, about the resource not found in other resource columns.

resource_associated_entity_id

The ID of the entity with which the resource is associated, such as an object ID, HoBT ID, or Allocation Unit ID.

resource_lock_partition

The ID of the lock partition for partitioned lock resource. The value is 0 for a non-partitioned lock resource.

request_mode

The lock mode requested by waiting requests or granted for other requests.

request_type

This value is always LOCK.

request_status

One of the following values to reflect the current status of the request: GRANTED, CONVERT, WAIT, LOW_PRIORITY_CONVERT, LOW_PRIORITY_WAIT, or ABORT_BLOCKERS.

request_reference_count

The approximate number of times the requestor has requested the resource.

request_lifetime

This column is not supported.

request_session_id

The ID of the session that owns the request. An orphaned distributed transaction has a value of -2. A deferred recovery transaction has a value of -3.

request_exec_context_id

The ID of the execution context for the process that owns the request.

request_request_id

The ID of the request for the process that owns the request. This value changes when the active MARS connection for the transaction changes.

request_owner_type

The type of entity that owns the request: TRANSACTION, CURSOR, SESSION, SHARED_TRANSACTION_WORKSPACE, EXCLUSIVE_TRANSACTION_WORKSPACE, or NOTIFICATION_OBJECT.

request_owner_id

The ID of the owner of the request.

request_owner_guid

The GUID of the owner of the request.

request_owner_lockspace_id

This column is not supported.

lock_owner_address

The memory address of the internal data structure that is tracking the request. Join it with the resource_address column in sys.dm_os_waiting_tasks.

pdw_node_id

The ID for node in the Analytics Platform System (formerly known as Parallel Data Warehouse).

Let’s start some transactions to observe the locks that SQL Server acquires. In one session, execute the following statements:

BEGIN TRANSACTION;
    SELECT RowId, ColumnText
    FROM Examples.LockingA
    WITH (HOLDLOCK, ROWLOCK);

In a separate session, start another transaction:

BEGIN TRANSACTION;
    UPDATE Examples.LockingA
        SET ColumnText = 'Row 2 Updated'
        WHERE RowId = 2;

Now let’s use the sys.dm_tran_locks DMV to view some details about the current locks:

SELECT
    request_session_id as s_id,
    resource_type,
    resource_associated_entity_id,
    request_status,
    request_mode
FROM sys.dm_tran_locks
WHERE resource_database_id = db_id('ExamBook762Ch3');

Although your results might vary, especially with regard to identifiers, the DMV returns results similar to the example below. Notice the wait for the exclusive lock for session 2. It must wait until session 1 releases its shared range (RangeS-S) locks that SQL Server takes due to the HOLDLOCK table hint. This table hint is equivalent to setting the isolation level to SERIALIZABLE. SQL Server also takes intent locks on the table (which appears on the OBJECT rows of the results) and the page, with session 1 taking intent shared (IS) locks and session 2 taking intent exclusive (IX) locks.

s_id resource_type resource_associated_entity_id request_status   request_mode
---- -------------  ----------------------------- --------------   --------------
1    DATABASE       0                             GRANT            S
2    DATABASE       0                             GRANT            S
1    PAGE           72057594041729024             GRANT            IS
2    PAGE           72057594041729024             GRANT            IX
1    KEY            72057594041729024             GRANT            RangeS-S
1    KEY            72057594041729024             GRANT            RangeS-S
1    KEY            72057594041729024             GRANT            RangeS-S
1    KEY            72057594041729024             GRANT            RangeS-S
1    KEY            72057594041729024             GRANT            RangeS-S
2    KEY            72057594041729024             WAIT             X
1    OBJECT         933578364                     GRANT            IS
2    OBJECT         933578364                     GRANT            IX

Connect to the ExamBook762Ch3 database containing the resource and use one of the resource_associated_entity_id values from the previous query in the WHERE clause to see which object is locked, like this:

SELECT
    object_name(object_id) as Resource,
    object_id,
    hobt_id
FROM sys.partitions
WHERE hobt_id=72057594041729024;

When you view the results of this latter query, you can see the name of the resource that is locked, like this:

Resource object_id  hobt_id
-------- ---------- -------------------
LockingA 933578364  72057594041729024

In the previous example, you can also see the object_id returned from sys.partitions corresponds to the resource_associated_entity_id associated with the OBJECT resource_type in the DMV.

When troubleshooting blocking situations, look for CONVERT in the request_status column in this DMV. This value indicates the request was granted a lock mode earlier, but now needs to upgrade to a different lock mode and is currently blocked.

sys.dm_os_waiting_tasks

Another useful DMV is sys.dm_os_waiting_tasks. Whenever a user asks you why a query is taking longer to run than usual, a review of this DMV should be one of your standard troubleshooting steps. You can find a description of each column in this DMV in Table 3-3.

TABLE 3-3 sys.dm_os_waiting_tasks

COLUMN

DESCRIPTION

waiting_task_address

The address of the waiting task.

session_id

The ID of the session that owns the task.

exec_context_id

The ID of the execution context of the task.

wait_duration_ms

The total wait time for this wait type in milliseconds. This value in- cludes signal_wait_time_ms.

wait_type

The type of wait.

resource_address

The address of the resource for which the task is waiting.

blocking_task_address

The task that is currently holding the requested resource.

blocking_session_id

The ID of the session that is blocking the request. This column is NULL if the task is not blocked, -2 if the blocking resource is owned by an orphaned transaction, -3 if the blocking resource is owned by a deferred recovery transaction, and -4 if the session ID of the blocking latch owner cannot be determined due to internal latch state transitions.

blocking_exec_context_id

The ID of the execution context of the blocking task.

resource_description

The description of the resource consumed. See https://msdn.micro- soft.com/en-us/library/ms188743.aspx for more information.

pdw_node_id

The ID for node in the Analytics Platform System (formerly known as Parallel Data Warehouse).

In particular, you can use the sys.dm_trans_locks DMV in conjunction with the sys.dm_os_waiting_tasks DMV to find blocked sessions, as shown in Listing 3-8.

LISTING 3-8 Use system DMV sys.dm_tran_locks and sys.dm_os_waiting_tasks to display blocked sessions

SELECT
    t1.resource_type AS res_typ,
    t1.resource_database_id AS res_dbid,
    t1.resource_associated_entity_id AS res_entid,
    t1.request_mode AS mode,
    t1.request_session_id AS s_id,
    t2.blocking_session_id AS blocking_s_id
FROM sys.dm_tran_locks as t1
INNER JOIN sys.dm_os_waiting_tasks as t2
    ON t1.lock_owner_address = t2.resource_address;

Whereas the earlier query showing existing locks is helpful for learning how SQL Server acquires locks, the query in Listing 3-8 returns information that is more useful on a day-to-day basis for uncovering blocking chains. In the query results shown below, you can see that session 2 is blocked by session 1.

res_typ  res_dbid   res_entid            mode  s_id   blocking_s_id
-------  --------   ------------------   ----  ------  -----------------
KEY      27         72057594041729024    X     2      1

Execute the following statement in both sessions to release the locks:

ROLLBACK TRANSACTION;

sys.dm_os_wait_stats

The sys.dm_os_wait_stats DMV is an aggregate view of all waits that occur when a requested resource is not available, a worker thread is idle typically due to background tasks, or an external event must complete first. Table 3-4 explains the columns in sys.dm_os_wait_stats.

TABLE 3-4 sys.dm_os_wait_stats

COLUMN

DESCRIPTION

wait_type

The type of wait. The wait types associated with locks all begin with LCK.

waiting_tasks_count

The number of waits having this wait type. The start of a new wait increments this value.

wait_time_ms

The total wait time for this wait type in milliseconds. This value includes signal_wait_time_ms.

max_wait_time_ms

The highest wait time for this wait type in milliseconds.

signal_wait_time_ms

The amount of time in milliseconds between the time the waiting thread was signaled and the time it started running.

pdw_node_idpdw_node_id

The ID for node in the Analytics Platform System (formerly known as Parallel Data Warehouse

There are many wait types unrelated to locks, so when using the sys.dm_os_wait_stats DMV, you should apply a filter to focus on lock waits only, like this:

SELECT
    wait_type as wait,
    waiting_tasks_count as wt_cnt,
    wait_time_ms as wt_ms,
    max_wait_time_ms as max_wt_ms,
    signal_wait_time_ms as signal_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'LCK%'
ORDER BY wait_time_ms DESC;

The partial results of this query on our computer shown in the following example indicate that our SQL Server instance have the longest waits when threads are waiting for an exclusive (X) lock. On the other hand, the greatest number of waits is a result of waiting for a schema modification (SCH-M) lock. In both cases, the waits are caused because SQL Server has already granted an incompatible lock to the resource on another thread. This information is useful for identifying long-term trends, but does not show you details about the locked resources.

wait          wt_cnt  wt_ms    max_wt_ms  signal_ms
------------- ------- -------- ---------- ----------
LCK_M_X       6       1170670  712261     114
LCK_M_S       28      19398    2034       43
LCK_M_SCH_M   449     92       28         46
LCK_M_SCH_S   1       72       72         0

You can reset the cumulative values in the sys.dm_os_wait_stats DMV by executing the following statement: DBCC SQLPERF (N’sys.dm_os_wait_stats’, CLEAR);. Otherwise, these values are reset each time that the SQL Server service restarts.

Identify lock escalation behaviors

Lock escalation occurs when SQL Server detects too much memory, or too many system resources are required for a query’s locks. It then converts one set of locks to another set of locks applied to resources higher in the lock hierarchy. In other words, SQL Server tries to use fewer locks to cover more resources. As an example, SQL Server might choose to escalate a high number of row locks to a table lock. This capability can reduce overhead on the one hand, but can impact performance on the other hand because more data is locked. As a result, there is greater potential for blocking.

Lock escalation occurs when more than 40 percent of the available database engine memory pool is required by lock resources, or at least 5,000 locks are taken in a single T-SQL statement for a single resource. SQL Server converts an intent lock to a full lock, as long as the full lock is compatible with existing locks on the resource. It then releases system resources and locks on the lower level of the lock hierarchy. If the new lock is taken on a row or a page, SQL Server adds an intent lock on the object at the next higher level. However, if other locks prevent lock escalation, SQL Server continues attempting to perform the escalation for each new 1,250 locks it takes.

In most cases, you should let SQL Server manage the locks. If you implement a monitoring system, take note of Lock:Escalation events to establish a benchmark. When the number of Lock:Escalation events exceeds the benchmark, you can take action at the table level or at the query level.

Another option for monitoring lock escalation is to benchmark the percentage of time that intent lock waits (LCK_M_I*) occur relative to regular locks in the sys.dm_os_wait_stats DMV by using a query like this:

SELECT
    wait_type as wait,
    wait_time_ms as wt_ms,
    CONVERT(decimal(9,2), 100.0 * wait_time_ms /
    SUM(wait_time_ms) OVER ()) as wait_pct
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'LCK%'
ORDER BY wait_time_ms DESC;

Capture and analyze deadlock graphs

Usually the process of locking and unlocking SQL Server is fast enough to allow many users to read and write data with the appearance that it occurs simultaneously. However, sometimes two sessions block each other and neither can complete, which is a situation known as deadlocking. Normally, the database engine terminates a thread of a deadlocked transaction with error 1205 and suggests a remedy, such as running the transaction again.

Let’s deliberately create a deadlock between two transactions. Start two sessions and add the following statements to the first session:

BEGIN TRANSACTION;
    UPDATE Examples.LockingA
        SET ColumnText = 'Row 1 Updated'
        WHERE RowId = 1;
    WAITFOR DELAY '00:00:05';
    UPDATE Examples.LockingB;
    SET ColumnText = 'Row 1 Updated Again'
    WHERE RowId = 1;

Next, in the second session, add the following statements:

BEGIN TRANSACTION;
    UPDATE Examples.LockingB
        SET ColumnText = 'Row 1 Updated'
        WHERE RowId = 1;
    WAITFOR DELAY '00:00:05';
    UPDATE Examples.LockingA;
    SET ColumnText = 'Row 1 Updated Again'
    WHERE RowId = 1;

Now execute the statements in the first session, and then, within five seconds, execute the second session’s statements. Only one of the transaction completes and the other was terminated with a rollback by SQL Server as shown by the following message:

Msg 1205, Level 13, State 51, Line 6
Transaction (Process ID 70) was deadlocked on lock resources with another process and
has been chosen as the deadlock victim. Rerun the transaction.

In this example, both transactions need the same table resources. Both transactions can successfully update a row without conflict and have an exclusive lock on the updated data. Then they each try to update data in the table that the other transaction had updated, but each transaction is blocked while waiting for the other transaction’s exclusive lock to be released. Neither transaction can ever complete and release its lock, thereby causing a deadlock. When SQL Server recognizes this condition, it terminates one of the transactions and rolls it back. It usually chooses the transaction that is least expensive to rollback based on the number of transaction log records. At that point, the aborted transaction’s locks are released and the remaining open transaction can continue.

Of course, deadlocks are not typically going to happen while you watch, so how can you know when and why they occur? You can use either SQL Server Profiler or Extended Events to capture a deadlock graph, an XML description of a deadlock.

SQL Server Profiler deadlock graph

If you use SQL Server Profiler to capture a deadlock graph, you must configure the trace before deadlocks occur. Start by creating a new trace, and connect to your SQL Server instance. In the Trace Properties dialog box, select the Events Selection tab, select the Show All Events check box, expand Locks, and then select the following events:

  • Deadlock graph

  • Lock:Deadlock

  • Lock:Deadlock Chain

On the Events Extraction Settings tab, select the Save Deadlock XML Events Separately option, navigate to a directory into which SQL Server Profiler saves deadlock graphs, and supply a name for the graph. You can choose whether to save all deadlock graphs in a single .xdl file or save multiple deadlock graphs as a separate .xdl file.

Now set up the deadlock scenario again to generate the deadlock graph. In one session, add the following statements:

BEGIN TRANSACTION;
    UPDATE Examples.LockingA
        SET ColumnText = 'Row 2 Updated'
        WHERE RowId = 2;
    WAITFOR DELAY '00:00:05';
    UPDATE Examples.LockingB
    SET ColumnText = 'Row 2 Updated Again'
    WHERE RowId = 2;

Next, in the second session, add the following statements:

BEGIN TRANSACTION;
    UPDATE Examples.LockingB
        SET ColumnText = 'Row 2 Updated'
        WHERE RowId = 2;
    WAITFOR DELAY '00:00:05';
    UPDATE Examples.LockingA
    SET ColumnText = 'Row 2 Updated Again'
    WHERE RowId = 2;

When a deadlock occurs, you can see the deadlock graph as an event in SQL Server Profiler, as shown in Figure 3-1. In the deadlock graph, you see the tables and queries involved in the deadlock, which process was terminated, and which locks led to the deadlock. The ovals at each end of the deadlock graph contain information about the processes running the deadlocked queries. The terminated process displays in the graph with an x superimposed on it. Hover your mouse over the process to view the statement associated with it. The rectangles labeled Key Lock identify the database object and index associated with the locking. Lines in the deadlock graph show the relationship between processes and database objects. A request relationship displays when a process waits for a resource while an owner relationship displays when a resource waits for a process.

FIGURE 3-1

FIGURE 3-1 A deadlock graph

Extended Events deadlock graph

In Extended Events, you can use the continuously running system_health session to discover past deadlocks. As an alternative, you can set up a new session dedicated to capturing deadlock information. The system_health session automatically captures detected deadlocks without requiring special configuration. That means you can analyze a deadlock after it has occurred.

To find deadlock information in the Extended Events viewer, open SQL Server Management Studio, connect to the database engine, expand the Management node in Object Explorer, expand the Extended Events node, expand the Sessions node, and then expand the System_health node. Right-click Package0.event_file, and select View Target Data. In the Extended Events toolbar, click the Filters button. In the Filters dialog box, select Name in the Field drop-down list, type xml_deadlock_report in the Value text box, as shown in Figure 3-2, and then click OK. Select Xml_deadlock_report in the filtered list of events, and then click the Deadlock tab below it to view the deadlock graph.

FIGURE 3-2

FIGURE 3-2 An Extended Events filter for xml_deadlock_report

Identify ways to remediate deadlocks

Deadlocks are less likely to occur if transactions can release resources as quickly as possible. You can also lock up additional resources to avoid contention between multiple transactions. For example, you can use a hint to lock a table although this action can also cause blocking.

Usually the best way to resolve a deadlock is to rerun the transaction. For this reason, you should enclose a transaction in a TRY/CATCH block and add retry logic. Let’s revise the previous example to prevent the deadlock. Start two new sessions and add the statements in Listing 3-9 to both sessions.

LISTING 3-9 Add retry logic to avoid deadlock

DECLARE @Tries tinyint
SET @Tries = 1
WHILE @Tries <= 3
BEGIN

    BEGIN TRANSACTION
    BEGIN TRY
        UPDATE Examples.LockingB
            SET ColumnText = 'Row 3 Updated'
            WHERE RowId = 3;
        WAITFOR DELAY '00:00:05';
        UPDATE Examples.LockingA
        SET ColumnText = 'Row 3 Updated Again'
            WHERE RowId = 3;
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        SELECT ERROR_NUMBER() AS ErrorNumber;
        ROLLBACK TRANSACTION;
        SET @Tries = @Tries + 1;
        CONTINUE;
    END CATCH
END

Next, execute each session. This time the deadlock occurs again, but the CATCH block captured the deadlock. SQL Server does not automatically roll back the transaction when you use this method, so you should include a ROLLBACK TRANSACTION in the CATCH block. The @@TRANCOUNT variable resets to zero in both transactions. As a result, SQL Server no longer cancels one of the transactions and you can also see the error number generated for the deadlock victim:

ErrorNumber
-------------
1205

Re-execution of the transaction might not be possible if the cause of the deadlock is still locking resources. To handle those situations, you could need to consider the following methods as alternatives for resolving deadlocks.

  • Use SNAPSHOT or READ_COMMITTED_SNAPSHOT isolation levels. Either of these options avoid most blocking problems without the risk of dirty reads. However, both of these options require plenty of space in tempdb.

  • Use the NOLOCK query hint if one of the transactions is a SELECT statement, but only use this method if the trade-off of a deadlock for dirty reads is acceptable.

  • Add a new covering nonclustered index to provide another way for SQL Server to read data without requiring access to the underlying table. This approach works only if the other transaction participating in the deadlock does not use any of the covering index keys. The trade-off is the additional overhead required to maintain the index.

  • Proactively prevent a transaction from locking a resource that eventually gets locked by another transaction by using the HOLDLOCK or UPDLOCK query hints.