In-Memory OLTP Programmability: Concurrency and Transaction Isolation for Memory-optimized Tables
The new In-Memory OLTP feature introduces memory-optimized tables and natively compiled stored procedures in SQL Server 2014, to optimize performance and scalability of OLTP workloads. To benefit from this technology in an existing SQL Server application, you need to migrate the performance critical tables and stored procedures to memory-optimized tables and natively compiled stored procedures respectively. End-to-end migration of tables and procedures was discussed in the blog post here. In this post we focus on the implications of the new transaction isolation model to application migration.
Transaction isolation levels for memory-optimized tables are implemented using optimistic concurrency control and therefore are lock-free. This means that SQL Server does not use locks to guarantee consistency in case multiple transactions attempt to access the same data. Instead, SQL uses a combination of snapshot-based isolation and conflict detection to guarantee consistency. For more details on transactions with In-Memory OLTP see the Books Online topic here. We will discuss the principles and the architecture of the lock-free engine for memory-optimized tables in more detail in a later post.
Snapshot-based isolation
All transactions read rows from memory-optimized tables as of a single point in time, usually the start of the transaction. This means that a running transaction does not observe changes made by other, concurrent, transactions that commit while the transaction is running.
Consider, for example, two transaction, tx1 and tx2, and a single memory-optimized table Tab, with a single row r1. The following table shows an example in which the transaction tx1 reads from the table, and tx2 inserts a row. The first column indicates the time; the second column indicates the contents of the table Tab.
Time |
Contents of Tab |
Operations in tx1 |
Operations in tx2 |
100 |
r1 |
BEGIN TRAN |
|
101 |
r1 |
… |
BEGIN TRAN |
102 |
r1 |
SELECT FROM Tab * returns (r1) |
INSERT Tab VALUES (r2) |
103 |
r1, r2 |
… |
COMMIT |
104 |
r1, r2 |
… |
|
105 |
r1, r2 |
SELECT FROM Tab * returns (r1) |
|
106 |
r1, r2 |
COMMIT |
Notice that at time 105, even though the table Tab contains two rows (r1, r2), the transaction tx1 only sees r1. All read operations in tx1 are executed as of time 100.
Snapshots in transaction isolation are not new in SQL Server. SNAPSHOT isolation has been available for disk-based tables since SQL Server 2005. However, the default isolation level for disk-based table remains read committed, and higher isolation levels such as REPEATABLE READ and SERIALIZABLE do not use snapshots. Memory-optimized tables use snapshots for all transaction isolation levels, including SNAPSHOT, REPEATABLE READ, and SERIALIZABLE.
You need to keep this in mind when migrating a table to memory-optimized, in particular when using READ COMMITTED isolation today. If there are no dependencies between concurrent transactions at the application level, in most cases you can use SNAPSHOT isolation on the migrated tables without changing the application. However, if there is a dependency, and the application relies on tx1 seeing the changes made by tx2, you need to make further changes to the app to handle snapshot-based isolation. For example, you could commit tx1 after the initial read operation at time 102, and start a new transaction, which would be able to see rows inserted by tx2.
For guidelines on transaction isolation levels with memory-optimized tables, and how to migrate apps that use the READ COMMITTED isolation level today, see Books Online.
Conflicts
Rather than taking locks to prevent concurrent access to a table, memory-optimized tables use conflict detection to enforce isolation of modifications, and to guarantee the higher isolation levels REPEATABLE READ and SERIALIZABLE.
If two concurrent transactions attempt to update the same row, one of the transactions will fail and roll back. Consider the following example, where two transaction attempt to update the same row r1.
Time |
Operations in tx1 |
Operations in tx2 |
100 |
BEGIN TRAN |
|
101 |
… |
BEGIN TRAN |
102 |
… |
UPDATE r1 – success |
103 |
UPDATE r1 – error – tx1 is aborted |
… |
104 |
COMMIT – success |
In this example, tx2 successfully updates r1 and later successfully commits. Transaction tx1 attempts to update r1 after it has already been updated by tx2. The update fails, and transaction tx1 is aborted. The first writer, in this case tx2, always wins. Transaction tx1 will need to be retried. In contrast, with READ COMMITTED isolation in disk-based tables, tx2 would take a lock when it updates r1. Transaction tx1 would block and wait until tx2 commits and the lock is released. At that point, i.e. after time 104, tx1 would proceed with the update to r1.
Validation of REPEATABLE READ and SERIALIZABLE isolation is done during commit processing. If SQL Server finds that the desired isolation level has been violated, the transaction is aborted at this time.
As conflicts cause transaction abort, transactions may need to be retried. For this reason, transactions that modify memory-optimized tables require logic to retry the transactions on failures. This retry logic can be implemented either in the client application, or through a wrapper stored procedure on the server. For more details on retry logic and a sample stored procedure implementing retry logic, see Books Online.
Download SQL Server CTP1and get started today, or see more blogs in the series introduction and index here!