For businesses relying on SQL Server, database performance and availability are critical factors. A database crash or a lengthy transaction can result in significant downtime and have a negative impact on business operations. However, Microsoft SQL Server has introduced a powerful new feature called Accelerated Database Recovery (ADR) that greatly enhances database availability and performance.
In this blog post, we will explore how ADR works and the benefits it brings to SQL Server users. Let’s first take a look at the traditional SQL Server database recovery process before diving into ADR.
The SQL Server engine follows a three-phase process to restore a database to a stable state after an unexpected restart or crash:
- Analysis: The SQL Server engine reads all transactions in the log file.
- Redo: Completed transactions are redone.
- Undo: Incomplete transactions are undone.
Under this traditional process, the time it takes for the database engine to recover from an unexpected restart is almost proportional to the size of the longest active transaction in the log file at the time of the crash. Recovery also requires a rollback of all incomplete transactions, which can take a long time in the presence of long-running transactions. Additionally, canceling or rolling back a large transaction can be time-consuming.
Furthermore, when there are lengthy transactions, the database engine cannot truncate the transaction log, resulting in significant log file expansion and drive space consumption.
Now, let’s explore Accelerated Database Recovery. ADR, introduced in SQL Server 2019 (15.x) and improved in SQL Server 2022 (16.x), adds new mechanisms to SQL Server’s handling of transaction logs, speeding up the process of recovering a database from a crash or lengthy transaction.
In ADR, the three recovery steps are divided into sub-phases:
- Analysis phase:
- Read all transactions from the log file.
- Reconstruct the SLOG (system log stream) and copy log records for non-versioned operations.
- Redo phase:
- Redo from the SLOG (oldest uncommitted transaction up to the last checkpoint). Redo is a fast operation as it only needs to process a few records from the SLOG.
- Redo from the transaction log starts from the last checkpoint (instead of the oldest uncommitted transaction).
- Undo phase: Use the SLOG to undo non-versioned operations and the persisted version store (PVS) with logical revert to perform row-level version-based undo.
By versioning all physical database modifications and only undoing logical operations, which are sparse and can be reversed almost immediately, ADR achieves quick database recovery. Any transactions that were ongoing at the time of a crash are marked as aborted, and concurrent user queries may ignore any versions produced by these transactions.
ADR achieves quick recovery with the help of four key components:
- Persisted version store (PVS): Instead of using the conventional tempdb version store, the database engine uses the persisted version store to store the row versions created in the database itself. The PVS makes readable secondaries more readily available and enables resource isolation.
- Logical Revert: Row-level version-based undo is carried out by the asynchronous process known as logical revert, which also offers instantaneous transaction rollback and undo for all versioned operations. Logical revert is achieved by ADR by maintaining a log of all failed transactions, using PVS to roll back all user transactions immediately after a transaction abort, and releasing all locks.
- SLOG: The log records for non-versioned operations are stored in the SLOG, a secondary in-memory log stream. The SLOG accelerates redo and undo by processing only non-versioned operations. It is low volume and in memory, serialized during the checkpoint process to persist data on disk, and periodically truncated as transactions commit.
- Cleaner: The cleaner is an asynchronous process that periodically cleans page versions that are not needed.
ADR offers several crucial advantages that significantly improve database availability and performance:
- Faster Recovery Times: With ADR, SQL Server can quickly recover databases following a crash or a lengthy transaction. It can roll back only the changes made by the problematic transaction rather than rolling back the entire database by storing multiple versions of the same data in the transaction log.
- Improved Availability: ADR reduces the impact of long-running transactions or system failures on the accessibility of critical applications. Applications can resume operation more quickly after a crash, reducing downtime and achieving faster recovery times.
- Reduced Database Downtime: ADR speeds up the process of recovering a database following a crash or a protracted transaction. This, in turn, lessens the overall impact on business operations as critical applications can resume operation more quickly.
- Better Performance: ADR introduces a new mechanism for log truncation, enabling more frequent pruning of transaction logs. This results in a smaller transaction log overall and better database performance.
In conclusion, SQL Server’s Accelerated Database Recovery is a powerful feature that greatly enhances database availability and performance. ADR streamlines and expedites database recovery by introducing new methods for handling transaction logs, leading to shorter recovery times, greater availability, less downtime, and improved performance.