High data availability is crucial for every database administrator. Minimizing and mitigating database unavailability issues is essential to meet the RPO (Maximum allowed downtime) and RTO (Maximum acceptable data loss) goals of a production database service. In SQL Server, one way to achieve high availability is through the use of SQL Server Always On Availability Groups.
SQL Server Always On Availability Groups is a feature that allows for the creation of a logical group of user databases that failover together on secondary replicas. There are two types of data commit modes in SQL Server Always On Availability Groups: synchronous-commit mode and asynchronous-commit mode.
Synchronous-commit mode
In synchronous-commit mode, the primary replica waits for the transaction commit from a secondary replica before confirming to the client. Here is an overview of the data synchronization process in synchronous-commit mode:
- A client connects to the primary replica and issues a DML transaction.
- The primary replica generates a transaction log block and stores it in the log cache.
- The primary replica performs an automatic checkpoint, flushing the log data into the disk and copying the log records to the log pool.
- A thread called Log Capture reads the log pool data and sends it to each secondary replica.
- On each secondary replica, the Log Receive process writes the log records to the log cache.
- A Redo thread on each secondary replica writes the changes mentioned in the log records to the data and index pages.
- Once the secondary replica completes the transaction hardening process, it sends an acknowledgement to the primary replica.
- The primary replica then sends the transaction completion message to the client.
Asynchronous-commit mode
In asynchronous-commit mode, the primary replica does not wait for the acknowledgement from participating secondary replicas. Here is an overview of the data synchronization process in asynchronous-commit mode:
- A client connects to the primary replica and issues a DML transaction.
- The primary replica generates a transaction log block and stores it in the log cache.
- The primary replica performs an automatic checkpoint, flushing the log data into the disk and copying the log records to the log pool.
- The primary replica sends an acknowledgement to the client for transaction commit.
- On each secondary replica, the Log Receive process writes the log records to the log cache.
- A Redo thread on each secondary replica writes the changes mentioned in the log records to the data and index pages.
Synchronized Always On with Secondary replica down
In the event that a secondary replica is down in a synchronous-commit mode, the primary replica waits for a ping response from the secondary replica. If no response is received within a specified timeout period, the primary replica temporarily switches to asynchronous-commit mode. This ensures that users can continue executing DML without any issues, and they receive commit acknowledgements as soon as the transactions are committed on the primary replica. Once the secondary replica becomes available again, the primary replica switches back to synchronous-commit mode and resumes the usual data synchronization process.
If the secondary replica is removed from the availability group, SQL Server does not hold any transaction log records for that replica. It is recommended to remove the replica from the availability group if it takes longer to fix the issues with the secondary replica.
Conclusion
In this article, we explored the internals of data synchronization for SQL Server Always On Availability Groups. Understanding the end-to-end data flow is crucial for ensuring high availability and minimizing downtime. In upcoming articles, we will cover more on SQL Server Always On Availability Groups.