The volume of data retained, managed, and accessed today is unprecedented. Businesses expect the IT department to keep data online and accessible indefinitely, putting intense pressure on the databases required to store and manage it. To meet today’s needs, we need to replace outdated and inefficient legacy processes with new, more agile techniques. SQL Server Replication is one of the techniques to accommodate such demands.
SQL Server replication is a technology for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency and integrity of the data. It allows for continuous copying and synchronization of data or can be scheduled to run at predetermined intervals. There are several different replication techniques that support a variety of data synchronization approaches, such as one-way, one-to-many, many-to-one, and bi-directional, to keep several datasets in sync with each other.
Transactional SQL Server Replication Components
The components of transactional SQL Server replication include:
- Publisher: The database that contains the objects designated as replication articles.
- Publication database: A logical collection of articles from a database.
- Articles: The basic unit of SQL Server Replication, consisting of tables, stored procedures, and views.
- Distributor: The database that acts as a storehouse for replication-specific data associated with one or more Publishers.
- Distribution database: The database that identifies and stores SQL Server replication status data, metadata about the publication, and acts as a queue for data moving from the Publisher to the Subscribers.
- Subscriber: The database instance that consumes SQL Server replication data from a publication.
- Subscription database: The target database of a replication model.
- Replication agents: Standalone programs and events used to carry out the tasks associated with data replication.
Articles in SQL Server Replication
An article is the basic unit of SQL Server Replication and can consist of tables, stored procedures, and views. Articles can be scaled horizontally and vertically using a filter option. Multiple articles can be created on the same object with some restrictions and limitations. The properties of an article can be set during the time of publication creation and can be viewed using the New Publication wizard. If a property requires a change after the publication is created, a new replication snapshot needs to be generated and all subscriptions reinitialized.
To list all the articles that are published, you can run the following T-SQL:
SELECT Pub.[publication] AS [PublicationName], Art.[publisher_db] AS [DatabaseName], Art.[article] AS [Article Name], Art.[source_owner] AS [Schema], Art.[source_object] AS [Object] FROM [distribution].[dbo].[MSarticles] Art INNER JOIN [distribution].[dbo].[MSpublications] Pub ON Art.[publication_id] = Pub.[publication_id] ORDER BY Pub.[publication], Art.[article]
To get detailed information about an article in the listed publisher, you can run the following T-SQL:
DECLARE @publication AS sysname; SET @publication = N'PROD_HIST_Pub'; USE MES_PROD_AP; EXEC sp_helparticle @publication = @publication;
Publications and Publishers
A publication is a logical collection of articles from a database. It allows for defining and configuring article properties at a higher level so that the properties are inherited by all the articles in that group. The publisher is a database that contains a list of objects designated as SQL Server replication articles, known as the publication database. The publisher can have one or more publications, each defining a logically related set of objects and data to replicate.
Distributor and Distribution Databases
The distributor is a database that acts as a storehouse for replication-specific data associated with one or more publishers. It can be a single database acting as both the publisher and the distributor, known as a “local distributor,” or it can be configured on a separate server, known as a “remote distributor.” The distribution database, associated with each publisher, identifies and stores SQL Server replication status data, metadata about the publication, and acts as a queue for data moving from the publisher to the subscribers.
Subscribers and Subscriptions
A subscriber is a database instance that consumes SQL Server replication data from a publication. It can receive data from one or more publishers and publications. Subscriptions define what publication data will be received, where, and when. There are two types of subscriptions: push subscriptions, where the distributor directly updates the data in the subscriber database, and pull subscriptions, where the subscriber checks regularly at the distributor for any new changes and updates the data in the subscription database itself.
Replication Agents
SQL Server replication uses a pre-defined set of standalone programs and events known as agents to carry out the tasks associated with data replication. The replication snapshot agent provides the required data set to perform the initial data synchronization of the publication database with the subscription database. The log reader agent moves replication transactions from the online transaction log of the publication database to the distribution database. The distribution agent applies the initial replication snapshot to the subscription database and tracks and records data changes in the distribution database. The merge agent is used with the merge replication model and handles bi-directional synchronization of data between the publisher and subscriber.
In summary, SQL Server replication is a powerful technology for copying and distributing data and database objects between databases. Understanding the components and topography of SQL Server replication is essential for effectively implementing and managing replication solutions.
If you have any questions or suggestions for improvement, please leave a comment below.