Welcome to Part 2 of our series on uploading data from SQL Server to Azure Blob Storage. In Part 1, we discussed how to copy a full SQL database table into Azure Blob Storage as a CSV file. In this article, we will focus on uploading incremental data changes to Azure after the initial data load.
When working with large databases, it is often better to upload only the changed rows instead of the entire dataset. This saves time and resources, especially when the data is updated frequently. The general steps for uploading incremental changes from a table are as follows:
- Enable Change Tracking on the database and table
- Enable Snapshot Isolation on the database (optional)
- Create a table to record Change Tracking versions
- Create a stored procedure to update the Change Tracking table
- Create Linked Services
- Create Datasets
- Make changes in SQL tables
- Create Pipelines
- Trigger Pipeline
Change tracking is a SQL Server feature that records which rows in a table have changed. It maintains a version counter that increases with each change, so you can query for everything that has changed since a given version. With change tracking enabled on the source database, Azure Data Factory (ADF) can retrieve just the changed rows and copy them to the destination, which in our case is a CSV file in Azure Blob Storage.
To enable change tracking, you can run the following SQL statements:
-- Enable change tracking at the database level with a 2-day retention period
ALTER DATABASE AdventureWorks2016
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

-- Enable change tracking on the table to be loaded incrementally
ALTER TABLE HumanResources.Department
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);
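To confirm that change tracking is active, you can query the change tracking catalog views. This quick check is optional and not required for the load itself:
-- Confirm change tracking is enabled at the database level
SELECT DB_NAME(database_id) AS database_name,
       retention_period,
       retention_period_units_desc
FROM sys.change_tracking_databases;

-- Confirm change tracking is enabled on the table
SELECT OBJECT_NAME(object_id) AS table_name,
       is_track_columns_updated_on
FROM sys.change_tracking_tables;

-- The current change tracking version for the database
SELECT CHANGE_TRACKING_CURRENT_VERSION() AS current_version;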
Enabling snapshot isolation on your database is optional but advised. If changes are still being made while the pipeline reads them, the version checks and the change query could see different points in time and return inconsistent results; wrapping them in a single snapshot transaction guarantees a consistent view. If you decide to enable snapshot isolation, you can use the following SQL statements:
ALTER DATABASE AdventureWorks2016
SET ALLOW_SNAPSHOT_ISOLATION ON;
-- Example of using snapshot isolation
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN
-- Verify that the version of the previous synchronization is valid.
-- Obtain the version to use next time.
-- Obtain changes.
COMMIT TRAN
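The comments above outline the synchronization pattern. A minimal sketch of what those steps look like in T-SQL, with an illustrative hard-coded last version, might be:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;

DECLARE @last_sync_version BIGINT = 0;   -- version saved after the previous load (illustrative)
DECLARE @next_sync_version BIGINT;

-- Verify that the previous version is still within the retention window.
IF @last_sync_version < CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('HumanResources.Department'))
    RAISERROR('Last sync version is too old; a full reload is required.', 16, 1);

-- Obtain the version to use next time.
SET @next_sync_version = CHANGE_TRACKING_CURRENT_VERSION();

-- Obtain the changes made since the last synchronization.
SELECT *
FROM CHANGETABLE(CHANGES HumanResources.Department, @last_sync_version) AS ct;

COMMIT TRAN;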
Next, you need to create a table to record the change tracking versions. This table will store the version number and the table name. Here is an example of how to create the table:
USE AdventureWorks2016;
GO
CREATE TABLE ChangeTrackingVersion
(
tablename VARCHAR(255),
sys_change_version BIGINT
);
-- Seed the table with the current change tracking version for the source table
DECLARE @sync_version BIGINT;
SET @sync_version = CHANGE_TRACKING_CURRENT_VERSION();
INSERT INTO ChangeTrackingVersion (tablename, sys_change_version)
VALUES ('[HumanResources].[Department]', @sync_version);
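A quick SELECT confirms that the seed row recorded the current version:
SELECT tablename, sys_change_version
FROM ChangeTrackingVersion;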
After creating the table, you can create a stored procedure to update the change tracking version. This stored procedure will be invoked by the ADF pipeline to update the version number for the next pipeline run. Here is an example of how to create the stored procedure:
USE [AdventureWorks2016];
GO
CREATE PROCEDURE [dbo].[Update_ChangeTrackingVersion]
@CurrentChangeTrackingVersion BIGINT,
@TableName VARCHAR(100)
AS
BEGIN
UPDATE ChangeTrackingVersion
SET sys_change_version = @CurrentChangeTrackingVersion
WHERE tablename = @TableName;
END
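The pipeline will invoke this procedure through a Stored Procedure activity, but you can also test it manually; the values below are illustrative. Note that doing so advances the stored version, so the next pipeline run will only pick up changes made after this point:
-- Manually record the current version for the tracked table (test only)
DECLARE @new_version BIGINT = CHANGE_TRACKING_CURRENT_VERSION();

EXEC [dbo].[Update_ChangeTrackingVersion]
    @CurrentChangeTrackingVersion = @new_version,
    @TableName = '[HumanResources].[Department]';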
Now that we have the necessary SQL Server configurations in place, we can move on to creating the Linked Services and Datasets in Azure Data Factory. These components establish the communication links between our SQL Server, ADF, and Azure Blob Storage.
In Part 1 of this series, we already created Linked Services for our SQL Server and Azure Blob Storage, and we can reuse them for the incremental load. As a quick recap, the steps to create the linked services are:
- Sign in to your Azure account and open Azure Data Factory.
- Create a Source Linked Service for your SQL Server by providing the necessary details such as server name, database name, and authentication method.
- Create a Sink Linked Service for your Azure Blob Storage by providing the necessary details such as the storage account name and authentication method.
With the Linked Services in place, we can now create the Datasets. In this step, we will create three datasets: one for the source table in SQL Server, one for the destination data (the CSV file in Azure Blob Storage), and one for the ChangeTrackingVersion table. These datasets define the structure and location of the data.
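Before building the pipeline, make a few changes in the source table so the incremental load has something to pick up. The statements below are only sample changes; any insert, update, or delete against the tracked table will do:
-- Sample changes to generate change tracking data (illustrative values)
INSERT INTO HumanResources.Department (Name, GroupName)
VALUES ('Cloud Engineering', 'Research and Development');

UPDATE HumanResources.Department
SET ModifiedDate = GETDATE()
WHERE Name = 'Engineering';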
With the Datasets created and some test changes in the source table, we can proceed to create the Pipeline. A Pipeline is a workflow that orchestrates the data movement and transformation activities. In our case, the Pipeline will look up the old and new change tracking versions, copy the changed rows from SQL Server to Azure Blob Storage, and then call the stored procedure to record the new version for the next run.
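At the heart of the pipeline is the source query for the copy activity, which joins the change tracking data with the source table between the old and new versions. The sketch below shows the idea using local variables; in ADF, the two version values would be supplied by the lookup activities instead:
DECLARE @old_version BIGINT, @new_version BIGINT;

-- Old version: the value stored after the previous load
SELECT @old_version = sys_change_version
FROM ChangeTrackingVersion
WHERE tablename = '[HumanResources].[Department]';

-- New version: the current change tracking version
SET @new_version = CHANGE_TRACKING_CURRENT_VERSION();

-- Rows inserted, updated, or deleted since the last load
SELECT d.DepartmentID, d.Name, d.GroupName, d.ModifiedDate,
       ct.SYS_CHANGE_VERSION, ct.SYS_CHANGE_OPERATION
FROM CHANGETABLE(CHANGES HumanResources.Department, @old_version) AS ct
LEFT JOIN HumanResources.Department AS d
    ON d.DepartmentID = ct.DepartmentID
WHERE ct.SYS_CHANGE_VERSION <= @new_version;
Deleted rows appear with NULL data columns and SYS_CHANGE_OPERATION = 'D', so the downstream process can decide how to handle them in the CSV.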
Finally, we can trigger the Pipeline. You can run it on demand for testing, or attach a schedule or event-based trigger so it runs automatically at regular intervals.
By following these steps, you can efficiently upload incremental data changes from SQL Server to Azure Blob Storage. This approach saves time and resources by only transferring the necessary data.
Thank you for reading Part 2 of our series on uploading data to Azure Blob Storage with SQL Server. Stay tuned for more articles on SQL Server concepts and ideas!
For any further questions or assistance, please feel free to contact us at SQLRx@sqlrx.com. We are here to help!
Visit us at www.sqlrx.com for more information and resources.