Published on

July 14, 2021

Merging Multiple Files in SQL Server

Introduction:

When working with data pipelines in Azure Data Factory (ADF), there are often situations where we need to combine records from multiple source files and consolidate them into a single output file. In this article, we will explore the steps to merge multiple files using SQL Server.

Solution:

For this demonstration, we will be using three files stored in a data lake storage folder: File1.csv, File2.csv, and File3.csv. Each file has a similar structure, with columns such as SALESORDERNUMBER, ISEXPORTSALE, CREATEDDATE, ISSELECTED, SALESORDERSTATUS, CURRENCYCODE, DELIVERYADDRESSCITY, DELIVERYADDRESSCOUNTRYREGIONID, and DELIVERYADDRESSSTATEID.

Let’s take a look at the content of these three source files:

File1.csv:

SALESORDERNUMBER,ISEXPORTSALE,CREATEDDATE,ISSELECTED,SALESORDERSTATUS,CURRENCYCODE,DELIVERYADDRESSCITY,DELIVERYADDRESSCOUNTRYREGIONID,DELIVERYADDRESSSTATEID
1,true,2021-01-01,false,Completed,USD,New York,US,NY
2,false,2021-01-02,true,In Progress,EUR,Paris,FR,75
3,true,2021-01-03,false,Completed,GBP,London,GB,ENG
4,false,2021-01-04,true,In Progress,USD,Los Angeles,US,CA
5,true,2021-01-05,false,Completed,EUR,Berlin,DE,BE

File2.csv:

SALESORDERNUMBER,ISEXPORTSALE,CREATEDDATE,ISSELECTED,SALESORDERSTATUS,CURRENCYCODE,DELIVERYADDRESSCITY,DELIVERYADDRESSCOUNTRYREGIONID,DELIVERYADDRESSSTATEID
20000,true,2021-01-01,false,Completed,USD,New York,US,NY
20001,false,2021-01-02,true,In Progress,EUR,Paris,FR,75
20002,true,2021-01-03,false,Completed,GBP,London,GB,ENG
20003,false,2021-01-04,true,In Progress,USD,Los Angeles,US,CA
20004,true,2021-01-05,false,Completed,EUR,Berlin,DE,BE

File3.csv:

SALESORDERNUMBER,ISEXPORTSALE,CREATEDDATE,ISSELECTED,SALESORDERSTATUS,CURRENCYCODE,DELIVERYADDRESSCITY,DELIVERYADDRESSCOUNTRYREGIONID,DELIVERYADDRESSSTATEID
1000,true,2021-01-01,false,Completed,USD,New York,US,NY
1001,false,2021-01-02,true,In Progress,EUR,Paris,FR,75
1002,true,2021-01-03,false,Completed,GBP,London,GB,ENG
1003,false,2021-01-04,true,In Progress,USD,Los Angeles,US,CA
1004,true,2021-01-05,false,Completed,EUR,Berlin,DE,BE

SQL Server provides various methods to merge these files into a single output file. One approach is to use the BULK INSERT statement to load the data from each file into a temporary table, and then insert the records from the temporary table into the final output table.

Here is an example of how this can be achieved:

-- Create a temporary table to hold the merged data
CREATE TABLE #MergedData (
    SALESORDERNUMBER INT,
    ISEXPORTSALE BIT,
    CREATEDDATE DATE,
    ISSELECTED BIT,
    SALESORDERSTATUS VARCHAR(50),
    CURRENCYCODE VARCHAR(3),
    DELIVERYADDRESSCITY VARCHAR(50),
    DELIVERYADDRESSCOUNTRYREGIONID VARCHAR(2),
    DELIVERYADDRESSSTATEID VARCHAR(2)
)

-- Load data from File1.csv into the temporary table
BULK INSERT #MergedData
FROM 'C:\Path\To\File1.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2 -- Skip the header row
)

-- Load data from File2.csv into the temporary table
BULK INSERT #MergedData
FROM 'C:\Path\To\File2.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2 -- Skip the header row
)

-- Load data from File3.csv into the temporary table
BULK INSERT #MergedData
FROM 'C:\Path\To\File3.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2 -- Skip the header row
)

-- Insert the merged data into the final output table
INSERT INTO OutputTable
SELECT *
FROM #MergedData

-- Clean up the temporary table
DROP TABLE #MergedData

Conclusion:

In this article, we have discussed the steps to merge multiple files into a single file using SQL Server. By leveraging the BULK INSERT statement and temporary tables, we can efficiently consolidate data from different sources into a unified output. It is important to ensure that all files have a similar structure for the merge process to work properly.

Click to rate this post!
[Total: 0 Average: 0]

Let's work together

Send us a message or book free introductory meeting with us using button below.