Problem: The Raw File source in SQL Server Integration Services (SSIS) can read data quicker than other file sources because there is no need to translate or parse the data when processing the file. This makes it faster to use raw files as data sources. In this article, we will explore how to create and use raw files in SSIS.
Why use raw files in SSIS?
There are several use cases for raw files in SSIS:
- Transfer datasets between SSIS packages faster
- Avoid the overhead of staging tables during SSIS development
- Create an inexpensive data lake
- Debug and log actions
- Provide an efficient intermediate format for serializing temporary results
- Reuse the same result set (raw file) multiple times in different data flows or packages
SSIS Raw File Source
In the Data Flow of SSIS, the raw file source is used to read raw files. It can only read the data from the file. Raw files require little parsing and no translation, making them faster to read. The raw file source is used to read data that was previously written using the Raw File Destination. The raw file format also contains sort information which is used during processing. The Raw File Source can be configured to ignore the sort flags in the file.
SSIS Raw File Destination
In the Data Flow of SSIS, the raw file destination is used to write raw files. The file format of the data is native to the destination, making it faster to write data compared to other destination options in SSIS such as flat files or OLE DB destinations. The Raw File Destination can also generate an empty raw file that contains just metadata. It can create a new file or append data to existing files.
Example: Using Raw Files in SSIS
Let’s consider a scenario where we need to move a large volume of data quickly using raw files in SSIS. We can use the raw file source/destination to achieve this. Instead of writing staging data to a database table, we can collect the data in staging as raw files before performing large writes to other destinations.
Here is an example of how to use raw files in SSIS:
USE RawData
GO
CREATE TABLE Customer_rowData
(
Id BIGINT PRIMARY KEY IDENTITY(1,1),
Name Nvarchar (400),
city Nvarchar (200),
Address Nvarchar (1000)
)
GO
Insert Customer_rowData
Select N'Bhavesh Patel',N'Ahmedabad',N'At-Vaishnodevi,Ahmedabad,Pin-382470,Gujarat,India'
GO 100000
1. Create a new SSIS project and name it “working_with_rawfiles”.
2. In the Solution Explorer, rename the package to “PrepareRawFile.dtsx”.
3. Configure the SSIS Control Flow by dragging a Data Flow Task into the Control Flow and renaming it “Prepare_RawFile”.
4. Open the Data Flow and drag an ADO NET Source from the SSIS Toolbox. Configure it to connect to your SQL Server and select the table created in the previous step.
5. Drag the Raw File Destination from the Toolbox and link it to the ADO NET Source.
6. Configure the Raw File Destination by specifying the file name or file name from a variable, selecting the write option, and configuring the input and output columns.
7. Execute the package to create the raw file in the desired location.
Reading Raw Files
Since the data in raw files is stored in a format specific to SSIS, we need to use a Raw File Source in the SSIS package to read the raw file. The steps for configuring the Raw File Source are similar to the steps mentioned above.
Conclusion:
Raw files in SQL Server Integration Services (SSIS) provide a faster way to read and write data compared to other file sources. They are useful for transferring datasets between SSIS packages, avoiding staging table overhead, creating data lakes, and debugging actions. By using raw files, we can optimize performance when working with large volumes of data. The Raw File Source and Raw File Destination components in SSIS make it easy to work with raw files and take advantage of their benefits.