Manipulating Binary Large Objects (BLOBs) and Character Large Objects (CLOBs) has always been difficult in SQL Server. However, with the introduction of SQL Server 2005, new data types (NVARCHAR(MAX), VARCHAR(MAX), and VARBINARY(MAX)) have been provided for large object storage, allowing for better manipulation and the ability to process these data types using CLR procedures and functions.
The new data types in SQL Server 2005 include:
- VARCHAR(MAX): for storing CLOBs
- NVARCHAR(MAX): for storing CLOBs with Unicode characters
- VARBINARY(MAX): for storing BLOBs
These new data types can be used with a wide range of T-SQL commands and behave much like traditional VARBINARY(n), VARCHAR(n), and NVARCHAR(N) types. They should replace the TEXT and IMAGE types from previous versions of SQL Server. According to SQL Server Books Online, TEXT and IMAGE columns should not be used in new development, and legacy applications should be changed to use the new types.
One of the advantages of these new data types is that they can be used as variables and function parameters, and can be returned by CLR (or T-SQL) scalar-value functions. This makes them great candidates for compression.
In this article, we will explore how to create CLR functions to seamlessly compress and decompress large data objects with minimum performance impact using compression algorithms included in the .NET Framework 2.0.
Compressing a BLOB
Compressing a BLOB in SQL Server 2005 is as easy as passing a SQLBytes parameter, reading its content, and writing it to a compression stream. The compression stream writes to a MemoryStream that is later used to create a new SQLBytes object that can be returned to SQL Server applications or directly to the client.
Here is an example of a function to compress BLOBs in SQL Server:
CREATE FUNCTION [dbo].[fn_compress] (@blob VARBINARY(MAX))
RETURNS VARBINARY(MAX)
AS EXTERNAL NAME [BlobCompression].[UserDefinedFunctions].[fn_compress];
To use this function, you can simply pass a VARBINARY(MAX) parameter and it will return the compressed BLOB.
Decompressing a BLOB
Decompressing a BLOB follows similar principles to compression. The function reads from a stream and returns a decompressed block that can be used to create and return a decompressed SQLBytes object.
Here is an example of a function to decompress a BLOB:
CREATE FUNCTION [dbo].[fn_decompress] (@compressedBlob VARBINARY(MAX))
RETURNS VARBINARY(MAX)
AS EXTERNAL NAME [BlobCompression].[UserDefinedFunctions].[fn_decompress];
Similar to the compression function, you can pass a VARBINARY(MAX) parameter to this function and it will return the decompressed BLOB.
Usage Example
Let’s see how we can use these compression and decompression functions in a practical scenario:
CREATE TABLE dbo.MyTable (
id INT IDENTITY(1,1) PRIMARY KEY,
data VARBINARY(MAX)
);
-- Inserting a BLOB into the table
INSERT INTO dbo.MyTable (data)
VALUES (CONVERT(VARBINARY(MAX), 'This is a sample BLOB'));
-- Compressing the BLOB
UPDATE dbo.MyTable
SET data = dbo.fn_compress(data);
-- Decompressing the BLOB
SELECT CONVERT(VARCHAR(MAX), dbo.fn_decompress(data))
FROM dbo.MyTable;
In this example, we create a table with a VARBINARY(MAX) column to store the BLOB data. We insert a sample BLOB into the table, compress it using the compression function, and then retrieve the decompressed data using the decompression function.
Limitations
It’s important to note that the code included in this article allows for column-level compression in SQL Server 2005, but it lacks functions for consistency checks and may not work well with large objects (5 MB or more, depending on configuration). It is intended to demonstrate how to use CLR integration in SQL Server to extend the engine’s functionality and provide an overview of what can be done with the new BLOB/CLOB data types.
Overall, the introduction of these new data types and the ability to use CLR procedures and functions have greatly improved the manipulation of BLOBs and CLOBs in SQL Server 2005. Developers now have more flexibility and control over these large objects, making it easier to work with them in their applications.