Published on

December 11, 2009

Exploring Streaming CLR Table-Valued Functions in SQL Server

When Microsoft first introduced CLR integration in SQL Server 2005, it opened up new possibilities for developers to perform tasks that couldn’t be easily achieved with regular T-SQL. One of the features that caught my attention was the ability to create Table-Valued Functions (TVFs) that could stream data out to the calling process. However, I soon discovered that the traditional method of implementing TVFs required collecting all the data in memory before returning it, which was not ideal for large datasets.

After some research, I stumbled upon the concept of fully streaming TVFs using the IEnumerable interface. This approach allows for a more efficient and seamless interaction with the output data, enabling operations like WHERE conditions, GROUP BYs, and ORDER BYs. Unfortunately, finding examples and documentation on this method proved to be quite challenging.

In this article, I aim to provide a clear and easy-to-find example of how to create a fully streaming CLR TVF. Before diving into the code, it’s important to note that there are two ways to achieve full streaming: implementing IEnumerator or using the “yield” command. While the former involves creating MoveNext(), Current(), and Reset() methods, I will focus on the latter, which is less documented but equally effective.

Let’s take a look at the code:

private struct ReturnValues 
{
    public int Value;
}

private static void FillValues(object obj, out SqlInt32 TheValue) 
{
    ReturnValues ReturnVals = (ReturnValues)obj;
    TheValue = ReturnVals.Value;
}

[Microsoft.SqlServer.Server.SqlFunction(DataAccess = DataAccessKind.None,
    IsDeterministic = true, IsPrecise = true,
    SystemDataAccess = SystemDataAccessKind.None,
    FillRowMethodName = "FillValues", TableDefinition = "IntValue INT")]
public static IEnumerable TVF_Streaming(SqlInt32 MaxValue) 
{
    if (MaxValue.IsNull)
    {
        yield break; // return no rows
    }
    
    ReturnValues Vals = new ReturnValues(); // each row
    
    for (int index = 1; index <= MaxValue.Value; index++) 
    {
        Vals.Value = index;
        yield return Vals; // return row per each iteration
    }
    
    // we do not need to return everything at once
}

In this example, we define a struct called ReturnValues to hold the value for each row. The FillValues method is responsible for assigning the value to the output parameter. The TVF_Streaming function takes a parameter called MaxValue, which determines the number of rows to generate. If MaxValue is null, the function immediately exits without returning any rows.

The key to achieving full streaming is the use of the “yield return” command. Inside the for loop, we assign the current index value to the Vals struct and yield return it. This allows us to return each row as it is created, instead of collecting all the data in memory before returning it.

To test this function, you can create an Assembly containing both the TVF_Streaming function and its T-SQL wrapper function. You can find the SQL script and Visual Studio 2008 Project in the resources section at the end of this article.

Comparing the performance of the traditional TVF method (TVF_Standard) and the fully streaming method (TVF_Streaming) with large datasets, we can see a significant difference. The TVF_Standard function runs out of memory when processing 20,000,000 rows, while the TVF_Streaming function handles 40,000,000 rows without any issues.

It’s worth noting that the “yield return” approach may not work in all cases. There are known issues when using the SqlChars and SqlBytes datatypes for input parameters, which can result in errors related to accessing members of an object from the wrong thread. However, these issues have been addressed in later versions of SQL Server.

In conclusion, fully streaming CLR Table-Valued Functions offer a more efficient and flexible way to interact with large datasets in SQL Server. By using the “yield return” command, developers can avoid memory limitations and perform complex operations directly on the streaming data. While this method may not be widely documented, it is a valuable technique to have in your SQL Server toolbox.

Resources:

Thank you for reading! I hope this article has provided you with a clear understanding of how to create fully streaming CLR Table-Valued Functions in SQL Server.

Click to rate this post!
[Total: 0 Average: 0]

Let's work together

Send us a message or book free introductory meeting with us using button below.