Database developers often come across situations where they need to convert a comma-separated value or other delimited items into a tabular format in SQL Server. This can be achieved using various methods, such as user-defined functions or built-in functions like string_split(). In this article, we will explore different approaches to accomplish this task.
Why is comma-separated input required?
Executing a program in an iteration using a loop is a common method in both the back-end application and the database. When dealing with multiple iterations, it can lead to longer processing time. To overcome this, we can execute the same program with a single execution by parsing comma or delimiter-separated values as an input parameter in the database and converting them into a tabular format in the T-SQL program.
Moreover, in micro-service architecture, where databases communicate using integer references, comma-separated values provide a convenient way to handle multiple inputs to the same parameter in a stored procedure or tabular function.
Using the split function with a loop
In earlier versions of SQL Server, developers used user-defined functions with a loop to extract data from comma-separated values. This approach involves iterating through the input string and extracting each value one by one. However, this method can be inefficient when dealing with large input parameters.
Here is an example of a table-valued function called split_string that uses a loop to split the input string:
CREATE FUNCTION split_string ( @in_string VARCHAR(MAX), @delimiter VARCHAR(1) ) RETURNS @list TABLE ( tuple VARCHAR(100) ) AS BEGIN WHILE LEN(@in_string) > 0 BEGIN INSERT INTO @list (tuple) SELECT LEFT(@in_string, CHARINDEX(@delimiter, @in_string + ',') - 1) AS tuple SET @in_string = STUFF(@in_string, 1, CHARINDEX(@delimiter, @in_string + @delimiter), '') END RETURN END
You can then use this function to split a comma-separated string into rows:
SELECT * FROM split_string('1001,1002,1003,1004', ',')
Using XML logic to split values
Another efficient way to split delimited values is by using XML logic. This approach involves converting the input parameter string to XML and then extracting the values using XQUERY. This method eliminates the need for a loop or cursor, resulting in better performance.
Here is an example of splitting a comma-separated string using XML:
DECLARE @user_ids NVARCHAR(MAX) = N'203616, 198667, 193718, 188769, 183820, 178871, 173922, 168973, 164024, 159075, 154126, 149177, 144228, 139279, 134330, 129381, 124432, 119483, 114534, 109585, 104636, 99687, 94738, 89789, 84840, 79891, 74942, 69993, 65044, 60095, 55146' DECLARE @sql_xml XML = CAST('' + REPLACE(@user_ids, ',', '') + ' ' AS XML) SELECT f.x.value('.', 'BIGINT') AS user_id INTO #users FROM @sql_xml.nodes('/root/U') f(x) SELECT * FROM #users
In this example, the input string is converted to XML by replacing the commas with XML tags. The resulting XML is then used in XQUERY to extract the values into a table format.
Handling combination of characters in a delimiter-separated string
What if the input parameter contains a combination of two values with multiple separators? In such cases, you can use a subquery to extract the values twice. Here is an example:
DECLARE @in_string VARCHAR(MAX) = '1021|203616$1021|198667$1022|193718$1022|188769$' DECLARE @sql_xml XML = CAST('' + REPLACE(@in_string, '$', '') + ' ' AS XML) SELECT X.Y.value('(U)[1]', 'VARCHAR(20)') AS role_id, X.Y.value('(U)[2]', 'VARCHAR(20)') AS user_id FROM ( SELECT CAST('' + REPLACE(f.x.value('.', 'VARCHAR(MAX)'), '|', '') + ' ' AS XML) AS xml_ FROM @sql_xml.nodes('/root/U') f(x) WHERE f.x.value('.', 'VARCHAR(MAX)') <> '' ) T OUTER APPLY T.xml_.nodes('root') AS X(Y)
In this example, we have used a combination of pipe “|” and dollar “$” as delimiters. The input parameter is split into two columns, role_id and user_id, which are extracted separately in the table result set.
By following these methods, you can easily convert any delimited string into a tabular format in SQL Server. Whether you choose to use a loop or XML logic depends on your specific requirements and performance considerations.