On several occasions I've been assigned the task of splitting a delimited string into rows.
I've done this in different ways, but never gave much thought to the performance or stability of the various approaches.
So here's my 25 cents and my findings.
My first solution was to write a function that traverses the string and inserts a value into a table variable for every delimiter found:
CREATE FUNCTION [dbo].[list_to_table] ( @list varchar(4000) )
RETURNS @tab TABLE ( item int )
BEGIN
    -- No delimiter found (or the input is NULL): return the whole string as a single row
    IF CHARINDEX(',', @list) = 0 OR CHARINDEX(',', @list) IS NULL
    BEGIN
        INSERT INTO @tab (item) VALUES (@list);
        RETURN;
    END

    DECLARE @c_pos INT;  -- current position in the string
    DECLARE @n_pos INT;  -- position of the next delimiter
    DECLARE @l_pos INT;  -- position of the last delimiter seen

    SET @c_pos = 0;
    SET @n_pos = CHARINDEX(',', @list, @c_pos);

    -- Walk through the string, inserting one row per delimited value
    WHILE @n_pos > 0
    BEGIN
        INSERT INTO @tab (item)
        VALUES (CAST(SUBSTRING(@list, @c_pos + 1, @n_pos - @c_pos - 1) AS int));

        SET @c_pos = @n_pos;
        SET @l_pos = @n_pos;
        SET @n_pos = CHARINDEX(',', @list, @c_pos + 1);
    END;

    -- Finally, insert the value after the last delimiter
    INSERT INTO @tab (item) VALUES (CAST(SUBSTRING(@list, @l_pos + 1, 4000) AS int));
    RETURN;
END
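Calling it then looks like this (the input list here is just a made-up example) and returns one row per value:

SELECT item
FROM dbo.list_to_table('1,2,3,4');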
Then I ran into performance issues with very long strings, which pushed me to look for a faster solution. I started looking into an XML approach and ended up with this:
declare @string as nvarchar(4000);
set @string = N'1,2,3';  -- example input: any comma-delimited list of integers

select r.value('@value', 'int') as Kategori
from (
    select cast('<A value = "' + replace(@string, ',', '"/><A value = "') + '"/>' as xml) as xml_str
) xml_cte
cross apply xml_str.nodes('/A') as x(r);
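To see why this works: with @string = N'1,2,3', the REPLACE builds the following string, which is then cast to xml so that nodes('/A') can shred it into one row per <A> element and value('@value', 'int') can pull out each number:

<A value = "1"/><A value = "2"/><A value = "3"/>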
Performance for the two different solutions is shown below:
The test string contained the numbers 0 through 100, comma-delimited; the machine had 4 cores, 16 GB RAM, and an SSD.
Function: 7 ms (on average)
XML: 3 ms (on average)
No matter how long a string I send to the XML query, it runs in around 3 ms, while the function's runtime just climbs and climbs (naturally).
Has anybody done the same and found an even better way to dynamically split a delimited string into rows? If so, please share.