Finding missing intervals in a sequence of continuous numbers can be a challenging task, even for advanced programmers. In this article, we will explore a new solution using SQL Server’s geometry data type and specific functions designed for working with geometry objects.
Let’s consider the following sequence of numbers: 1, 2, 7, 8, 9, 10, 30. Our goal is to compute the missing intervals, which in this case are [3, 6] and [11, 29].
To transform each integer into a range, we can use the formula [Integer-1, Integer]. We can then represent each range as a spatial polygon using the POLYGON function. For example, the number 30 becomes the polygon POLYGON((30 0, 30 1, 29 1, 29 0, 30 0)).
We can store these polygons in a table using the geometry data type:
DECLARE @tab TABLE(Col1 GEOMETRY); INSERT @tab(Col1) SELECT GEOMETRY::STGeomFromText('POLYGON((1 0, 1 1, 0 1, 0 0, 1 0))', 0) -- 1 UNION ALL SELECT GEOMETRY::STGeomFromText('POLYGON((2 0, 2 1, 1 1, 1 0, 2 0))', 0) -- 2 UNION ALL SELECT GEOMETRY::STGeomFromText('POLYGON((7 0, 7 1, 6 1, 6 0, 7 0))', 0) -- 7 UNION ALL SELECT GEOMETRY::STGeomFromText('POLYGON((8 0, 8 1, 7 1, 7 0, 8 0))', 0) -- 8 UNION ALL SELECT GEOMETRY::STGeomFromText('POLYGON((9 0, 9 1, 8 1, 8 0, 9 0))', 0) -- 9 UNION ALL SELECT GEOMETRY::STGeomFromText('POLYGON((10 0, 10 1, 9 1, 9 0, 10 0))', 0) -- 10 UNION ALL SELECT GEOMETRY::STGeomFromText('POLYGON((30 0, 30 1, 29 1, 29 0, 30 0))', 0); -- 30
Next, we can merge these polygons into intervals of continuous values using the UnionAggregate function. This will result in a MULTIPOLYGON spatial geometry object:
DECLARE @pol GEOMETRY; DECLARE @ranges GEOMETRY; SELECT @pol = GEOMETRY::UnionAggregate(t.Col1) FROM @tab t; SELECT @pol, @pol.STAsText();
The resulting MULTIPOLYGON object represents the combined intervals:
MULTIPOLYGON ( ((29 0, 30 0, 30 1, 29 1, 29 0)), ((6 0, 7 0, 8 0, 9 0, 10 0, 10 1, 9 1, 8 1, 7 1, 6 1, 6 0)), ((0 0, 1 0, 2 0, 2 1, 1 1, 0 1, 0 0)) )
To compute the missing intervals, we can generate the envelope (bounding box) for these polygons using the STEnvelope() function. We then compute the difference between the envelope and the combined intervals using the STSymDifference() function:
SELECT @ranges = @pol.STEnvelope().STSymDifference(@pol); SELECT @ranges, @ranges.STAsText();
The result of STSymDifference() is a MULTIPOLYGON object containing the missing polygons from the envelope:
MULTIPOLYGON ( ((10 0, 29 0, 29 1, 10 1, 10 0)), ((2 0, 6 0, 6 1, 2 1, 2 0)) )
By using the STGeometryN() function, we can extract individual missing intervals from the MULTIPOLYGON object:
SELECT Range_N = val.number, Range_Polygon = @ranges.STGeometryN(val.number).STEnvelope().ToString(), Range_Start = @ranges.STGeometryN(val.number).STEnvelope().STPointN(1).STX + 1, Range_End = @ranges.STGeometryN(val.number).STEnvelope().STPointN(2).STX FROM master.dbo.spt_values val WHERE val.type = N'P' AND val.number >= 1 AND val.number <= @ranges.STNumGeometries();
In conclusion, this article presents a new solution for finding missing intervals in sequences of continuous integers using SQL Server’s spatial objects and functions. By leveraging the power of geometry data type and specific functions like UnionAggregate(), STEnvelope(), STSymDifference(), STNumGeometries(), and STGeometryN(), we can efficiently compute missing intervals in a sequence of numbers.
For more information on SQL Server concepts and techniques, be sure to check out our other articles on our website.