Published on

August 19, 2009

Finding Gaps in Sequence in SQL Server

Have you ever encountered a situation where you need to find the gaps in a sequence of values in SQL Server? In this article, we will discuss different approaches to achieve this.

Recently, a user named Enrico asked a question about generating a SQL script that would show the gaps in a sequence. Let’s take a look at the question and the solutions provided by experts Brian and Tejas.

Enrico’s question:

INV0096
INV0097
INV0099
INV0100
INV0102
INV0103

Enrico wanted to generate a SQL script that would show the gaps in the sequence, such as INV0098 and INV0101, or even just the numbers 98 and 101.

Brian Tkatch provided two solutions to tackle this problem. Let’s take a look at the first solution:

WITH Data(Datum) AS
(
SELECT 'INV0096' UNION ALL
SELECT 'INV0097' UNION ALL
SELECT 'INV0099' UNION ALL
SELECT 'INV0100' UNION ALL
SELECT 'INV0102' UNION ALL
SELECT 'INV0103'
),
CTE AS
(
SELECT
CAST(SUBSTRING(MIN(Datum), 4, 4) AS INT) Start,
CAST(SUBSTRING(MAX(Datum), 4, 4) AS INT) Finish
FROM
Data
UNION ALL
SELECT
Start + 1,
Finish
FROM
CTE
WHERE
Start < Finish
)
SELECT
Common.Formatted
FROM
CTE
CROSS APPLY(SELECT 'INV' + 
RIGHT('0000' + CAST(Start AS VARCHAR(4)), 4)) Common(Formatted)
WHERE
NOT EXISTS
(
SELECT
*
FROM
Data
WHERE
Data.Datum = Common.Formatted
)
OPTION
(MAXRECURSION 0);

Brian’s first solution uses a recursive common table expression (CTE) to generate a sequence of numbers based on the minimum and maximum values in the data. It then checks for the existence of each number in the data and returns the missing values.

Brian also provided a second solution, which is similar to the first one but without using the CROSS APPLY:

WITH Data(Datum) AS
(
SELECT 'INV0096' UNION ALL
SELECT 'INV0097' UNION ALL
SELECT 'INV0099' UNION ALL
SELECT 'INV0100' UNION ALL
SELECT 'INV0102' UNION ALL
SELECT 'INV0103'
),
CTE AS
(
SELECT
CAST(SUBSTRING(MIN(Datum), 4, 4) AS INT) Start,
CAST(SUBSTRING(MAX(Datum), 4, 4) AS INT) Finish
FROM
Data
UNION ALL
SELECT
Start + 1,
Finish
FROM
CTE
WHERE
Start < Finish
)
SELECT
'INV' + RIGHT('0000' + CAST(Start AS VARCHAR(4)), 4)
FROM
CTE
WHERE
NOT EXISTS
(
SELECT
*
FROM
Data
WHERE
Data.Datum = 'INV' + RIGHT('0000' + CAST(Start AS VARCHAR(4)), 4)
)
OPTION
(MAXRECURSION 0);

Tejas Shah also provided a solution using a table variable:

DECLARE @Test TABLE (Data VARCHAR(10))
INSERT INTO @test
SELECT 'INV0096' UNION ALL
SELECT 'INV0097' UNION ALL
SELECT 'INV0099' UNION ALL
SELECT 'INV0100' UNION ALL
SELECT 'INV0102' UNION ALL
SELECT 'INV0103'
UNION ALL
SELECT 'INV0106' UNION ALL
SELECT 'INV0110'

;WITH cte1 AS(
SELECT CAST(RIGHT(Data,4) AS INT) AS RowID
FROM @Test
), Missing AS(
SELECT MIN(RowID) AS MissNum,
MAX(RowID) AS MaxID
FROM Cte1
UNION ALL
SELECT MissNum + 1,
MaxID
FROM Missing
WHERE MissNum < MaxID
)
SELECT missnum
FROM Missing
LEFT JOIN cte1 tt ON tt.Rowid = Missing.MissNum
WHERE tt.Rowid IS NULL
OPTION (MAXRECURSION 0);

Tejas’ solution uses a recursive CTE with a table variable to find the missing values in the sequence.

These solutions provide different approaches to finding gaps in a sequence in SQL Server. Depending on your specific requirements and the size of your data, you can choose the solution that best suits your needs.

Do you know any other tricks to achieve the same result? Share your thoughts in the comments below!

Click to rate this post!
[Total: 0 Average: 0]

Let's work together

Send us a message or book free introductory meeting with us using button below.