When writing T-SQL queries, it’s important to understand the difference between the IN
and NOT IN
operators. While they may seem similar, they behave differently when it comes to handling NULL values.
Let’s start by clarifying how these operators work:
IN
: This operator is used to specify a list of values that a column must match. It checks if the column value is equal to any of the specified values.NOT IN
: This operator is used to specify a list of values that a column must not match. It checks if the column value is not equal to any of the specified values.
Now, here’s where the difference lies when it comes to handling NULL values:
When using IN
, the query is essentially saying “WHERE myvalue = ‘A’ OR myvalue = ‘B’ OR myvalue = NULL”. In this case, NULL values won’t cause the entire statement to fail because it’s only an OR condition.
On the other hand, when using NOT IN
, the query is saying “WHERE myvalue <> ‘A’ AND myvalue <> ‘B’ AND myvalue <> NULL”. This is where the problem arises. Since NULL in SQL is an unknown value, you can’t test equality or inequality on it, resulting in no results.
Let’s take a look at a simple example to demonstrate this:
DECLARE @T TABLE (
Val varchar (5)
)
DECLARE @T2 TABLE (
Val varchar (5)
)
INSERT INTO @T(Val) SELECT 'A' UNION ALL SELECT 'B' UNION ALL SELECT 'C' UNION ALL SELECT 'D' UNION ALL SELECT 'E' UNION ALL SELECT 'F'
INSERT INTO @T2(Val) SELECT 'A' UNION ALL SELECT 'B' UNION ALL SELECT 'C' UNION ALL SELECT NULL
SET ANSI_NULLS ON
SELECT * FROM @T WHERE Val IN (SELECT * FROM @T2)
SELECT * FROM @T WHERE Val NOT IN (SELECT * FROM @T2)
SELECT * FROM @T WHERE Val NOT IN (SELECT * FROM @T2 WHERE Val IS NOT NULL)
This issue is further complicated by the ANSI_NULLS
setting. While most people have this setting turned ON, it’s important to note that it is an option and can introduce another variable into the mix. NOT IN
will not fail in the same way if you have ANSI_NULLS
set to OFF.
Understanding the difference between IN
and NOT IN
and how they handle NULL values is crucial for writing accurate and efficient SQL queries. By being aware of these nuances, you can avoid common mistakes and ensure your queries return the expected results.
For more information on this topic, you can refer to the related article by Gail Shaw: Not Exists vs Not In.