NULLs and the NOT IN predicate
One of the most common requests is to retrieve data based on some column value not included in a list of values. The following two tables illustrate the scenario. We have tables with colors and products:
Colors table:
color
----------
Black
Blue
Green
Red
Products table:
sku product_description color
---- -------------------- ------
1 Ball Red
2 Bike Blue
3 Tent NULL
Note that these tables do not represent a perfect design, following normalization rules and best practices. Rather, it is a simplified scenario to help illustrate this example better. In reality, the colors table would most likely contain a color code key column that would be referenced in the products table.
The request is to select a list of colors that have not previously been used on products. In other words, we need to construct a query that returns only those colors for which there is no product with that color. It might seem, at first glance, that the NOT IN predicate provides a very intuitive way to satisfy this request, very close to how the problem would be stated in plain English:
SELECT C.color
FROM Colors AS C
WHERE C.color NOT IN (SELECT P.color
FROM Products AS P);
You may have been expecting this query to return two rows (for 'black' and 'green') but, in fact, it returns an empty result set:
color
----------
(0 row(s) affected)
Obviously this is 'incorrect'. What is the problem? It's simply that SQL uses three-valued logic, driven by the existence of NULL, which is not a value but a marker to indicate missing (or UNKNOWN) information. When the NOT operator is applied to the list of values from the subquery, in the IN predicate, it is translated like this:
"color NOT IN (Red, Blue, NULL)" is equivalent to "NOT(color=Red OR color=Blue OR color=NULL)"
The expression "color=NULL" evaluates to UNKNOWN and, according to the rules of three-valued logic, NOT UNKNOWN also evaluates to UNKNOWN. As a result, all rows are filtered out and the query returns an empty set.
This mistake will often surface if requirements change, and a non-nullable column is altered to allow NULLs. It also highlights the need for thorough testing. Even if, in the initial design, a column disallows NULLs, you should make sure your queries continue to work correctly with NULLs.
One solution is to use the EXISTS predicate in place of IN, since EXISTS uses two-valued predicate logic evaluating to TRUE/FALSE:
SELECT C.color
FROM Colors AS C
WHERE NOT EXISTS(SELECT *
FROM Products AS P
WHERE C.color = P.color);
color
----------
Black
Green
Other possible solutions are as follows:
/* IS NOT NULL in the subquery */
SELECT C.color
FROM Colors AS C
WHERE C.color NOT IN (SELECT P.color
FROM Products AS P
WHERE P.color IS NOT NULL);
/* EXCEPT */
SELECT color
FROM Colors
EXCEPT
SELECT color
FROM Products;
/* LEFT OUTER JOIN */
SELECT C.color
FROM Colors AS C
LEFT OUTER JOIN Products AS P
ON C.color = P.color
WHERE P.color IS NULL;
While all solutions produce the desired results, using EXCEPT may be the easiest to understand and use. Note that the EXCEPT operator returns distinct values, which works fine in our scenario but may not be correct in another situation.
Captions
- Asp.Net (9)
- ASP.Net Tips (1)
- C# (7)
- Computer (1)
- Internet Explorer (5)
- Java Script (11)
- Shortcut Keys (4)
- SQL Programming - Common Mistakes (11)
- Sql Server (22)
- Sql Server Definitions (8)
- Sql Server Test (2)
- SSRS (2)
About Me
- Vasanth.S
- I'm just normal guy who like to travel, addicted to TV and Computer, Playing Computer Games and Watching Movies and many more. Have lot of idea and quote in life
My Blog List
-
-
Pass data from one page to another using post15 years ago
Blog Archive
-
▼
2009
(63)
-
▼
November
(25)
- Display Database Table, Stored Procedure and Funct...
- Sql Server Restore
- Sql server backup
- Delete VS Truncate
- Date Methods
- Math Properties and Methods
- Basics of Sql Server
- Regular Expression Syntax
- trim(), startsWith() and endsWith() in JavaScript
- Differences bw temp tables, table variables and CTE
- Table Variables
- Global Temporary Tables
- Local Temporary Tables
- Removing Duplicates from a Table in SQL
- Common SQL Programming Mistakes - 10
- Common SQL Programming Mistakes - 3
- Common SQL Programming Mistakes - 2
- Common SQL Programming Mistakes - 9
- Common SQL Programming Mistakes - 8
- Common SQL Programming Mistakes - 7
- Common SQL Programming Mistakes - 6
- Common SQL Programming Mistakes - 5
- Common SQL Programming Mistakes - 4
- Common SQL Programming Mistakes - 1
- Ten Common SQL Programming Mistakes
-
▼
November
(25)