Functions on indexed columns in predicates :

We often tend to write code as a direct translation of given request. For example, if we are asked to retrieve all customers whose name starts with the letter L, it feels very natural to write the query like this, using the LEFT function to return the first character of their name:

SELECT customer_name
FROM Customers
WHERE LEFT(customer_name, 1) = 'L';


Alternatively, if we are asked to calculate the total sales for January 2009, we might write a query like the following, which uses the DATEPART function to extract the relevant month and year from the sale_date column:

SELECT SUM(sale_amount) AS total_sales
FROM Sales
WHERE DATEPART(YEAR, sale_date) = 2009
AND DATEPART(MONTH, sale_date) = 1;


While these queries look very intuitive, you will find that the indexes that you (of course!) have on your customer_name and sale_date columns remain unused, and that the execution plan for these queries reveal index scans.

The problem arises from the fact that the index columns are being passed to a function, which the query engine must then evaluate for every single row in the table. In cases such as these, the WHERE clause predicate is deemed "non-SARGable" and the best that the query optimizer can do is perform a full index or table scan.

To make sure the indexes get used, we need to avoid the use of functions on the indexed columns. In our two examples, it is a relatively simple task to rewrite the queries to use SARG-able predicates. The first requested can be expressed with this logically equivalent query:

SELECT customer_name
FROM Customers
WHERE customer_name LIKE 'L%';

The equivalent for the second query is as follows:

SELECT SUM(sale_amount) AS total_sales
FROM Sales
WHERE sale_date >= '20090101'
AND sale_date < '20090201';


These two queries are most likely to utilize index seek to retrieve the data quickly and efficiently.

It's worth noting that SQL Server is getting "smarter" as it evolves. For example, consider the following query, which uses the CAST function on the indexed sale_date column:

SELECT SUM(sale_amount) AS total_sales
FROM Sales
WHERE CAST(sale_date AS DATE) = '20090101';

If you run this query on SQL 2005 or earlier, you'll see an index scan. However, on SQL Server 2008 you'll see an index seek, despite the use of the CAST function. The execution plan reveals that the predicate is transformed into something like the following:

SELECT SUM(sale_amount) AS total_sales
FROM Sales
WHERE sale_date >= '20090101'
AND sale_date < '20090102';

However, in general, you should use SARGable predicates where possible, rather than rely on the evolving intelligence of the optimizer.