Good indexes are the key to good performance in SQL Server. In the previous articles, we discussed the basics of indexes and took a closer look at the clustered index. In this article, we will dive deeper into nonclustered indexes, how SQL Server uses them, and some recommendations for selecting useful nonclustered indexes.
What is a Nonclustered Index?
A nonclustered index is the second type of index in SQL Server. It has a similar structure to the clustered index, but with some differences. Unlike the clustered index, a nonclustered index does not contain the entire data row at the leaf level. Instead, it contains only the columns defined in the index and a pointer to the actual data row.
When the underlying table is a heap (has no clustered index), the pointer to the data row is the RID (Row Identifier), which consists of the File ID, Page No, and Slot index. When the underlying table has a clustered index, the pointer to the actual data row is the clustering key.
Include Columns
Columns specified as include columns are stored at the leaf level of the nonclustered index, but not at the intermediate or root levels. They are not part of the index key and do not count towards the limit on indexes. Include columns are useful for creating covering indexes, which can improve query performance. Typically, columns that appear only in the select clause of a query and not within the WHERE, FROM, or GROUP BY clauses are candidates for include columns.
Covering Indexes
A covering index is an index that contains all of the columns necessary for a specific query. It allows the query to be evaluated completely from the index, without needing to go to the base table to fetch additional columns. Creating a covering index for a query can significantly improve its performance. However, it is not always possible or desirable to cover every query, as it may lead to large indexes or a large number of indexes.
Filtered Indexes
Filtered indexes are a new feature introduced in SQL Server 2008. They allow an index to contain only a subset of the rows in the table. This can be useful when creating indexes for specific subsets of data. However, there are some limitations on the predicates that can be used in filtered indexes.
How Nonclustered Indexes are Used
SQL Server can use a nonclustered index to perform a seek operation if the query has a SARGable (Search ARGument) predicate referencing the index key or a left-based subset of the index key. The index must also be covering or return a sufficiently small number of rows to avoid expensive lookups to the clustered index or heap.
If the required lookups are considered too expensive, SQL Server will not use the nonclustered index and may resort to a scan operation instead. Nonclustered index scans are generally more efficient than clustered index scans because nonclustered indexes are usually smaller.
Considerations for Selecting Nonclustered Indexes
When selecting columns for nonclustered indexes, it is important to consider the queries that are run against the table. Indexing a column that is never used in a query is pointless.
A nonclustered index should be selective, meaning the values in the indexed column should be fairly unique and queries that filter on it should return small portions of the table. This is because key/RID lookups are expensive operations, and if an index is not selective enough, SQL Server may ignore it and perform a clustered index scan instead.
Wider nonclustered indexes are generally more useful than single column indexes. SQL Server rarely uses multiple nonclustered indexes on the same table to evaluate a query. Composite indexes, which are indexes defined over more than one column, can improve query performance by allowing the matching rows to be located in a single seek operation.
The order of the columns in an index key should be chosen based on the selectivity of the columns, the percentage of queries that filter on each column, and the type of predicates used in the queries. The most selective column should be made the leading column of the index, but this should be considered in light of other factors such as the frequency of queries filtering on that column.
In conclusion, understanding nonclustered indexes and how SQL Server uses them is crucial for optimizing query performance. By selecting the right columns, creating covering indexes, and considering the selectivity and order of columns, you can greatly improve the efficiency of your SQL Server queries.
For further information and in-depth analysis, refer to the recommended resources mentioned in the original article.