Previously, I wrote an article about an interesting observation regarding the order of result sets without using the ORDER BY clause in SQL Server. The response from my readers was overwhelming, with many requesting more information on the subject. As I am also fascinated by this topic, I have decided to delve deeper into it in this article.
For those who haven’t read my previous article, I recommend taking a look at it before proceeding, as it provides some background information. You can find the article here.
Let’s explore three examples that illustrate how the Query Optimizer works with indexes when they are forced on a query.
Example 1: Default Index Usage
USE AdventureWorks;
SELECT ContactID FROM Person.Contact;
SELECT * FROM Person.Contact;
In this example, the first query’s query cost (relative to the batch) is much lower than the second query’s query cost. Since no query hint is specified, the SQL Server Query Optimizer can use any index it deems optimal for performance. It is worth noting that even though the ContactID, which is the Primary Key (PK) of the table, is the only column retrieved in the first query, it does not use the Primary Key Clustered Index. Instead, it uses a non-clustered index. This may come as a surprise to many users, as it is commonly believed that when selecting columns that are part of the PK without any additional conditions or joins, the PK clustered index would be used. However, this is not always the case. In the second query, where all columns are retrieved, the clustered index on the PK is used.
Example 2: Forcing Primary Key Clustered Index
USE AdventureWorks;
SELECT ContactID FROM Person.Contact WITH (INDEX(PK_Contact_ContactID));
SELECT * FROM Person.Contact WITH (INDEX(PK_Contact_ContactID));
In this example, we are explicitly using the Primary Key Clustered Index to retrieve data. From the actual execution plan, it is evident that retrieving only one column or all columns creates the same execution plan. The query cost for both queries is equal.
Example 3: Forcing Non-Clustered Index
USE AdventureWorks;
SELECT ContactID FROM Person.Contact WITH (INDEX(AK_Contact_rowguid));
SELECT * FROM Person.Contact WITH (INDEX(AK_Contact_rowguid));
In this example, we are using a non-clustered index to retrieve data. From the actual execution plan, it is clear that retrieving only one column is significantly faster than retrieving all columns from the table. The query cost for the first query is much lower than the query cost for the second query.
From the above examples, we can observe that query optimization is a complex game with its own rules. The Query Optimizer does not always use the clustered index to retrieve data, and sometimes a non-clustered index provides optimal performance for retrieving the Primary Key. When selecting all rows and columns, the Primary Key should be used to select data as it provides optimal performance.
There are many different queries that can be created using the concepts discussed in this article. I encourage my readers to participate and provide their opinions in the comments section below. While I understand that I haven’t gone into great depth to explain the concept, I believe I have provided a good enough idea of what I am trying to convey. I welcome reader participation through ideas, blogs, comments, and emails related to this article.