ColumnStore indexes are a powerful feature in SQL Server that can greatly improve query performance. However, there are some nuances and considerations when working with ColumnStore indexes that are important to understand. In this blog post, we will explore some frequently asked questions and provide answers to help you better utilize ColumnStore indexes in your SQL Server environment.
Did the optimizer use the ColumnStore index in the first place?
One of the first questions to ask when experiencing performance regression with ColumnStore indexes is whether the optimizer actually used the index. You can determine this by looking at the Storage property in SHOWPLAN. If the ColumnStore index was not used, you may need to investigate further to understand why.
Did the query start with batch processing and then fall back to row-based processing?
In some cases, a query may start executing using the ColumnStore index with batch processing, but then fall back to row-based processing. This can impact performance. You can identify this by looking at the EstimatedExecutionMode and ActualExecutionMode in Showplan. Even in row mode, the ColumnStore index can still provide better performance than a rowstore index. If performance regression is suspected due to the ColumnStore index, you can use a query hint to force the use of a different index.
Did you insert a FORCE ORDER hint on a join?
If you have used a FORCE ORDER hint on a join, the optimizer will obey the hint. However, if the optimizer would have used the starjoin optimization and created bitmaps, the loss of the starjoin optimization bitmaps could cause performance to be worse than without the FORCE ORDER hint.
Can the columnstore evaluate filters in the storage engine for optimization reasons?
Yes, the columnstore can evaluate some filters in the storage engine for optimization reasons. Filters such as comparisons, IN lists, and IS (NOT) NULL can be pushed down into the storage engine. However, filters on strings are not pushed down.
Can the columnstore eliminate partitions from the scan?
While partition elimination per se does not occur in the columnstore, segments (row groups) can be eliminated from the scan. Each segment has metadata that includes the min and max values in the segment. By using this metadata, segments that do not qualify can be eliminated from the scan. If a segment is eliminated, the other segments in the same rowgroup are also eliminated. You can use the Column_store_segment_eliminate extended event to see if segments are being eliminated.
Does partitioning work the same way for the columnstore index as for row store indexes?
Mostly, yes. A nonclustered columnstore index can be built on a partitioned table, and partitions can be split, merged, and switched between partitioned tables. However, there are some restrictions associated with columnstore indexes. For example, if a table has a columnstore index, two non-empty partitions cannot be merged. Additionally, a nonclustered columnstore index must always be partition-aligned with the base table.
As you can see, there are several factors to consider when working with ColumnStore indexes in SQL Server. By understanding these concepts and considerations, you can optimize your queries and improve performance in your SQL Server environment.
Out of curiosity, we would like to know how many of you are currently using ColumnStore indexes in your environments and what your experience has been like. Please share your thoughts and experiences in the comments below.