Published on

October 14, 2011

Understanding SQL Server Columnstore Indexes

In SQL Server, there are two types of storage for databases: Row Store and Column Store. Row store stores data in rows on a page, while column store stores data in columns on the same page. The main difference between the two is how they handle searches. With row store, a query searches all the data in a row, whether it is relevant or not. On the other hand, column store queries only need to search a smaller number of columns, resulting in faster search speeds and more efficient use of hard drive space.

Column store indexes, which are powered by Microsoft’s VertiPaq technology, offer significant advantages in terms of search speed and memory usage. These indexes are heavily compressed, leading to faster searches and greater memory efficiency. However, it’s important to note that column store indexes are read-only. Once you add a column store index to a table, you cannot delete, insert, or update the data. This limitation is not a major issue for data warehousing scenarios, where column store indexes are commonly used. Partitioning can be used to avoid the need for rebuilding the index.

Creating a column store index is straightforward and does not require learning a new syntax. You simply need to specify the keyword “COLUMNSTORE” when creating the index. The data is entered as usual, but stored in separate sets of disk pages for each column. This approach allows only the columns needed to solve a query to be fetched from disk, reducing the amount of data that needs to be retrieved. Additionally, the redundant data in a single column helps further compress the data, resulting in a positive effect on the buffer hit rate.

Let’s take a look at a small example to understand how a columnstore index can improve query performance on a large table:

USE AdventureWorks
GO

-- Create a new table
CREATE TABLE [dbo].[MySalesOrderDetail](
[SalesOrderID] [int] NOT NULL,
[SalesOrderDetailID] [int] NOT NULL,
[CarrierTrackingNumber] [nvarchar](25) NULL,
[OrderQty] [smallint] NOT NULL,
[ProductID] [int] NOT NULL,
[SpecialOfferID] [int] NOT NULL,
[UnitPrice] [money] NOT NULL,
[UnitPriceDiscount] [money] NOT NULL,
[LineTotal] [numeric](38, 6) NOT NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] [datetime] NOT NULL
) ON [PRIMARY]
GO

-- Create a clustered index
CREATE CLUSTERED INDEX [CL_MySalesOrderDetail] ON [dbo].[MySalesOrderDetail]([SalesOrderDetailID])
GO

-- Create sample data table
-- WARNING: This query may take several minutes to run based on your system's resources
INSERT INTO [dbo].[MySalesOrderDetail]
SELECT S1.*
FROM Sales.SalesOrderDetail S1
GO 100

-- Performance test
-- Comparing regular index with columnstore index
USE AdventureWorks
GO

SET STATISTICS IO ON
GO

-- Select table with regular index
SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice,
SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty
FROM [dbo].[MySalesOrderDetail]
GROUP BY ProductID
ORDER BY ProductID
GO

-- Table 'MySalesOrderDetail'. Scan count 1, logical reads 342261, physical reads 0, read-ahead reads 0.

-- Create columnstore index
CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MySalesOrderDetail_ColumnStore]
ON [MySalesOrderDetail](UnitPrice, OrderQty, ProductID)
GO

-- Select table with columnstore index
SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice,
SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty
FROM [dbo].[MySalesOrderDetail]
GROUP BY ProductID
ORDER BY ProductID
GO

As you can see from the results, the query performs extremely fast after creating the columnstore index. The number of pages it needs to read is drastically reduced because the columns needed for the query are stored on the same page. This eliminates the need to go through every single page to retrieve the required data. By comparing the execution plans, it becomes evident that the columnstore index outperforms the regular index in this case.

It’s important to note that columnstore indexes may not always be the appropriate solution for every scenario. In future posts, we will explore cases where columnstore indexes are not suitable and discuss other tips and tricks related to columnstore indexes.

Stay tuned for more SQL Server insights!

Click to rate this post!
[Total: 0 Average: 0]

Let's work together

Send us a message or book free introductory meeting with us using button below.