Partitioning a large table is a common practice in SQL Server databases for manageability and query performance benefits. While partitioning a table with a traditional rowstore index is relatively straightforward, partitioning a table with a columnstore index brings additional challenges. In this tutorial, we will cover the nuances of partitioning a table with a columnstore index and provide a detailed guide on the partitioning steps, as well as the implementation of partition merge and split using the partition switch technique.
Step 1: Create Filegroups or Use Primary Filegroup
In SQL Server, it is a good practice to create multiple filegroups for very large data warehouse tables. However, for simplicity, we will create the partitioned table in the primary filegroup. This step involves creating the necessary filegroups or using the existing primary filegroup.
Step 2: Create Partition Function
The next step is to create a partition function, which defines the boundaries for partitioning the table. In this example, we will create a partition function based on the date or datetime column. We will use the RANGE RIGHT option, which means that the boundary value is the first value in the right partition, and all data inserted into the partition will be after the partition date but before the start date of the next partition. The partition function will specify the range values for each partition.
USE PartitionDemo;
GO
CREATE PARTITION FUNCTION PF_YearlyOrder (date)
AS RANGE RIGHT FOR VALUES
('2019-01-01', '2020-01-01', '2021-01-01');
GO
Step 3: Create Partition Scheme
The partition scheme maps the logical partitions defined by the partition function to physical filegroups. In this step, we will create a partition scheme and map the partitions to the primary filegroup.
USE PartitionDemo;
GO
CREATE PARTITION SCHEME PS_YearlyOrder
AS PARTITION PF_YearlyOrder
ALL TO ([PRIMARY]);
GO
Step 4: Create Partitioned Table
Finally, we can create the partitioned table using the partition scheme we created. In this example, we will create a new table with a clustered columnstore index from the beginning. The partition scheme will be specified as the storage location for the table.
USE PartitionDemo;
GO
CREATE TABLE FactOrderDemo (
OrderDemoKey bigint IDENTITY(1,1) NOT NULL,
OrderDate date NOT NULL,
OrderQuantity int NOT NULL,
OrderType varchar(20) NOT NULL,
INDEX CIX_FactOrderDemo_OrderDate CLUSTERED COLUMNSTORE
) ON PS_YearlyOrder(OrderDate);
GO
Note that the OrderDate column has a data type of date, which matches the boundary values specified in the partition function. The partition scheme name is specified in the ON clause of the CREATE TABLE statement.
Inserting Data into the Partitioned Table
Once the partitioned table is created, we can insert data into it. In this example, we will insert some sample data for demonstration purposes.
INSERT INTO FactOrderDemo (OrderDate, OrderQuantity, OrderType)
VALUES
('2018-10-18', 108, 'Domestic'),
('2019-01-01', 204, 'Domestic'),
('2019-05-24', 198, 'International'),
('2019-10-11', 87, 'Domestic'),
('2020-01-06', 269, 'Domestic'),
('2020-09-24', 309, 'International'),
('2020-11-26', 136, 'International'),
('2021-01-09', 212, 'Domestic'),
('2021-05-24', 98, 'International');
GO
Querying the Partitioned Table
Once the data is inserted into the partitioned table, we can query it as usual. However, we can also specify which partition to select the data from using the $PARTITION system function in the WHERE clause. This allows for more efficient querying of specific partitions.
SELECT * FROM FactOrderDemo
WHERE $PARTITION.PF_YearlyOrder(OrderDate) = 1;
GO
In the above example, we are selecting all rows from the first partition. We can also select data from other partitions by changing the partition number in the WHERE clause.
Partition Merge and Split
Merging and splitting partitions are common operations for managing SQL Server partitions. However, when dealing with a partitioned table that has a columnstore index, there are some differences to be aware of.
Unlike regular rowstore partitioned tables, SQL Server does not allow you to merge or split non-empty columnstore partitions. To perform partition merge or split on a columnstore index table, you need to first get rid of the data in the involved partitions. Here is a simple and efficient workaround to perform partition merge and split in a columnstore index table:
- Create a staging table with the same structure as the main partitioned table.
- Switch the partition(s) from the main table to the staging table.
- Perform the partition merge or split in the main table, as the partition(s) in the main table will be empty after the previous step.
- Switch the partition(s) back from the staging table to the main table if necessary, or archive/truncate old data in the staging table.
By following these steps, you can effectively perform partition merge and split operations on a columnstore index table.
Partitioning a table with a columnstore index in SQL Server can provide significant benefits in terms of manageability and query performance. By understanding the steps involved and the nuances of working with columnstore indexes, you can effectively implement and manage partitioned tables in your SQL Server databases.