Data compression is a crucial aspect of managing databases in SQL Server. It not only helps reduce storage size but also improves performance for existing data. While data compression was initially introduced as an enterprise feature in SQL Server 2008, it is now available in the standard edition as well, starting from SQL Server 2016 SP1 and above. In SQL Server 2017 and Azure SQL Database, row and page compression are supported for rowstore tables and indexes, while columnstore and columnstore archival compression are supported for columnstore tables and indexes.
In previous versions of SQL Server, the stored procedure sp_estimate_data_compression_savings was used to analyze objects and estimate their size after compression. However, SQL Server 2019 introduces enhancements to this procedure, specifically for columnstore indexes. In this article, we will explore the benefits of these enhancements.
The syntax for sp_estimate_data_compression_savings is as follows:
sp_estimate_data_compression_savings [ @schema_name = ] 'schema_name', [ @object_name = ] 'object_name', [ @index_id = ] index_id, [ @partition_number = ] partition_number, [ @data_compression = ] 'data_compression'
Let’s take a closer look at the different data compression options available:
- NONE: No data compression is enabled.
- ROW: Data compression is applied at the row level.
- PAGE: Data compression is applied at the page level.
- COLUMNSTORE: Data compression is applied to columnstore indexes.
- COLUMNSTORE_ARCHIVE: Data compression is applied to columnstore indexes for archival purposes.
When creating a columnstore index, you can specify the data compression method to apply. There are two types of data compression available for columnstore indexes:
- COLUMNSTORE: This is the default compression option and compresses data using columnstore compression.
- COLUMNSTORE_ARCHIVE: This option further compresses data that is used infrequently. However, it requires additional system resources in terms of CPU and memory.
Prior to SQL Server 2019, sp_estimate_data_compression_savings worked differently for columnstore and columnstore_archive data compression options. It created a new columnstore index with the specified compression state and compared it with the source object to calculate the estimated size after compression. However, in SQL Server 2019, the procedure compares the source object with an equivalent columnstore object.
Here is a mapping between the source and reference objects for different compression states:
| Source Object | Reference Object |
|---|---|
| Heap or Clustered index or clustered columnstore index | Clustered columnstore index |
| Non-clustered index or non-clustered columnstore index | Non-clustered columnstore index |
It’s important to understand the output of the sp_estimate_data_compression_savings procedure. The following columns are particularly relevant:
- object_name: Represents the name of the table or index.
- size_with_current_compression_setting: Shows the current size of the table or index.
- size_with_requested_compression_setting: Displays the estimated size of the table or index with the specified compression.
- sample_size_with_current_compression_setting: Indicates the size of the current sample compression.
- sample_size_with_requested_compression_setting (KB): Represents the size of the sample created using the specified compression option.
Let’s now perform a demonstration using the sp_estimate_data_compression_savings procedure on the StockItemTransactions table in the WideWorldImporters database in SQL Server 2019.
First, verify the compatibility level of the database. If it is not set to SQL Server 2019 (150), you can change it using the following query:
USE [master] GO ALTER DATABASE [WideWorldImporters] SET COMPATIBILITY_LEVEL = 150 GO
Next, examine the existing indexes on the StockItemTransactions table using the following query:
SELECT object_name(object_id) AS object_name, type_desc, * FROM sys.indexes WHERE object_id = 638625318 --object id of StockItemTransactions table
Now, let’s explore all the compression options using the sp_estimate_data_compression_savings procedure:
Using ‘NONE’ parameter in data_compression:
USE WideWorldImporters; GO EXEC sp_estimate_data_compression_savings 'Warehouse', 'StockItemTransactions', NULL, NULL, 'NONE'; GO
Using ‘ROW’ parameter in data_compression:
USE WideWorldImporters; GO EXEC sp_estimate_data_compression_savings 'Warehouse', 'StockItemTransactions', NULL, NULL, 'ROW'; GO
Using ‘PAGE’ parameter in data_compression:
USE WideWorldImporters; GO EXEC sp_estimate_data_compression_savings 'Warehouse', 'StockItemTransactions', NULL, NULL, 'PAGE'; GO
Using ‘COLUMNSTORE’ parameter in data_compression:
USE WideWorldImporters; GO EXEC sp_estimate_data_compression_savings 'Warehouse', 'StockItemTransactions', NULL, NULL, 'COLUMNSTORE'; GO
Using ‘COLUMNSTORE_ARCHIVE’ parameter in data_compression:
USE WideWorldImporters; GO EXEC sp_estimate_data_compression_savings 'Warehouse', 'StockItemTransactions', NULL, NULL, 'COLUMNSTORE_ARCHIVE'; GO
By analyzing the results, you can observe the significant data compression achieved with columnstore indexes. This is particularly beneficial for large databases, such as data warehouses, as it leads to substantial storage and cost savings. However, it’s important to note that columnstore indexes with archive compression may perform slower and require higher CPU and memory resources. Therefore, it is recommended to use this compression method only for old data with infrequent usage.
In conclusion, the enhancements to the sp_estimate_data_compression_savings procedure in SQL Server 2019 provide valuable insights into estimating data savings for columnstore indexes. You can explore these enhancements in your environment to gain a better understanding of the benefits they offer. Stay tuned for future articles where we will cover more on columnstore index enhancements in SQL Server 2019.
Table of contents:
- Columnstore Index Enhancements – Index stats update in clone databases
- Columnstore Index Enhancements – Data compression, estimates, and savings
- Columnstore Index Enhancements – Online and offline (re)builds