Why and How to Update SQL Server Statistics
Understanding the inner workings of any relational database management system (RDBMS) involves comprehending how it makes decisions about the best way to execute queries. A pivotal player in this decision-making process in SQL Server is its use of statistics. On this note, let’s delve into the crucial subject of why and how to update SQL Server statistics to maintain the efficiency and speed of databases.
What are SQL Server Statistics?
SQL Server statistics are a collection of statistical information about the distribution of values in one or more columns of your data tables or indexed views. This information is used by the SQL Server Query Optimizer to estimate the number of rows (also known as cardinality) that will be returned by parts of a query, dictating the most efficient method for query execution.
The accuracy of statistics is paramount because they directly affect the execution plan chosen. An execution plan with poor choices could result in unnecessary IO, increased CPU usage, and ultimately decreased performance. Regularly updating statistics ensures that SQL Server has current information about the distribution of data within your tables and can make the best decisions for query execution.
Why Update SQL Server Statistics?
Up-to-Date Information: As your database evolves with inserts, updates, and deletes, data distribution changes. The outdated statistics may lead the Query Optimizer astray, relying on old data that doesn’t accurately reflect the current state of affairs.
Performance Optimization: Keeping statistics up-to-date optimizes the performance of the query execution plan, ensuring faster query processing and minimizing resource use.
Automatic and Manual Updates: SQL Server can update statistics automatically, depending on your database settings, but even with this, there are situations in which you should manually update statistics. For large databases, manual and selective statistics updates can offer optimized control.
How SQL Server Updates Statistics Automatically
Auto Update Statistics: SQL Server has an ‘AUTO_UPDATE_STATISTICS’ option that triggers an automatic update of the statistics whenever it detects changes in data distribution that reach a certain threshold. The ANSI standard for this threshold before SQL Server 2014 was when approximately 20% of the data has been modified. However, this changed with SQL Server 2014, which introduced a more sophisticated algorithm that examines the total number of rows modified.
Avalanche Pages for Big Tables: In SQL Server 2016 (13.x) SP1 CU2 and SQL Server 2017 (14.x) CU3, Microsoft introduced a change for the auto-update as the row modification threshold increases for tables as they grow in size, which was termed as ‘avalanche pages’. This helps reduce the frequency of statistical updates on large tables while improving optimizer stability.
When You Should Manually Update Statistics
There are scenarios where relying solely on the automatic update can be inadequate. For instance:
- If rapid data modifications take place in a short period, auto-updates may not trigger quickly enough to keep pace with the changes, especially for large tables.
- For highly transactional databases, the statistical changes might not reach the threshold, leading to delays in auto-updates and potential performance issues.
- Before running large batch processes or after bulk data loading operations where significant data distribution changes are expected.
- When upgrading the database engine or applying significant data modeling changes that could affect the data distribution patterns, statistics may need to be manually refreshed.
How to Manually Update Statistics
Manually updating SQL Server statistics is a straightforward maintenance task that can be performed using SQL commands or through the SQL Server Management Studio (SSMS). Here’s how:
- Using Transact-SQL (T-SQL): With a simple t-SQL command, you can update statistics for a specific table, index, or all tables within the database. For example:
UPDATE STATISTICS sales.salesOrderDetail
Updates all statistics for the salesOrderDetail table within the sales schema.
- Through SQL Server Management Studio (SSMS): Within SSMS, navigate to the database, expand the ‘Tables’ folder, right-click a table, select ‘Properties’, go to the ‘Statistics’ page, and then choose the specific statistics to update.
- Selective Update: You have the option to selectively update one statistic at a time, a collection of statistics on specific tables or indexes, or the entire database’s statistics if necessary. This tailored approach can be incredibly beneficial in large databases where wholesale statistical updates can be very resource-intensive.
- Scheduled Updates: To take the guesswork out of when to update statistics, you can schedule them during off-peak hours to avoid performance impact on regular operations. This is usually done through SQL Server Agent jobs.
- Using Maintenance Plans: SQL Server also provides the option to create maintenance plans that automate the process of updating statistics as per a defined schedule.
Updating statistics with a full scan versus a sampled scan: When updating statistics, SQL Server can perform either a full scan of the data or a sampled scan.
A full scan reads all the data in the table or index and creates the most accurate statistics but can be resource-intensive. A sampled scan reads a subset of the data, which is less accurate but faster and less resource-hungry. Assess your performance requirements and resource availability to select the most appropriate scan type for your scenario.
Monitoring the Health of SQL Server Statistics
After understanding the updating, it’s equally important to monitor the health of your statistics. You can use Dynamic Management Views (DMVs) such as
sys.dm_db_stats_properties
to determine when statistics were last updated or how many rows were sampled. Regular monitoring can prevent performance deterioration and keep your queries running smoothly.
In conclusion, maintaining SQL Server statistics is a fundamental aspect of database administration. By understanding when and how to update and monitor statistics, you can ensure that your SQL Server instance keeps humming along efficiently—supporting your data-driven applications with the performance they require.
Remember to consider your specific environment and use cases when planning the maintenance of your SQL Server statistics. Regular maintenance, combined with careful monitoring, will go a long way toward ensuring optimal query performance and database health.