Recently, during one of my webcasts on statistics and performance tuning in SQL Server, I received an interesting question from an attendee. They asked if there was a way to find all the outdated statistics based on the dates from the server. In this blog post, I will explain the concept of outdated statistics and how to identify them.
What are Outdated Statistics?
The term “outdated statistics” is subjective and often misunderstood. In SQL Server, there is no specific definition for outdated statistics. Instead, statistics can become less relevant over time. The relevance of statistics is determined by the number of updates that have occurred since the last statistics update.
It’s important to note that the age of statistics is not measured in terms of days, hours, or time. Instead, it is evaluated based on the percentage of updates relative to the total number of rows in a table. For example, a table with 10,000 rows and 100 updated rows is not the same as a table with 100,000 rows and 100 updated rows. The evaluation is done based on percentages.
Updating Statistics
SQL Server has a built-in logic for updating statistics. If a table has more than 500 rows, the statistics are automatically updated when 20% of the table rows (500+ 20% of the table rows) are updated. This logic is simplified and does not cover every scenario.
If you have left the auto update or auto create statistics option enabled, you don’t need to worry about outdated statistics. SQL Server will handle the task automatically. However, if you have disabled these options, you need to manually update the statistics when it reaches the default threshold of 500+ 20% of the table rows.
Some may argue that they want the statistics to update more frequently. In such cases, you can manually update the statistics or set certain trace flags. However, it’s important to note that updating statistics more frequently does not guarantee different or improved execution plans. SQL Server always strives to build the most effective execution plan.
Identifying Outdated Statistics
To identify outdated statistics, you can use the following script:
SELECT DISTINCT
OBJECT_NAME(s.[object_id]) AS TableName,
c.name AS ColumnName,
s.name AS StatName,
STATS_DATE(s.[object_id], s.stats_id) AS LastUpdated,
DATEDIFF(d, STATS_DATE(s.[object_id], s.stats_id), GETDATE()) AS DaysOld,
dsp.modification_counter,
s.auto_created,
s.user_created,
s.no_recompute,
s.[object_id],
s.stats_id,
sc.stats_column_id,
sc.column_id
FROM sys.stats s
JOIN sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
JOIN sys.partitions par ON par.[object_id] = s.[object_id]
JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
CROSS APPLY sys.dm_db_stats_properties(sc.[object_id], s.stats_id) AS dsp
WHERE OBJECTPROPERTY(s.OBJECT_ID, 'IsUserTable') = 1
AND (s.auto_created = 1 OR s.user_created = 1)
ORDER BY DaysOld;
This script provides details about the age of the index and the number of modifications that have occurred since the last statistics update. Based on this information, you can make your own decision on whether to consider the statistics as outdated and update them accordingly.
To update statistics for the entire database, you can use the following script:
EXEC sp_updatestats;
By running this script, you can ensure that all statistics in the database are updated.
Remember, understanding and managing statistics is crucial for optimizing query performance in SQL Server. Regularly monitoring and updating statistics can greatly improve the execution plans chosen by the query optimizer.
I hope this blog post has provided you with a better understanding of outdated statistics in SQL Server. Feel free to share your thoughts and opinions in the comments section below.