When it comes to SQL Server, one concept that often causes confusion is statistics. Many people have misconceptions about how statistics work and when they are created. In this article, we will explore the topic of statistics in SQL Server and clarify some common misunderstandings.
One common misconception is that statistics only exist on a column if there is an index on that column or if statistics are explicitly created on it. However, the truth is that statistics can exist on a table even if there is no index present. If the auto-create and/or auto-update statistics feature is enabled for a SQL Server database, statistics will be automatically created on columns based on certain conditions.
Let’s take a look at an example to observe how statistics are automatically created:
USE AdventureWorks
GO
ALTER DATABASE AdventureWorks SET AUTO_CREATE_STATISTICS ON;
GO
-- Create Table
CREATE TABLE StatsTable (
ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100)
)
GO
-- Insert One Hundred Thousand Records
INSERT INTO StatsTable (ID, FirstName, LastName, City)
SELECT TOP 100000
ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name) % 2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name) % 10 = 1 THEN 'New York'
WHEN ROW_NUMBER() OVER (ORDER BY a.name) % 10 = 5 THEN 'San Marino'
WHEN ROW_NUMBER() OVER (ORDER BY a.name) % 10 = 3 THEN 'Los Angeles'
ELSE 'Houston' END
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
/* Now Check the statistics on the Table
As the table is just created there should not be any statistics on it
and will display "This object does not have any statistics or indexes."
*/
sp_helpstats 'StatsTable', 'ALL'
GO
In the example above, we first enable the auto-create statistics setting for the AdventureWorks database. Then, we create a table called StatsTable and insert 100,000 records into it. After checking the statistics on the table, we can see that no statistics have been created yet.
Next, we run a few queries on the table:
SELECT *
FROM StatsTable
WHERE ID = 110
GO
SELECT *
FROM StatsTable
WHERE City = 'Houston'
GO
After running these queries, we check the statistics on the table again:
sp_helpstats 'StatsTable', 'ALL'
GO
Now, we can see that two different statistics have been created on the respective columns used in the WHERE clauses of the queries.
Furthermore, if we run a query with multiple columns in the WHERE clause:
SELECT *
FROM StatsTable
WHERE ID = 110 AND City = 'Houston' AND FirstName = 'Bob'
GO
After running this query, we check the statistics on the table once again:
sp_helpstats 'StatsTable', 'ALL'
GO
As expected, a new statistics object is created for the column used in the WHERE clause if it was not created earlier.
From this example, it is clear that if the auto-update statistics setting is enabled in the database, SQL Server will automatically create the necessary statistics based on the columns where certain conditions apply.
Understanding statistics in SQL Server is crucial for optimizing query performance. By having accurate and up-to-date statistics, the query optimizer can make better decisions when creating query execution plans. It is important to keep the auto-create and auto-update statistics settings enabled for databases where performance is a priority.
Remember, statistics play a vital role in SQL Server’s query optimization process, and having a good understanding of how they work can greatly improve the performance of your database.
That’s all for this article. We hope you found it helpful in understanding SQL Server statistics. Stay tuned for more informative articles on SQL Server!