Understanding SQL Server Filter Statistics

Continuous learning is essential to stay up-to-date with SQL Server product features. During a recent demo on SQL Server Statistics, I discovered a powerful feature called Filter Statistics, introduced in SQL Server 2008. This feature addresses situations where the optimizer struggles to estimate query results accurately due to a lack of understanding of data correlation.

To demonstrate the power of Filter Statistics, I created a small demo using two tables: Employee and Sales. Here is the script:

USE MASTER
GO

IF DB_ID('DropMeAfterDemo') IS NOT NULL
BEGIN
	ALTER DATABASE DropMeAfterDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE
	DROP DATABASE DropMeAfterDemo
END
GO

CREATE DATABASE DropMeAfterDemo
GO

SET NOCOUNT ON
GO

USE DropMeAfterDemo
GO

IF OBJECT_ID('Employee') IS NOT NULL
	DROP TABLE Employee
GO

IF OBJECT_ID('Sales') IS NOT NULL
	DROP TABLE Sales
GO

CREATE TABLE Employee (
	EmpID INT,
	name NVARCHAR(100)
)
GO

CREATE TABLE Sales (
	EmpID INT,
	SalesAmount INT,
	DATETIME DATETIME
)
GO

CREATE CLUSTERED INDEX cidx_EmpID ON Employee (EmpID)
GO

CREATE INDEX ix_Employee_name ON Employee (name)
GO

CREATE STATISTICS ix_Employee_EmpID_name ON Employee (EmpID, name)
GO

CREATE CLUSTERED INDEX ix_Sales_id_amount ON Sales (EmpID, SalesAmount)
GO

-- Inserting data with skewness
INSERT Employee VALUES (1, 'Pinal')
INSERT Employee VALUES (2, 'Nupur')
GO

SET NOCOUNT ON

-- Pinal did poorly with just one sale
INSERT Sales VALUES (1, 100, GETDATE())

-- Nupur did exceptionally well with 5000 sales
DECLARE @loop INT
SET @loop = 1

WHILE @loop <= 5000
BEGIN
	INSERT Sales VALUES (2, @loop, GETDATE() - RAND() * 1000)
	SET @loop = @loop + 1
END
GO

-- Updating statistics with full scan
UPDATE STATISTICS Employee WITH FULLSCAN
GO

UPDATE STATISTICS Sales WITH FULLSCAN
GO

-- Running the query and examining the execution plan
-- Enable the actual execution plan in SQL Server Management Studio
DBCC FREEPROCCACHE
GO

SELECT SalesAmount
FROM Employee, Sales
WHERE Employee.EmpID = Sales.EmpID
	AND name = 'Pinal'
GO

DBCC FREEPROCCACHE
GO

SELECT SalesAmount
FROM Employee, Sales
WHERE Employee.EmpID = Sales.EmpID
	AND name = 'Nupur'
GO

-- Creating filtered statistics based on the WHERE clause
CREATE STATISTICS Employee_stats_EmpID ON Employee (EmpID) WHERE name = 'Pinal'
GO

CREATE STATISTICS Employee_stats_EmpID2 ON Employee (EmpID) WHERE name = 'Nupur'
GO

-- Running the same queries with filtered statistics
SELECT SalesAmount
FROM Employee, Sales
WHERE Employee.EmpID = Sales.EmpID
	AND name = 'Pinal'
GO

SELECT SalesAmount
FROM Employee, Sales
WHERE Employee.EmpID = Sales.EmpID
	AND name = 'Nupur'
GO

-- Viewing filtered statistics
SELECT *
FROM sys.stats
WHERE filter_definition IS NOT NULL

In the demo, I created two tables: Employee and Sales. These tables represent a company with two employees who have different sales amounts, resulting in skewed data. To ensure accurate statistics, I updated them with a full scan.

Next, I ran a query to retrieve the sales amount for a specific employee. However, the estimated number of rows in the execution plan was incorrect because SQL Server couldn’t determine the matching rows until it found the Employee ID from the Employee table.

To address this issue, I created filtered statistics based on the WHERE clause in the query. By creating statistics specific to each employee, the optimizer could accurately estimate the number of rows and generate a more efficient query plan.

Running the same queries with the filtered statistics resulted in accurate estimations and different query plans for each employee.

The main usage of Filter Statistics is in situations where data is skewed, such as when certain values occur more frequently than others. By creating filtered statistics, SQL Server can make better decisions during query optimization.

Have you ever encountered a situation where Filter Statistics could have improved query performance in your environment? Let me know in the comments!

Click to rate this post!

[Total: 0 Average: 0]

Comprehensive 360 Degree Assessment

Data Replication

Performance Optimization

Data Security

Database Migration

Expert Consultation

Cloud Migration Made Easy

Considering a move to the cloud? Axial SQL brings you proven migration strategies to streamline your transition. Our expert team ensures a smooth, efficient shift, keeping your data safe and accessible. Start your journey to the cloud with confidence!

SQL Performance Optimization

Is your SQL running slower than expected? Don't let sluggish performance hinder your business. Our optimization experts at Axial SQL specialize in tuning your databases for peak performance. Speed up your SQL and supercharge your data processing today!

Database Stability Solutions

Tired of frequent database outages? Discover stability with Axial SQL! Our comprehensive analysis identifies and resolves your database vulnerabilities. Enhance reliability, reduce downtime, and keep your operations running smoothly with our expert guidance.

Expert Database Team Evaluation

Questioning your database team's efficiency? Let Axial SQL provide an expert, unbiased analysis. We assess your team's strategies and workflows, offering insights and improvements to boost productivity. Elevate your database management to new heights!

Data Security Assurance

Concerned about your database security? Axial SQL is here to fortify your data defenses. Our specialized security assessments identify potential risks and implement robust protections. Keep your sensitive data secure and your peace of mind intact with our expert services.

Published on

Understanding SQL Server Filter Statistics

Let's work together