Published on

March 22, 2015

Understanding SQL Server Filter Statistics

Continuous learning is essential to stay up-to-date with SQL Server product features. During a recent demo on SQL Server Statistics, I discovered a powerful feature called Filter Statistics, introduced in SQL Server 2008. This feature addresses situations where the optimizer struggles to estimate query results accurately due to a lack of understanding of data correlation.

To demonstrate the power of Filter Statistics, I created a small demo using two tables: Employee and Sales. Here is the script:

USE MASTER
GO

IF DB_ID('DropMeAfterDemo') IS NOT NULL
BEGIN
	ALTER DATABASE DropMeAfterDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE
	DROP DATABASE DropMeAfterDemo
END
GO

CREATE DATABASE DropMeAfterDemo
GO

SET NOCOUNT ON
GO

USE DropMeAfterDemo
GO

IF OBJECT_ID('Employee') IS NOT NULL
	DROP TABLE Employee
GO

IF OBJECT_ID('Sales') IS NOT NULL
	DROP TABLE Sales
GO

CREATE TABLE Employee (
	EmpID INT,
	name NVARCHAR(100)
)
GO

CREATE TABLE Sales (
	EmpID INT,
	SalesAmount INT,
	DATETIME DATETIME
)
GO

CREATE CLUSTERED INDEX cidx_EmpID ON Employee (EmpID)
GO

CREATE INDEX ix_Employee_name ON Employee (name)
GO

CREATE STATISTICS ix_Employee_EmpID_name ON Employee (EmpID, name)
GO

CREATE CLUSTERED INDEX ix_Sales_id_amount ON Sales (EmpID, SalesAmount)
GO

-- Inserting data with skewness
INSERT Employee VALUES (1, 'Pinal')
INSERT Employee VALUES (2, 'Nupur')
GO

SET NOCOUNT ON

-- Pinal did poorly with just one sale
INSERT Sales VALUES (1, 100, GETDATE())

-- Nupur did exceptionally well with 5000 sales
DECLARE @loop INT
SET @loop = 1

WHILE @loop <= 5000
BEGIN
	INSERT Sales VALUES (2, @loop, GETDATE() - RAND() * 1000)
	SET @loop = @loop + 1
END
GO

-- Updating statistics with full scan
UPDATE STATISTICS Employee WITH FULLSCAN
GO

UPDATE STATISTICS Sales WITH FULLSCAN
GO

-- Running the query and examining the execution plan
-- Enable the actual execution plan in SQL Server Management Studio
DBCC FREEPROCCACHE
GO

SELECT SalesAmount
FROM Employee, Sales
WHERE Employee.EmpID = Sales.EmpID
	AND name = 'Pinal'
GO

DBCC FREEPROCCACHE
GO

SELECT SalesAmount
FROM Employee, Sales
WHERE Employee.EmpID = Sales.EmpID
	AND name = 'Nupur'
GO

-- Creating filtered statistics based on the WHERE clause
CREATE STATISTICS Employee_stats_EmpID ON Employee (EmpID) WHERE name = 'Pinal'
GO

CREATE STATISTICS Employee_stats_EmpID2 ON Employee (EmpID) WHERE name = 'Nupur'
GO

-- Running the same queries with filtered statistics
SELECT SalesAmount
FROM Employee, Sales
WHERE Employee.EmpID = Sales.EmpID
	AND name = 'Pinal'
GO

SELECT SalesAmount
FROM Employee, Sales
WHERE Employee.EmpID = Sales.EmpID
	AND name = 'Nupur'
GO

-- Viewing filtered statistics
SELECT *
FROM sys.stats
WHERE filter_definition IS NOT NULL

In the demo, I created two tables: Employee and Sales. These tables represent a company with two employees who have different sales amounts, resulting in skewed data. To ensure accurate statistics, I updated them with a full scan.

Next, I ran a query to retrieve the sales amount for a specific employee. However, the estimated number of rows in the execution plan was incorrect because SQL Server couldn’t determine the matching rows until it found the Employee ID from the Employee table.

To address this issue, I created filtered statistics based on the WHERE clause in the query. By creating statistics specific to each employee, the optimizer could accurately estimate the number of rows and generate a more efficient query plan.

Running the same queries with the filtered statistics resulted in accurate estimations and different query plans for each employee.

The main usage of Filter Statistics is in situations where data is skewed, such as when certain values occur more frequently than others. By creating filtered statistics, SQL Server can make better decisions during query optimization.

Have you ever encountered a situation where Filter Statistics could have improved query performance in your environment? Let me know in the comments!

Click to rate this post!
[Total: 0 Average: 0]

Let's work together

Send us a message or book free introductory meeting with us using button below.