Published on

November 1, 2011

Exploring SQL Server Analytical Functions: LEAD() and LAG()

SQL Server 2012 introduces new analytical functions called LEAD() and LAG(). These functions allow us to access data from subsequent and previous rows in the same result set without the need for self-joins. In this blog post, we will explore these functions and understand how they can be used.

Let’s start with a small example using the AdventureWorks sample database. We will use the following query:

USE AdventureWorks
GO

SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty,
       LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID) AS LeadValue,
       LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID) AS LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
GO

The above query will give us a result set that shows the values from the next row (using LEAD) and the previous row (using LAG). This functionality would typically require a self-join, but with these functions, it becomes much simpler.

We can also specify an offset to access rows further away from the current row. For example:

USE AdventureWorks
GO

SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty,
       LEAD(SalesOrderDetailID, 2) OVER (ORDER BY SalesOrderDetailID) AS LeadValue,
       LAG(SalesOrderDetailID, 2) OVER (ORDER BY SalesOrderDetailID) AS LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
GO

In the above query, we have specified an offset of 2. This means that the LEAD and LAG functions will access the row that is two rows away from the current row. The first two rows in the LEAD function and the last two rows in the LAG function will return NULL values because there are not enough rows available.

If we want to replace these NULL values with a default value, we can pass a third parameter to the LEAD and LAG functions. For example:

USE AdventureWorks
GO

SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty,
       LEAD(SalesOrderDetailID, 2, 0) OVER (ORDER BY SalesOrderDetailID) AS LeadValue,
       LAG(SalesOrderDetailID, 2, 0) OVER (ORDER BY SalesOrderDetailID) AS LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
GO

In the above query, we have replaced the NULL values with the default value of 0. This can be useful when dealing with missing or incomplete data.

Just like any other analytic function, we can also partition the LEAD and LAG functions. This allows us to perform calculations within specific groups of data. For example:

USE AdventureWorks
GO

SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty,
       LEAD(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID) AS LeadValue,
       LAG(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID) AS LagValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
GO

In the above query, we have partitioned the data by the SalesOrderID column. This means that the LEAD and LAG functions will only consider the rows within each partition when calculating the result. This can be useful when we want to analyze data within specific groups.

In conclusion, the LEAD() and LAG() functions in SQL Server 2012 provide a convenient way to access data from subsequent and previous rows without the need for self-joins. These functions can be used to perform various calculations and analysis on your data. In future blog posts, we will explore more advanced concepts and compare these functions to self-joins.

Click to rate this post!
[Total: 0 Average: 0]

Let's work together

Send us a message or book free introductory meeting with us using button below.