Yesterday, we discussed two analytical functions, FIRST_VALUE and LAST_VALUE, in SQL Server. After publishing the blog post, I received an interesting question from a reader. They pointed out that in the first example, the FIRST_VALUE remains the same, but the LAST_VALUE changes for each line. They suggested that the LAST_VALUE should be the highest value in the window or set of results. This question highlights a common misconception, but there is no bug in the code. Let me explain further.
Before we proceed, I recommend reading yesterday’s blog post as this question is related to it. Now, let’s dive into a fun query:
USE AdventureWorks
GO
SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty,
FIRST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID) AS FstValue,
LAST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID) AS LstValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
GO
The above query will give us the following result:
SalesOrderID | SalesOrderDetailID | OrderQty | FstValue | LstValue |
---|---|---|---|---|
43663 | 1 | 1 | 1 | 1 |
43667 | 2 | 2 | 1 | 2 |
43669 | 3 | 3 | 1 | 3 |
43670 | 4 | 4 | 1 | 4 |
As per the reader’s question, the value of the LAST_VALUE function should always be 114 and not increase as the rows increase. Let me rewrite the code with some additional T-SQL syntax. Please pay special attention to the ROWS clause that I have added:
USE AdventureWorks
GO
SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty,
FIRST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS FstValue,
LAST_VALUE(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS LstValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
GO
Now, let’s check the result of the updated query. The result is the same as before because the default ROWS selection in the OVER clause is always UNBOUNDED PRECEDING AND CURRENT ROW. If you want the maximum value of the window using the OVER clause, you need to change the syntax to UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING for the ROWS clause.
Let’s run the following query and pay special attention to the ROWS clause again:
USE AdventureWorks
GO
SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty,
FIRST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS FstValue,
LAST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LstValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
GO
The resultset of the above query is what the questioner was asking for:
SalesOrderID | SalesOrderDetailID | OrderQty | FstValue | LstValue |
---|---|---|---|---|
43663 | 1 | 1 | 1 | 4 |
43667 | 2 | 2 | 2 | 4 |
43669 | 3 | 3 | 3 | 4 |
43670 | 4 | 4 | 4 | 4 |
So, in simple words, there is no bug, but there is additional syntax needed to get your desired answer. The same logic also applies to the PARTITION BY clause when used. Here is a quick example of how we can further partition the query by SalesOrderDetailID using these new functions:
USE AdventureWorks
GO
SELECT s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty,
FIRST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS FstValue,
LAST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID ORDER BY SalesOrderDetailID ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LstValue
FROM Sales.SalesOrderDetail s
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY s.SalesOrderID, s.SalesOrderDetailID, s.OrderQty
GO
The above query will give us a windowed resultset on SalesOrderDetailID and provide us with the FIRST and LAST values for the windowed resultset. There is much more to discuss about these two functions, and we have only scratched the surface. In future posts, I will explore them further in depth.