SQL Server 2012 introduces a new analytical function called PERCENTILE_DISC(). This function allows you to compute a specific percentile for sorted values in a rowset or within distinct partitions of a rowset.
Let’s break down the definition of this function. For a given percentile value P, PERCENTILE_DISC() sorts the values of the expression in the ORDER BY clause and returns the value with the smallest CUME_DIST value (with respect to the same sort specification) that is greater than or equal to P.
Simply put, the PERCENTILE_DISC() function helps you find the value of a column that is equal to or greater than a specified percentile value.
Let’s take a look at an example query:
USE AdventureWorks
SELECT SalesOrderID, OrderQty, ProductID,
CUME_DIST() OVER (PARTITION BY SalesOrderID ORDER BY ProductID) AS CDist,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY ProductID) OVER (PARTITION BY SalesOrderID) AS PercentileDisc
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY SalesOrderID DESC
In the above query, we are using the PERCENTILE_DISC(0.5) function, which is similar to finding the median but not exactly. The function takes a percentile as a parameter and returns the value that is equal to or greater than the specified percentile value.
In this example, we are passing 0.5 as the percentile value. The function goes through the result set and identifies which rows have values that are equal to or greater than 0.5. In the first example, it found two rows with a value of 0.5, and the value of the ProductID column in those rows is the answer of the PERCENTILE_DISC() function.
Let’s explore another example where we pass 0.6 as the percentile:
USE AdventureWorks
SELECT SalesOrderID, OrderQty, ProductID,
CUME_DIST() OVER (PARTITION BY SalesOrderID ORDER BY ProductID) AS CDist,
PERCENTILE_DISC(0.6) WITHIN GROUP (ORDER BY ProductID) OVER (PARTITION BY SalesOrderID) AS PercentileDisc
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY SalesOrderID DESC
In this case, the result of the PERCENTILE_DISC(0.6) function is the ProductID for which the CUME_DIST() value is greater than 0.6. For example, for SalesOrderID 43670, the row with a CUME_DIST() value of 0.75 is the qualified row, resulting in the answer 773 for the ProductID.
Understanding the PERCENTILE_DISC() function can be useful when you need to find specific percentiles within your data. It allows you to easily identify values that fall within a certain range.
Hopefully, this explanation has provided you with a clear understanding of the PERCENTILE_DISC() function in SQL Server.