Published on

August 1, 2011

Partitioning with the Over Clause (Part 2)

Yesterday we learned how the over clause can be used to compare your number against the overall aggregated number for an entire result set. Sometimes you might want your number to be compared against its category and not all records from a table.

For example, let’s say you are a wrestler and you want to compare your ranking against wrestlers in your city rather than the entire country or state. By partitioning the sport down into the city instead of the entire country or state, your ranking is much better. Similarly, businesses might have other reasons for partitioning a group.

Let’s use an example of schools where they see a drop in attendance during the summer time. Comparing an attendance trend from spring to summer would not show an uptrend even if your school is growing year after year. It makes more sense to compare this summer to last summer to find a trend, rather than comparing the spring and summer where you always see a drop.

Whatever your reason, you can get a total by a level of your choosing with the OVER() clause. Let’s try an example using the aggregate function COUNT() with the OVER() clause.

We’ll start with a simple query on the Employee table. Here we see that Alex Adams is just 1 employee out of 13. In fact, each employee is one of 13 in this example.

Now let’s look at Alex Adams. He works in Location 1. How many of JProCo’s employees are in Location 1 (Seattle)? And how many of the total employees work in Boston?

Alongside the existing table records, we want to add a column showing the count of employees at each location. We must include an argument inside the parentheses. PARTITION BY divides the result set into partitions. In this case, we’ve added a PARTITION BY LocationID. Similar to the behavior of GROUP BY, the command to PARTITION BY LocationID takes the LocationID for each employee and counts the number of records in that group which shares the same LocationID.

Let’s run the query and check the results. Our result set shows the count of employees at each location. The Seattle employees show a count of 7, the Boston employees have a count of 3, and the Spokane records have a count of 2. John Marshbank shows a count of 1, since he’s the only one with a null LocationID.

Since Seattle has 7 of the total 13 employees, we can estimate that just over 50% of JProCo employees work in Seattle (LocationID 1).

Now let’s calculate the percentage of products in each product category using the T-SQL statement. We have a table named CurrentProducts with a column named Category.

Which query should you use?

SELECT DISTINCT Category, COUNT(*) OVER (PARTITION BY Category)*100/ COUNT(*) OVER() as PctCategory FROM CurrentProducts
SELECT DISTINCT Category, COUNT(*) OVER (PARTITION BY Category)*100.0/ COUNT(*) OVER() as PctCategory FROM CurrentProducts
SELECT DISTINCT Category, COUNT(*) OVER()*100.0/ COUNT(*) OVER(PARTITION BY Category) as PctCategory FROM CurrentProducts

Please leave your answer in the comment section below with the correct option, explanation, and your country of residence. Every day, one winner will be announced from the United States and one winner will be announced from India.

Good luck!

Click to rate this post!
[Total: 0 Average: 0]

Let's work together

Send us a message or book free introductory meeting with us using button below.