The SQL GROUP BY clause is a powerful tool that allows you to group data together and perform various calculations on the grouped data. In this article, we will explore different examples of how to use the GROUP BY clause in SQL Server.
Grouping Data and Getting Sum of Values
One common use of the GROUP BY clause is to calculate the sum of values in a column. For example, if we want to calculate the total freight cost for each order date, we can use the following query:
SELECT [OrderDate], SUM([Freight]) as Freight
FROM [Sales].[SalesOrderHeader]
GROUP BY [OrderDate]This query will return the order date and the sum of freight for each order date.
Grouping Data with Multiple Columns
When using the GROUP BY clause, it is important to include all non-aggregated columns in the GROUP BY statement. For example, if we want to calculate the sum of freight for each order date and ship date, we need to include both columns in the GROUP BY clause:
SELECT [OrderDate], [ShipDate], SUM([Freight]) as Freight
FROM [Sales].[SalesOrderHeader]
GROUP BY [OrderDate], [ShipDate]If we omit any non-aggregated column from the GROUP BY clause, we will get an error.
Using Multiple Aggregate Functions
You can also use multiple aggregate functions in a single query. For example, if we want to calculate the sum of freight and the average tax amount for each order date, we can use the following query:
SELECT [OrderDate], SUM([Freight]) as Freight, AVG(TaxAmt) as TaxAmt
FROM [Sales].[SalesOrderHeader]
GROUP BY [OrderDate]This query will return the order date, the sum of freight, and the average tax amount for each order date.
Filtering Grouped Data with HAVING
In addition to using the GROUP BY clause, we can also filter the grouped data using the HAVING clause. The HAVING clause allows us to specify conditions that must be met by the grouped data. For example, if we only want to include order dates with a total freight cost greater than 400, we can use the following query:
SELECT [OrderDate], SUM([Freight]) as Freight
FROM [Sales].[SalesOrderHeader]
GROUP BY [OrderDate]
HAVING SUM([Freight]) > 400This query will return the order date and the sum of freight for order dates with a total freight cost greater than 400.
Grouping Data with Expressions
You can also use expressions in the GROUP BY clause. This allows you to group data based on calculated values. For example, if we want to group data by the date part of the order date, we can use the following query:
SELECT FORMAT(OrderDate,'yyyy.mm.dd') as OrderDate, SUM([Freight]) as Freight
FROM [Sales].[SalesOrderHeader]
GROUP BY FORMAT(OrderDate,'yyyy.mm.dd')
HAVING SUM([Freight]) > 400This query will return the formatted order date and the sum of freight for each formatted order date.
Sorting Grouped Data with ORDER BY
To display the grouped data in a specific order, you can use the ORDER BY clause. This allows you to arrange the results in ascending or descending order. For example, if we want to sort the results by the sum of freight in descending order, we can use the following query:
SELECT FORMAT(OrderDate,'yyyy.mm.dd') as OrderDate, SUM([Freight]) as Freight
FROM [Sales].[SalesOrderHeader]
GROUP BY FORMAT(OrderDate,'yyyy.mm.dd')
HAVING SUM([Freight]) > 400
ORDER BY SUM([Freight]) DESCThis query will return the formatted order date and the sum of freight, sorted by the sum of freight in descending order.
Grouping Data with ROLLUP
The ROLLUP option of the GROUP BY clause allows you to create different combinations of columns and generate subtotals. For example, if we want to create different combinations of order date and status, including NULL values, we can use the following query:
SELECT OrderDate, [Status], SUM([Freight]) as Freight
FROM [Sales].[SalesOrderHeader]
GROUP BY ROLLUP (OrderDate, [Status])
HAVING SUM([Freight]) > 400This query will return different combinations of order date and status, along with the sum of freight for each combination.
Grouping Data with CUBE
The CUBE option of the GROUP BY clause allows you to create groups of all possible combinations of column values. For example, if we want to show the results of all combinations of order date and status, including total summary values, we can use the following query:
SELECT OrderDate, [Status], SUM([Freight]) as Freight
FROM [Sales].[SalesOrderHeader]
GROUP BY CUBE (OrderDate, [Status])
HAVING SUM([Freight]) > 400This query will return all possible combinations of order date and status, along with the sum of freight for each combination.
Grouping Data with GROUPING SETS
The GROUPING SETS option of the GROUP BY clause allows you to group data using results from both ROLLUP and CUBE. For example, if we want to group data by order date and status using both ROLLUP and CUBE, we can use the following query:
SELECT OrderDate, [Status], SUM([Freight]) as Freight
FROM [Sales].[SalesOrderHeader]
GROUP BY GROUPING SETS (ROLLUP(OrderDate,[Status]), CUBE(OrderDate,[Status]))
HAVING SUM([Freight]) > 400This query will return the grouped data using both ROLLUP and CUBE, along with the sum of freight for each group.
In conclusion, the SQL GROUP BY clause is a powerful tool that allows you to group data together and perform calculations on the grouped data. By understanding how to use the GROUP BY clause and its various options, you can effectively analyze and summarize data in your SQL Server database.