When working with SQL Server, there may be times when you need to combine data from multiple tables or views into one comprehensive dataset. This can be done using the UNION and UNION ALL operators. While these operators serve a similar purpose, there are important differences between them that you should be aware of.
UNION Operator
The UNION operator allows you to join multiple datasets into one dataset and removes any duplicate rows. It performs a DISTINCT operation across all columns in the result set. Here are some rules to keep in mind when using the UNION operator:
- Each query must have the same number of columns.
- Each column must have compatible data types.
- Column names for the final result set are taken from the first query.
- ORDER BY and COMPUTE clauses can only be issued for the overall result set and not within each individual result set.
- GROUP BY and HAVING clauses can only be issued for each individual result set and not for the overall result set.
If you don’t have the exact same columns in all queries, you can use a default value or a NULL value to fill in the missing columns.
UNION ALL Operator
The UNION ALL operator also allows you to join multiple datasets into one dataset, but it does not remove any duplicate rows. This makes the process faster, but if you want to eliminate duplicate records, you should use the UNION operator instead.
Examples
Let’s take a look at a few simple examples to understand how these operators work and how they differ.
Example 1: UNION ALL
In this example, we are using the UNION ALL operator to combine data from the Employee table in the AdventureWorks database. The query is as follows:
SELECT * FROM HumanResources.Employee
UNION ALL
SELECT * FROM HumanResources.Employee
UNION ALL
SELECT * FROM HumanResources.Employee
When this query is run, the result set will contain 870 rows, as the data is simply concatenated one dataset on top of the other.
Example 2: UNION
In this example, we are using the UNION operator to combine data from the Employee table again. The query is as follows:
SELECT * FROM HumanResources.Employee
UNION
SELECT * FROM HumanResources.Employee
UNION
SELECT * FROM HumanResources.Employee
When this query is run, the result set will contain 290 rows. The UNION operator removes duplicate records and returns only the unique rows.
Conclusion
Understanding the differences between the UNION and UNION ALL operators in SQL Server is crucial when you need to combine data from multiple tables or views. The UNION operator removes duplicate rows, while the UNION ALL operator does not. By using these operators appropriately, you can create comprehensive datasets that meet your specific requirements.