Have you ever encountered a situation where you have duplicate rows in your SQL Server table and you need to remove them? In this article, we will explore a quick and efficient method to delete duplicate rows using CTE (Common Table Expression) and ROW_NUMBER() feature in SQL Server.
Let’s start by creating a sample table with duplicate records:
CREATE TABLE DuplicateRecordTable (Col1 INT, Col2 INT)
INSERT INTO DuplicateRecordTable
SELECT 1, 1
UNION ALL
SELECT 1, 1 --duplicate
UNION ALL
SELECT 1, 1 --duplicate
UNION ALL
SELECT 1, 2
UNION ALL
SELECT 1, 2 --duplicate
UNION ALL
SELECT 1, 3
UNION ALL
SELECT 1, 4
Now, if we query the table, we can see that it contains 7 records, out of which 3 are duplicate records:
SELECT *
FROM DuplicateRecordTable
Next, we will use CTE to generate the same table with an additional column, which is the row number. This will help us identify the duplicate rows:
WITH CTE (Col1, Col2, DuplicateCount)
AS
(
SELECT Col1, Col2,
ROW_NUMBER() OVER(PARTITION BY Col1, Col2 ORDER BY Col1) AS DuplicateCount
FROM DuplicateRecordTable
)
DELETE
FROM CTE
WHERE DuplicateCount > 1
After executing the delete command, we will be left with only 4 records, effectively removing the duplicate rows:
SELECT *
FROM DuplicateRecordTable
This method is not only efficient but also works for SQL Server 2005 and later versions. It utilizes the power of CTE and ROW_NUMBER() to identify and delete duplicate rows.
So, the next time you encounter duplicate rows in your SQL Server table, give this method a try and see how it simplifies the process of removing duplicates.