De-Dupifying Record Sets in SQL Server 2005

Have you ever encountered a situation where you needed to remove duplicate rows from a record set in SQL Server? It can be a challenging task, especially when dealing with multiple duplicates or ragged numbers. In this article, we will explore a solution using the ROW_NUMBER() function and common table expressions in SQL Server 2005.

Let’s start by understanding the problem. Imagine you have a table called EMPLOYEE with columns EMPID, FNAME, LNAME, and REFDATE. This table contains duplicate rows, and you want to reduce them to a single row while keeping the earliest or latest duplicate record.

To demonstrate the solution, we will create an EMPLOYEE table in the TEMPDB database. Here is the SQL code to create and populate the table:

USE [TEMPDB]
GO

IF EXISTS (SELECT * FROM SYS.OBJECTS WHERE OBJECT_ID = OBJECT_ID(N'[DBO].[EMPLOYEE]') AND TYPE IN (N'U'))
    DROP TABLE [DBO].[EMPLOYEE]
GO

CREATE TABLE EMPLOYEE (
    EMPID INT,
    FNAME VARCHAR(50),
    LNAME VARCHAR(50),
    REFDATE DATETIME
)

INSERT INTO EMPLOYEE (EMPID, FNAME, LNAME, REFDATE)
VALUES (2021110, 'MICHAEL', 'POLAND', '2006-01-18 09:41:08.297')

-- Insert more rows...

Now that we have our EMPLOYEE table set up, let’s use the ROW_NUMBER() function to assign a unique row number to each row, ordered by EMPID and REFDATE. This will help us identify the duplicates. Here is the SQL code to achieve this:

SELECT ROW_NUMBER() OVER (ORDER BY EMPID ASC, REFDATE ASC) AS ROWID, *
FROM EMPLOYEE

The result will be a new column ROWID that represents the row number for each row. We can now use this information to de-dupify the record set. Here is the SQL code to accomplish that:

WITH [EMPLOYEE ORDERED BY ROWID] AS (
    SELECT ROW_NUMBER() OVER (ORDER BY EMPID ASC, REFDATE ASC) AS ROWID, *
    FROM EMPLOYEE
    WHERE 1 = 1
)
DELETE FROM [EMPLOYEE ORDERED BY ROWID]
WHERE ROWID NOT IN (
    SELECT MIN(ROWID)
    FROM [EMPLOYEE ORDERED BY ROWID]
    GROUP BY EMPID, FNAME, LNAME
)

SELECT ROW_NUMBER() OVER (ORDER BY EMPID ASC, REFDATE ASC) AS ROWID, *
FROM EMPLOYEE

In the above code, we first create a common table expression called [EMPLOYEE ORDERED BY ROWID] that contains the row number and all columns from the EMPLOYEE table. We then delete the rows from this common table expression that have duplicate row numbers, keeping only the earliest record for each set of duplicates.

After the deletion, we can select the final record set to see the changes. The result will be a de-dupified record set with only the unique rows remaining.

It’s important to note that the use of semi-colons in the SQL code is crucial, especially when working with common table expressions. SQL Server 2005 requires the use of semi-colons to separate certain queries. Failure to include them may result in syntax errors.

In conclusion, the ROW_NUMBER() function and common table expressions in SQL Server 2005 provide a powerful solution for de-dupifying record sets. By leveraging these features, you can efficiently remove duplicate rows and retain the desired records based on your criteria. Give it a try and see how it simplifies your data manipulation tasks!

Click to rate this post!

[Total: 0 Average: 0]

Comprehensive 360 Degree Assessment

Data Replication

Performance Optimization

Data Security

Database Migration

Expert Consultation

Cloud Migration Made Easy

Considering a move to the cloud? Axial SQL brings you proven migration strategies to streamline your transition. Our expert team ensures a smooth, efficient shift, keeping your data safe and accessible. Start your journey to the cloud with confidence!

SQL Performance Optimization

Is your SQL running slower than expected? Don't let sluggish performance hinder your business. Our optimization experts at Axial SQL specialize in tuning your databases for peak performance. Speed up your SQL and supercharge your data processing today!

Database Stability Solutions

Tired of frequent database outages? Discover stability with Axial SQL! Our comprehensive analysis identifies and resolves your database vulnerabilities. Enhance reliability, reduce downtime, and keep your operations running smoothly with our expert guidance.

Expert Database Team Evaluation

Questioning your database team's efficiency? Let Axial SQL provide an expert, unbiased analysis. We assess your team's strategies and workflows, offering insights and improvements to boost productivity. Elevate your database management to new heights!

Data Security Assurance

Concerned about your database security? Axial SQL is here to fortify your data defenses. Our specialized security assessments identify potential risks and implement robust protections. Keep your sensitive data secure and your peace of mind intact with our expert services.

Published on

De-Dupifying Record Sets in SQL Server 2005

Let's work together