Support us .Net Basics C# SQL ASP.NET Aarvi MVC Slides C# Programs Subscribe Download

Part 4 - Delete duplicate rows in sql

Suggested Videos:
Part 1 - How to find nth highest salary in sql
Part 2 - SQL query to get organization hierarchy
Part 3 - How does a recursive CTE work



In this video, we will discuss deleting all duplicate rows except one from a sql server table.



Let me explain what we want to achieve. We will be using Employees table for this demo.
Delete duplicate rows in sql

SQL Script to create Employees table
Create table Employees
(
     ID int,
     FirstName nvarchar(50),
     LastName nvarchar(50),
     Gender nvarchar(50),
     Salary int
)
GO

Insert into Employees values (1, 'Mark', 'Hastings', 'Male', 60000)
Insert into Employees values (1, 'Mark', 'Hastings', 'Male', 60000)
Insert into Employees values (1, 'Mark', 'Hastings', 'Male', 60000)
Insert into Employees values (2, 'Mary', 'Lambeth', 'Female', 30000)
Insert into Employees values (2, 'Mary', 'Lambeth', 'Female', 30000)
Insert into Employees values (3, 'Ben', 'Hoskins', 'Male', 70000)
Insert into Employees values (3, 'Ben', 'Hoskins', 'Male', 70000)
Insert into Employees values (3, 'Ben', 'Hoskins', 'Male', 70000)

The delete query should delete all duplicate rows except one. The output should be as shown below, after the delete query is executed.
Delete all duplicate rows except one in sql

Here is the SQL query that does the job. PARTITION BY divides the query result set into partitions.
WITH EmployeesCTE AS
(
   SELECT *, ROW_NUMBER()OVER(PARTITION BY ID ORDER BY ID) AS RowNumber
   FROM Employees
)
DELETE FROM EmployeesCTE WHERE RowNumber > 1

sql server interview questions and answers

15 comments:

  1. SELECT * FROM Employees

    -- Delete duplicate rows in sql

    SELECT DISTINCT ID,FirstName,LastName,Gender,Salary
    FROM
    Employees

    ReplyDelete
  2. this work also

    DELETE FROM Employees WHERE id NOT IN
    (SELECT MIN(id) FROM Employees
    GROUP BY FirstName,LastName,Gender,Salary)

    ReplyDelete
    Replies
    1. useful when id is primary key

      Delete
    2. Will not work as you cant have Min(ID) in select clause, when its not in Group By clause

      Delete
  3. check this please

    SELECT DISTINCT * INTO tblEmployee5
    FROM tblEmployee
    DROP TABLE tblEmployee
    EXEC sp_rename 'tblEmployee5', 'tblEmployee'

    ReplyDelete
  4. Hi Venkat,
    Can you pls explain how to delete rows except one based on only some column values?
    eg Employees table having FirstName, LastName, Salary & City column.
    I want to delete all rows except one with FirstName, LastName and City duplicate having salary different.
    Thanks in advance.

    ReplyDelete
  5. Run this query, it will clear a lot about Rank, DenseRank and RowNumber

    select ID, ROW_NUMBER() OVER (partition by ID Order By Id desc) as RowNumberCol,
    RANK() OVER(Order By Id desc) as RankCol,
    DENSE_RANK() OVER(Order By Id desc) as DenseRankCol
    from Employees

    ReplyDelete
  6. i want to use this query in fronnt end .in c# or asp.net forms..plz help me.how to use

    ReplyDelete
  7. Excellent .. Great .... Thank you so much

    ReplyDelete
  8. In Mysql getting error
    Table 'test.employeescte' doesn't exist

    Can anyone help me how can i delete duplicate rows in mysql ?

    ReplyDelete
  9. That's because you don't have the table "test.employeescte" . It says "doesn't exist". The table called "employeescte" belongs to author's database(as an example). So you must replace the code with your table name which has duplicated rows. Don't forget to replace "test" with your name of database.

    ReplyDelete
  10. lets think about your scenario change
    1) if there is multiple duplicate rows in table base on Name example :
    (1, 'Mark', 'Hastings', 'Male', 60000)
    (2, 'Mark', 'Hastings', 'Male', 50000)
    (3, 'Mark', 'Hastings', 'Male', 80000)
    from this duplicate rows just keep highest salary values row only and others delete?
    2) if there is multiple duplicate rows in table base on Name example :
    (1, 'Mark', 'Hastings', 'Male', 60000)
    (2, 'Mark', 'Hastings', 'Male', 60000)
    from this duplicate rows just keep latest ID values row only and delete others rows?

    ReplyDelete
    Replies
    1. with cte_emp as
      (
      select *, row_number() over (partition by firstname order by salary desc) as orderid
      from employees
      )
      delete from cte_emp where orderid > 1

      Delete
    2. with cte_emp as
      (
      select *, row_number() over (partition by firstname order by salary desc) as orderid
      from employees
      )
      delete from cte_emp where orderid > 1
      select * from employees

      Delete

It would be great if you can help share these free resources