VBA: Deleting Duplicate Rows

  • Hi,


    I'm have an excel sheet with information (e.g. Name, Address, CSZ), and I would like to remove duplicate rows from the sheet. I was wondering how to write some code to complete this task. Please respond.



    Thank You



    Chaim bochner:)

  • Hi,


    Try the following:


    Sub DuplicateRowDelete()
    For k = 1 To 3500
    Let x = ActiveCell

    ActiveCell.Offset(1, 0).Range("A1").Select
    Let y = ActiveCell
    If y = "" Then
    GoTo TestforBlank
    End If
    If x = y Then
    Selection.EntireRow.Delete
    ActiveCell.Offset(-1, 0).Range("A1").Select

    End If
    Next k


    TestforBlank:
    End Sub

  • Be careful with Jame's macro, as it will only compare the values in column A:A to see if there's a match between THAT CELL and its next row neighbor before deleting the following row.


    I'll post some new code shortly.


    Chris

  • This will get you started. Be sure to check out the assumptions that I've started with. Change the end value for RowCounter to run on the hundreds or thousands of rows you need to examine.


    TEST IT FIRST! I haven't had the time to debug it that I would like, but it's a start.


    Option Explicit
    Dim ThisRow As Integer
    Dim ThatRow As Integer
    Dim ColCounter As Integer
    Dim RowCounter As Integer


    Sub DuplicateRowDelete()


    'This example assumes that:
    ' 1. The sheet is sorted ... we'll not be looking for duplicates that are not neighboring rows
    ' 2. There is data in every cell in Column A:A ... or the process stops.
    ' 3. We'll look at cells from column A until first blank cell, and compare to next row down;
    ' if the next row down has ADDITIONAL cells of data, those won't be compared.


    For RowCounter = 1 To 10
    ColCounter = 1
    If IsEmpty(Cells(RowCounter, ColCounter)) Then Exit Sub
    Do While Not IsEmpty(Cells(RowCounter, ColCounter))
    If Cells(RowCounter, ColCounter) <> Cells(RowCounter + 1, ColCounter) Then GoTo NextRow:
    ColCounter = ColCounter + 1
    Loop
    Cells(RowCounter, 1).EntireRow.Delete
    RowCounter = RowCounter - 1


    NextRow:
    Next RowCounter
    End Sub

  • When it comes to deleting duplicates, I always prefer to select all duplicates and then check before deleting. Just a s aprecaution that no important data is removed.



    Sub Select_Dupes()
    'Roy Cox
    'J & R Excel Solutions
    'http://www.excel-it.com


    Dim n As Range
    [a:a].AdvancedFilter Action:=xlFilterInPlace, Unique:=True
    Set n = [a:a].SpecialCells(xlCellTypeVisible)
    ActiveSheet.ShowAllData
    n.EntireRow.Hidden = True
    [a:a].SpecialCells(xlCellTypeVisible).EntireRow.Select
    [a:a].EntireRow.Hidden = False
    End Sub

  • Unfortunately I couldn't get Roy's macro to work. (I'm in Excel 97, if that makes a difference.)


    I did improve my earlier code, and I can give you a metric on it, too. It ran out the dups in a 1000 row x 5 column sheet in under a minute. (It's still slow, because I can't figure out how to do this in an array yet. I wouldn't mind learning that, if anyone wants to improve this code.)


    I also have another macro that sorts all the data, from right to left and top to bottom. (So if you have 200 columns of significant data and need to sort the whole sheet, you don't have to do it three columns at a time.) Just ask, if that's of any use.


    Option Explicit


    Sub DuplicateRowDelete()


    Dim ThisRow As String, NextRow As String
    Dim ColCounter As Integer
    Dim RowCounter As Single
    Dim ColsToCheck As Single


    'This example assumes that:
    ' 1. The sheet is sorted ... we'll not be looking for duplicates that are not in neighboring rows
    ' 2. There is data in every cell in Column A:A ... or the process stops when the first empty cell is found in A:A.
    ' 3. User can specify how many columns of significance to check.


    StartHere:
    On Error Resume Next
    ColsToCheck = InputBox("How many columns should be checked? 0 - 255", "Columns to Check", 255)
    If ColsToCheck &gt; 255 Or ColsToCheck <= 0 Then GoTo StartHere


    Application.ScreenUpdating = False
    For RowCounter = ActiveCell.Row To 65000
    ' If the first cell in the row is blank, then exit
    If IsEmpty(Cells(RowCounter, 1)) Then Exit Sub

    ThisRow = ""
    NextRow = ""
    For ColCounter = 1 To ColsToCheck
    ThisRow = ThisRow & Cells(RowCounter, ColCounter).Value & "."
    NextRow = NextRow & Cells(RowCounter + 1, ColCounter).Value & "."
    If ThisRow <> NextRow Then GoTo NextRowNow
    If ColCounter = ColsToCheck And ThisRow = NextRow Then
    Cells(RowCounter, 1).EntireRow.Delete
    RowCounter = RowCounter - 1
    End If
    Next ColCounter
    NextRowNow:
    Next RowCounter


    Application.ScreenUpdating = True
    End Sub

  • You could also try looking up soundex on the web, this converts text into a numerical representation of the phonetic sound of the text. Thus you can compare text that sounds the same (i.e. overcomes some spelling mistakes). Here's a Function that returns the soundex code:-


    Function SoundExPro(test As String) As String
    Dim Result As String, C As String * 1
    Dim Location As Integer
    test = UCase(test)
    'First character (letter)
    If Asc(Left(test, 1)) < 65 Or Asc(Left(test, 1)) &gt; 90 Then
    SoundExPro = ""
    Exit Function
    Else
    'Convert St. to Saint
    If Left(test, 3) = "ST." Then
    test = "SAINT" & Mid(test, 4)
    End If
    'Convert letters to their appropriate digits
    'A,E,I,O,U,Y ("slash letters") to slashes
    'H,W, and everything else to zero-length string
    Result = Left(test, 1)
    For Location = 2 To Len(test)
    Result = Result & IDLetter(Mid(test, Location, 1))
    Next Location
    'Remove double letters
    Location = 2
    Do While Location < Len(Result)
    If Mid(Result, Location, 1) = Mid(Result, Location + 1, 1) Then
    Result = Left(Result, Location) & Mid(Result, Location + 2)
    Else
    Location = Location + 1
    End If
    Loop
    'If 1st letter equals 2nd, remove 2nd character
    If IDLetter(Left(Result, 1)) = Mid(Result, 2, 1) Then
    Result = Left(Result, 1) & Mid(Result, 3)
    End If
    'Remove "/"
    For Location = 2 To Len(Result)
    If Mid(Result, Location, 1) = "/" Then
    Result = Left(Result, Location - 1) & Mid(Result, Location + 1)
    End If
    Next
    'Trim or fill with zeroes
    Select Case Len(Result)
    Case 4
    SoundExPro = Result
    Case Is < 4
    SoundExPro = Result & String(4 - Len(Result), "0")
    Case Is &gt; 4
    SoundExPro = Left(Result, 4)
    End Select
    End If
    End Function


    Private Function IDLetter(C) As String
    ' Returns a SoundExPro code for a letter
    Select Case True
    Case C Like "[AEIOUY]"
    IDLetter = "/"
    Case C Like "[BPFV]"
    IDLetter = "1"
    Case C Like "[CSKGJQXZ]"
    IDLetter = "2"
    Case C Like "[DT]"
    IDLetter = "3"
    Case C = "L"
    IDLetter = "4"
    Case C Like "[MN]"
    IDLetter = "5"
    Case C = "R"
    IDLetter = "6"
    Case Else 'H, W, spaces, punctuation
    IDLetter = ""
    End Select
    End Function



    John

Participate now!

Don’t have an account yet? Register yourself now and be a part of our community!