Remove Duplicates Of Multiple Columns

  • Hi Folks


    I have searched the many forum solutions for duplicates however the ones I found relate to a single column. However what I would like to be able to do is starting from row 6 is to remove duplicates of multiples columns of a single worksheet, namely columns M, O, P, Q and R. Note some cells in each column are blank.


    To clarify - it would firstly check for duplicates in column M, keeping just one of the rows of the duplicated value and removing entiely the row/rows of where the duplications appeared. Then go through each of the other four columns. of the worksheet doing the same for each of the other four columns.


    Then after the duplicates are removed from all five columns provide a tally at the bottom of the worksheet of how many times a value now appears in the cells of each of repective columns.


    Hope someone can help


    Regards


    Tony Mos

  • Re: Remove Duplicates Of Multiple Columns


    Make sure Columns S:X are empty and run this code on a COPY of your workbook

  • Re: Remove Duplicates Of Multiple Columns


    HI DAVE


    Thank you for your reponse.


    Your code of course worked in a fashion, perhaps I need to be clearer - in that I would like to keep the format of the original file without the rows where there had been duplicates only in the columns mentioned being M, O , P ,Q and R.


    I have made a test /demonstation file as attached - it has the format required with some duplicate values in the columns mentioned (these are coloured in this test file only for ease of demonstration - in the real file the cells where values appear are not coloured).


    In the same test file - starting at row 35 is what the resulting file should look like, exactly the same -just having removed duplicated row of only columns M, O , P ,Q and R.


    And the result also has a tally for each of these columns of how many times a value apears in each column after the duplicates have been removed see test file.


    As an after thought it would be nice if he resulting file could be output into a new worksheet - keeping the original worksheet unchanged.


    Thanks for your help ( I need it as this is beyond my level of VBA knowledge so I would be good to see how it can be done)



    Regards


    Tony Mos

  • Re: Remove Duplicates Of Multiple Columns


    Untested, but should give you the idea. Just ensure row 35 for M and O:R have headings AND no blanks directly below.

  • Re: Remove Duplicates Of Multiple Columns


    HI DAVE


    The code you suggest is almost there but doesn't quite do it - I have tried to modify but I cant get it there.


    See the test result attached


    I did change the start range to row 7 and it ran well with row 7 being the last row of headings.


    The test result shows it managed to remove some duplicates but only dropped one duplicate eg were two the same - the purple values in the test result worksheet show still two duplicates (cells q11 and q12)- in the original sheet this value was in this column three times therefore it seems the code has only dropped one of the duplicates out - so the code needs to drop more in cases where there are more than one duplicate.


    Also the test result worksheet also shows the code does not keep rows in cases where there was only one value with no duplication in the original worksheet. The output needs to also retain those rows where no duplication occured.


    is it possible at the bottom of the resulting worksheet (after duplicates removed) to provide a tally of each column M, 0, P, Q, and R of the occurance of each time a cell in each of those individual columns of the resulting worksheet?.


    Again thanks heaps for your help,...my three VBA text books just don't seem to cover this very well.


    cheers


    Tony Mos :)

  • Re: Remove Duplicates Of Multiple Columns


    Hi Dave


    Thanks for your time in helping so far


    Are you or somewone else be able to provide more help to solve my issue or could you direct me to somewhere where I might be able to get an idea what I need to do to modify the code so it deletes multiples in the five require columns and delets more than one multiple and retains values where there is no mulitples - as per my last posting comments.


    Thanks


    Regards


    Tony Mos

  • Bumping


    Hi Folks


    So from the begining .... so here is the CHALLENGE should you choose to help me.
    I have searched the many forum solutions for duplicates however the ones I found relate to a single column. However what I would like to be able to do is starting from row 6 is to remove duplicates of multiples columns of a single worksheet, namely columns M, O, P, Q and R (see first example attached to this thread. Note some cells in each column are blank in each column as per the example previously attached to this thread, also there may be more than one duplicate in a column and not always right next to the original -could be anywhere in the column. Also it is important to retain/keep values that have not been duplicated.


    To clarify - it would firstly check for duplicates in column M, keeping just one of the rows of the duplicated value and removing entiely the row/rows of where the duplications appeared. And also retaining rows where values in the column have not be duplicated. Then go through each of the other four columns. of the worksheet doing the same for each of the other four columns.


    Then after the duplicates are removed from all five columns provide a tally at the bottom of the worksheet of how many times a value now appears in the cells of each of repective columns.


    Perhaps it would be good to produce the output in a new worksheet in the same format and retain the original worksheet.


    Hope someone can help.....I have been trying to get this work for over a week from some code Dave provided but I just dont have the level of knowledge.


    Your help would be greatly appreciated


    Regards
    Tony Mos

  • Re: Remove Duplicates Of Multiple Columns


    hi Bryce


    I have attached the original data ...in this file is a single worksheet which has both an example of the format of the worksheet with test data and underneath in same worksheet is how it would look after the needed code would have run.


    The various pieces of code I tried to work have been various modiifcation of the code posted by DAVE ....my variations were not even close. So far Dave's piece of posted code is the closest I have got to solution what I would like to be able to achieve.


    Hoping you might be able to help


    Regards


    Tony[hr]*[/hr] Auto Merged Post;[dl]*[/dl]Bumping

  • Re: Remove Duplicates Of Multiple Columns


    Tony - Here is my solution and to be fair its not even really mine. The meat of this code was provided by jindon who solved an issue on this thread http://www.ozgrid.com/forum/showthread.php?t=74172.


    I am just learning the (or trying to learn) the scripting dictionary which I think is a fantastic tool. I used the select case, because I couldn’t think of a better way to change the variable a. Fortunately, you are only concerned with 5 columns; otherwise the select case might not be the best solution. Anyway, try this and let me know. If anyone else can see something in the code that should have been done another way or you just know a better way, I would love to hear it.


  • Re: Remove Duplicates Of Multiple Columns


    Hi Bryce


    Thanks so much for putting in the time to help me


    Well the code worked well like magic on the test file and worked same on smaller files of about 200 rows or less - although I found running it only once for some strange reason did not delete duplicates in the P column but it would delete these duplicates too if run a second time on the same file being the results of the first run of the code. perhaps it is caused by something about my real file data but I have tried in on several files with same results. But generally when run on files of large number of rows if seemed in just end up in a running loop - not ending.


    Anyway I have tried a few things but just can't get it to run on files of larger numbers of rows, some of my files have about 5000 rows.


    Also can the code tally these rows that contain values after the duplicates are removed.


    If you or anyone else could help - In the last couple of days I have also thought it would be good if the code could delete some other rows but these arent duplicates - so perhaps a second piece of code to also delete those rows where ;


    -in column D if in any part of the name string contains word "CLASS" and there is "X" in same row in any columns of F G H and/or I - then delete the row so long as no value is present in columns M O P Q and R on the same row - as I want to keep those rows.


    THANKS HEAPS


    Tony Mos

  • Re: Remove Duplicates Of Multiple Columns


    Tony - I saw that you wanted other things done, but I want to hash out this first code. Give this a try and let me know how it goes. If I dont hear back from you today, I will not be able to come back to this for 3 weeks. Going on vacation.


  • Re: Remove Duplicates Of Multiple Columns


    Hi Bryce,


    Gee I really hope you get this before you head off on your vacation to let you know it worked and thanks heaps.


    What can I say other than WOW the code work like a dream on a much large file of real data - you SIR are a genious...I wonder how long before I can work out VBA solutions like this.


    Perhaps someone else will read this and take on the challenge on working out the next bit of my problem which as stated before is to add to your code - a code to also tally these rows that contain when a value appears in the column AFTER the duplicates have been are removed.


    Also I have thought it would be good if the code could also delete some other rows but these arent duplicates - so perhaps a second piece of code to also delete those rows where ;


    -in column D if in any part of the name string contains word "CLASS" and there is "X" in same row in any columns of F G H and/or I - then delete the row so long as NO value is present in columns M O P Q and R on the same row - as I want to keep those rows. I have attached the test file again but I have added to it some test data for this part of it - there are 6 new rows at items numbers 5 to 11 - of these 6 new rows file are highlighted in blue and one in orange - if the code works well - as also doing what your first code did to remove all duplicates from columns M O P Q and R the code would now also be able delete the blue rows and leave the orange row - of course in the real files the rows are not coloured.


    Well Bryce again thanks for your help ...if you are heading off to the NSW north coast let me know it would be great to share a coffee with a legend VBA coder.


    Regards


    Tony Mos

  • Re: Remove Duplicates Of Multiple Columns


    Hi Jindon


    Thanks for taking the time to look at this I have made some minor range modifications to code Bryce put togehter so to fit my modified version of my worksheet which is as attached, so Bryce's code that works on the attachment for the first part of what I would like to be able to do is



    The other things I would like to be able to acheive is


    1. for the code to run on any tab name - that is to say the code auto picks up the tab name it is run on rather than me having to rekey different tab names into the code for different worksheet names.


    2. Also I have thought it would be good if the code above could also delete some other rows but these arent duplicates - so perhaps a second piece of code to also delete those rows where ;


    -in column G if in any part of the name string contains word "CLASS" and there is "X" in same row in any columns of I, J, K and/or IL - then delete the row so long as NO value is present in columns P, R, S, T, and U on the same row - as I want to keep those rows. I have attached the modidied test file but I have added to it some test data for this part of it - there are 6 new rows at record numbers 5 to 11 - of these 6 new rows file are highlighted in blue and one in orange - if the code works well - as also doing what the first piece of code did to remove all duplicates from columns P, R, S, T and U the new code would now also be able delete the rows indicated by blue and leave the orange row - of course in the real files the rows are not coloured.




    3. that AFTER ALL require deletions of rows have been completed the final thing is to provide a tally for each worker in column E - of how many rows they have with values appearing in columns I, J, K, L and P, R, S, T, U - now I do have some code to this which soemone else helped me previosly with for other workbook -which I think should be able be intigrated into the code required for this exercise however I havent been able to achieve it-what do you think ?



    Hope you can help - even if you can help with part of this either points 2 or 3 that would be great.


    regards


    Tony Mos

  • Re: Remove Duplicates Of Multiple Columns


    1) In the 1st code, it deletes the duplicates except the first appeared row in each column from 16 to 21.


    Correct?


    What I read from your posts is to delete all the dups include 1st appeared row, Am I wrong?


    2) What is the key to find the correct sheet in the workbook?
    e.g particular header in particular row...etc.

  • Re: Remove Duplicates Of Multiple Columns


    Thanks Jindon so much for having a good think about this


    Yes the code does need to also need to inlcude the first row of data when checking for duplicates.


    Sorry I didnt make it clear - that there will ony ever be one worksheet in a workbook. Infrequently I get a workbook with one worksheet the tab name of the worksheet is most times much like the name of the file but not all the time - but each time it comes I would have to change the name in the code - when I think there must be a way for the code to auto detectthe sheetname on the sheet it is being run on - particuarly so when only one worksheet in the workbook .....what do you thinK Jindon.


    Thanks :)


    Tony Mos

  • Re: Remove Duplicates Of Multiple Columns


    OK then the First & 2nd part to test
    Can you just try this one?


    This suppose to leave delete all the duplicated records.

  • Re: Remove Duplicates Of Multiple Columns


    Hi Jindon


    Well I sort of works ...in that it does delete ...but it deleted absolutly everything in the test file version 9 that I attached - retained only one row of column heading info and same for real file.


    Would you like to have another go at it..... there is no way with my limited VBA knowledge I can do it, I can only learn more from forums like this.


    Hope you might like to keep helping


    Tony Mos :)

  • Re: Remove Duplicates Of Multiple Columns


    OOps...
    How about?


    P.S. I need you to test my code, because I can not use Excel here.[hr]*[/hr] Auto Merged Post;[dl]*[/dl]Sorry I need to go off-line now
    If any error/bug, I will reply to you tomorrow.

    Quote


    2. Also I have thought it would be good if the code above could also delete some other rows but these arent duplicates - so perhaps a second piece of code to also delete those rows where ;


    -in column G if in any part of the name string contains word "CLASS" and there is "X" in same row in any columns of I, J, K and/or IL - then delete the row so long as NO value is present in columns P, R, S, T, and U on the same row - as I want to keep those rows. I have attached the modidied test file but I have added to it some test data for this part of it - there are 6 new rows at record numbers 5 to 11 - of these 6 new rows file are highlighted in blue and one in orange - if the code works well - as also doing what the first piece of code did to remove all duplicates from columns P, R, S, T and U the new code would now also be able delete the rows indicated by blue and leave the orange row - of course in the real files the rows are not coloured.



    Quote


    3. that AFTER ALL require deletions of rows have been completed the final thing is to provide a tally for each worker in column E - of how many rows they have with values appearing in columns I, J, K, L and P, R, S, T, U - now I do have some code to this which soemone else helped me previosly with for other workbook -which I think should be able be intigrated into the code required for this exercise however I havent been able to achieve it-what do you think ?


    [hr]*[/hr]

Participate now!

Don’t have an account yet? Register yourself now and be a part of our community!