Remove Duplicates By Id's

  • Would like have some feedback on how to remove the duplicate by ID..
    Ex. ( 3 Column only)
    A. ID
    B. Name
    C. Text ( this column need to be removed all the duplicates by ID )


    Sample data:


    19 John NH MK CL CN
    19 Kerry JK MK CN AL
    16 Dii JK MK CN AL
    15 John SS


    Desired Result:


    19 John NH CL
    19 Kerry JK AL
    16 Dii JK MK CN AL
    15 John SS



    condition details - If the two record’s have a same ID’s ( our ex. Like ID 19) then would like to compare column three ( Text column – single string) , and the column can have a multiple code’s or single code ex. MK or Multiple code like a NH MK CL CN. And each code consist two character only, but how many code’s each ID holds varies.


    And this what I want compare, like our sample example


    Row1: 19 John NH MK CL CN
    Row2: 19 Kerry JK MK CN AL


    This case both rows have 4 code’s ( row 1/ column 3 : NH MK CL CN and row 2/ column3: JK MK CN AL) in this case, I would like to remove only codes that show up on both record’s, which is : MK and CN, then our new output looks like :


    19 John NH CL
    19 Kerry JK AL


    But if the two record’s ( text clumn) don’t have any common codes, then the output will remain the same Ex.


    19 John NH KK CL LL
    19 Kerry JK MK CN AL


    This case the output remains same:


    19 John NH KK CL LL
    19 Kerry JK MK CN AL


    and finally if the two row’s don’t have same ID, then don’t compare.


    This are like 1000’s record it’s very difficult to compare manually..


    Thanks.
    Jimmy

  • Re: Remove Duplicates By Id's


    1. Is the data sorted by ID initially?
    2. Will column 3 data always be two characters, space, two characters, etc.?
    3. Would you want the results in a new column or overwrite current data?

  • Re: Remove Duplicates By Id's


    Thanks.
    1. Is the data sorted by ID initially? Yes..


    2. Will column 3 data always be two characters, space, two characters, etc.?
    Yes..
    3. Would you want the results in a new column or overwrite current data?
    Prefered new column.

  • Re: Remove Duplicates By Id's


    Hello there!


    Before I begin I just want to say that my code probably isnt the best one, but since I felt I needed some practise I sat down and wrote some code for this.
    I did the assumption that one ID can occur more than just twice, so you have more then just two rows to compare (that would have been SOOOO much easier...). I tried my code a few times, and it worked for me on your example + 2 added lines with the ID 19 (as in your example) and codes that are similar to the ones in your example.


    Heres the code:


    As I said, there are most likely better ways of doing this, so you might want to wait until someones else helps you too.


    I hope I have commented the code enough to understand the logic in it (its not easy all the time)...


    Hope it works for you... :)


    Kind regards
    /Henrik

Participate now!

Don’t have an account yet? Register yourself now and be a part of our community!