Compare 2 Columns For Close Matches

  • Hi guys, I'm wondering if I could get some advice on this problem I have. I want to compare two columns (from two different files, but copied into a new sheet) for duplicates or identical entries.


    But here's the trick, each entry contains atleast a number of digits. I'm not quite sure how to find items that may be 80% identical (in the right order, but might be messing a digit, or there is a dash or a 'o' instead of a '0'). I know how to do it in C++ or Java, but not quite sure how to do it in VBA/macro.


    Here's an example of what I meant
    Let's say column one has the following:
    0244-34
    9393-O0
    3932-44


    and second column has
    939300
    493384
    938923


    If I was comparing second column to first column, it can pick up the 9393-O0 one because it's close enough to 939300.


    Hope to hear from someone soon, thanks!

  • Re: Comparing Two Columns For Duplicates


    Why not just do it in C++? The logic should be the same....


    You'd have to do something like test each character for an entry against each character of the other list and if 4 out of 5 match - or if 8 out of 10, etc - then count it as a match. I would use left() and right() to cut a string down by 1 character each time and compare it every time to the entry.

  • Re: Comparing Two Columns For Duplicates


    That's a good idea! Hrm, is there any possibility you could give me an example of an Excel codes that does the LEFT or RIGHT thing?


    You can use C++ in Excel?

  • Re: Comparing Two Columns For Duplicates


    Quote from sweetmonster

    Hrm, is there any possibility you could give me an example of an Excel codes that does the LEFT or RIGHT thing?


    You can check the vba help file for a good explanation of the left and right function.


    Search for Left Function reveals:


    Left Function


    Returns a Variant (String) containing a specified number of characters from the left side of a string.


    Syntax


    Left(string, length)


    The Left function syntax has these named arguments:


    Part Description
    string Required. String expression from which the leftmost characters are returned. If string contains Null, Null is returned.
    length Required; Variant (Long). Numeric expression indicating how many characters to return. If 0, a zero-length string ("") is returned. If greater than or equal to the number of characters in string, the entire string is returned.


    Remarks


    To determine the number of characters in string, use the Len function.


    Note Use the LeftB function with byte data contained in a string. Instead of specifying the number of characters to return, length specifies the number of bytes.

  • Re: Comparing Two Columns For Duplicates


    Thanks for the info! I'll check it over very soon!


    just one last question regarding the LEFT or RIGHT functions


    if I wrote something like this:


    Code
    character_length = len(string)
    character_to_compare = LEFT(string, character_length -1)
    // Do fancy if loop here and do comparison with other column
    character_length = character_length -1;
    // end if


    then I would start on a character and will go to the next character after the if loop?

  • Re: Comparing Two Columns For Duplicates


    No, C++ syntax will break your code ;)


    What I mean is, the logic behind it is the same - the commands are different but the thought process you should go through is the same.


    I would do something like




    I was assuming you have a basis for C++ and you'd know what boolean means and stuff like that, ask if it still looks like greek to you :)

  • Re: Comparing Two Columns For Duplicates


    Andrew: haha, yah, I would have thought C++ would break my code :P I thought you meant there was a way to do C++ coding in Excel, haha.. if that was the case, I would become an Excel genius over night, haha!! yes, I'm quite familiar with C++. I knew the logic, it's just that according to the Rules on this forum, I couldn't mention a slight hint of a possible answer so I kept it reserved and wanted to see what people had thought of. Maybe I interpreted the rules wrong, I dunno. Thanks for the codes! This makes total sense to me now!


    I know what to do from here now, thanks everyone!

Participate now!

Don’t have an account yet? Register yourself now and be a part of our community!