Compare two strings for similarity and give a score?

  • Weird question.


    If I have two strings of text (they are product descriptions), can I somehow compare the two strings and give them a percentage similarity score?


    So two identical strings would score 100%, and strings with a few words added or missing would score something like 80% (this could be based on matching characters or something?).
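

    One character-based way to get such a percentage is edit distance. The following is only a minimal sketch, assuming a Levenshtein-based score is acceptable; the names SimilarityPercent and Levenshtein are made up for this illustration and are not from any add-in or from a later reply.

    ```vba
    Option Explicit

    ' Percentage similarity (0 to 100) derived from the Levenshtein edit distance.
    ' SimilarityPercent and Levenshtein are hypothetical names for this sketch.
    Public Function SimilarityPercent(ByVal s1 As String, ByVal s2 As String) As Double
        Dim longest As Long
        longest = Len(s1)
        If Len(s2) > longest Then longest = Len(s2)
        If longest = 0 Then
            SimilarityPercent = 100   ' two empty strings are treated as identical
        Else
            ' Compare case-insensitively and scale by the longer string's length
            SimilarityPercent = 100 * (1 - Levenshtein(LCase$(s1), LCase$(s2)) / longest)
        End If
    End Function

    ' Number of single-character insertions, deletions or substitutions
    ' needed to turn s1 into s2.
    Private Function Levenshtein(ByVal s1 As String, ByVal s2 As String) As Long
        Dim i As Long, j As Long, cost As Long, best As Long
        Dim d() As Long
        ReDim d(0 To Len(s1), 0 To Len(s2))
        For i = 0 To Len(s1): d(i, 0) = i: Next i
        For j = 0 To Len(s2): d(0, j) = j: Next j
        For i = 1 To Len(s1)
            For j = 1 To Len(s2)
                If Mid$(s1, i, 1) = Mid$(s2, j, 1) Then cost = 0 Else cost = 1
                best = d(i - 1, j) + 1                                  ' deletion
                If d(i, j - 1) + 1 < best Then best = d(i, j - 1) + 1   ' insertion
                If d(i - 1, j - 1) + cost < best Then best = d(i - 1, j - 1) + cost ' substitution
                d(i, j) = best
            Next j
        Next i
        Levenshtein = d(Len(s1), Len(s2))
    End Function
    ```

    Identical strings return 100. Note that a purely character-based score is sensitive to word order: "Lock Nut 12mm" against "12mm Lock Nut" scores low even though the words are the same, because moving a word counts as many character edits. For reordered product descriptions a word-based score may suit better.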


    Is this possible? Has anyone tried something like this before?






    The real issue is that I have one string that comes with a number. I then look through a massive list for entries that match this number; each entry also has a string attached to it. I then need to "pick" the entry that matches the number and whose string is the best match for mine.


    So for example:


    I have the MasterNumber "5" with a string "Lock Nut 12mm"


    I have a list (although in real life this is massive):


    6 - Bolt
    3 - Light
    7 - Plate
    5 - Nut 12mm
    7 - Plate 5mm
    8 - Bush
    9 - Screw
    5 - Lock Nut
    3 - Light 5w
    5 - 12mm Lock Nut


    First the code needs to collect all the 5 values from the list so we have this:


    5 - Nut 12mm
    5 - Lock Nut
    5 - 12mm Lock Nut


    Then we need to compare "Lock Nut 12mm" with the three strings above to find the best match, which would be "12mm Lock Nut".


    It doesn't have to be perfect, just the best match.


    Lol, weird, I know.


    If there is some sort of compare function, I was thinking of code like this:



    Thanks in advance for any help/ideas.

  • Re: Compare two strings for similarity and give a score?


    Interesting....
    Where is this list?



    If it's a database, it might be as simple as counting how many of the target words each record contains, then returning the list sorted so that the entries with the most matching words come first.
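
    The word-counting idea could be sketched in VBA roughly as follows. This is an illustration only: SharedWordCount is a hypothetical name, and it assumes words are separated by spaces and that matching should ignore case.

    ```vba
    ' Counts how many words of target also occur as whole words in candidate.
    ' SharedWordCount is a hypothetical helper name used for this sketch.
    Public Function SharedWordCount(ByVal target As String, ByVal candidate As String) As Long
        Dim words() As String, i As Long, padded As String
        ' WorksheetFunction.Trim also collapses repeated spaces between words
        padded = " " & LCase$(Application.WorksheetFunction.Trim(candidate)) & " "
        words = Split(LCase$(Application.WorksheetFunction.Trim(target)), " ")
        For i = LBound(words) To UBound(words)
            ' Pad with spaces so "nut" does not match inside "nuts"
            If InStr(padded, " " & words(i) & " ") > 0 Then
                SharedWordCount = SharedWordCount + 1
            End If
        Next i
    End Function
    ```

    Calling it once per record and keeping the record with the highest count gives the "greatest number of words" ordering without a full sort.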

  • Re: Compare two strings for similarity and give a score?


    It's in Excel. Basically we have a parts database, and our client gives us "updates" to it, so I essentially need to merge the two.


    But they are in different formats, etc., and ours has extra information.

  • Re: Compare two strings for similarity and give a score?


    Similar to what I wrote years ago.

  • Re: Compare two strings for similarity and give a score?


    Quote from jindon;703612

    Similar to what I wrote years ago.


    OK wow, it's confusing me. How would I use this in my example code?


    Thanks for the effort by the way :)

  • Re: Compare two strings for similarity and give a score?


    Nice jarko....


    When is the site gonna get a "LIKE" button... LOL be a FB groupie...

  • Re: Compare two strings for similarity and give a score?


    Quote from stildawn;703614

    OK wow, it's confusing me. How would I use this in my example code?


    Thanks for the effort by the way :)


    I have no idea about what your code is doing.

  • Re: Compare two strings for similarity and give a score?


    Quote from jarko28;703613

    There is a Microsoft add-in that does exactly what you need, I think:


    http://www.microsoft.com/en-ca…oad/details.aspx?id=15011


    Does this add-in need to be installed on everyone's PC, or will it attach itself to a workbook? The issue is that I am building this, but lots of people will be using it, all on different machines.




    jindon


    Can you explain how to use your function then and what it does so I can adapt it?

  • Re: Compare two strings for similarity and give a score?


    Hope you can adjust

    Code
    Sub test()
        Dim x
        With Sheets("sheet1")
            ' E1 = Ref No, F1 = string to match, A1:B10 = data range
            x = LookLike(.Range("e1").Value, .Range("f1").Value, .Range("a1:b10"))
        End With
        MsgBox x    ' show what LookLike returned
    End Sub


    LookLike
    Arg1: Ref No in first column of the data range (Range("a1:b10"))
    Arg2: String to match
    Arg3: Data range
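
    For readers who want to see the general shape of a function with these three arguments, here is a minimal sketch. It is NOT the LookLike function referred to above, whose code is not shown in this thread; BestWordMatch and WordOverlapPercent are hypothetical names, and the sketch assumes reference numbers in column 1 and descriptions in column 2 of a multi-row data range that contains no error values.

    ```vba
    Option Explicit

    ' Hypothetical stand-in, NOT the LookLike function: same argument order, simpler logic.
    ' Returns the description in column 2 of dataRange whose row has refNo in
    ' column 1 and shares the highest percentage of words with target.
    Public Function BestWordMatch(ByVal refNo As Variant, ByVal target As String, _
                                  ByVal dataRange As Range) As String
        Dim data As Variant, r As Long, score As Double, bestScore As Double
        data = dataRange.Value              ' read the range once into a 2-D array
        bestScore = -1
        For r = 1 To UBound(data, 1)
            ' Compare as text so the number 5 and the string "5" both match
            If CStr(data(r, 1)) = CStr(refNo) Then
                score = WordOverlapPercent(target, CStr(data(r, 2)))
                If score > bestScore Then
                    bestScore = score
                    BestWordMatch = CStr(data(r, 2))
                End If
            End If
        Next r
    End Function

    ' Shared words as a percentage of the larger word count of the two strings.
    Private Function WordOverlapPercent(ByVal a As String, ByVal b As String) As Double
        Dim wa() As String, wb() As String, i As Long, hits As Long, padded As String
        wa = Split(LCase$(Application.WorksheetFunction.Trim(a)), " ")
        wb = Split(LCase$(Application.WorksheetFunction.Trim(b)), " ")
        If UBound(wa) < 0 Or UBound(wb) < 0 Then Exit Function   ' an empty string scores 0
        padded = " " & Join(wb, " ") & " "
        For i = 0 To UBound(wa)
            If InStr(padded, " " & wa(i) & " ") > 0 Then hits = hits + 1
        Next i
        If UBound(wb) > UBound(wa) Then
            WordOverlapPercent = 100 * hits / (UBound(wb) + 1)
        Else
            WordOverlapPercent = 100 * hits / (UBound(wa) + 1)
        End If
    End Function
    ```

    With the ten-row example list from the first post placed in A1:B10 (numbers in column A, descriptions in column B, an assumed layout), BestWordMatch(5, "Lock Nut 12mm", Range("A1:B10")) scores "Nut 12mm" and "Lock Nut" at two of three words each and "12mm Lock Nut" at three of three, so it returns "12mm Lock Nut". If no row matches the number, it returns an empty string.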

  • Re: Compare two strings for similarity and give a score?


    Thanks Jindon


    Can you explain Arg1 more, though?


    I understand Arg2 is the string I am trying to compare to, and Arg3 is the range of cells that contain the comparable strings.

  • Re: Compare two strings for similarity and give a score?


    You said,


    Arg1 is the value for that (the reference number), which is looked up in column 1 of the data range.
