INTRODUCTION
The Levenshtein Distance algorithm is an algorithm used to calculate the minimum number of edits required to transform one string into another string using addition, deletion, and substitution of characters.
USES
The most common use of the function is for approximate string matching. Since the function returns the minimum number of edits required to transform one string into another, the user can set the threshhold at which one string is considered a match to another.
CODE
The function below accepts two strings as its input and returns an integer representing the minimum number of required edits to transform string 1 into string 2.
Because the computation time is O(n*m), you should be wary about using it on very long strings.
A common modification of this algorithm is to allow transposition of characters. Allowing transposition of adjacent characters is the Damerau-Levenshtein Distance Algorithm.
The Levenshtein Distance algorithm is an algorithm used to calculate the minimum number of edits required to transform one string into another string using addition, deletion, and substitution of characters.
USES
The most common use of the function is for approximate string matching. Since the function returns the minimum number of edits required to transform one string into another, the user can set the threshhold at which one string is considered a match to another.
CODE
The function below accepts two strings as its input and returns an integer representing the minimum number of required edits to transform string 1 into string 2.
Because the computation time is O(n*m), you should be wary about using it on very long strings.
A common modification of this algorithm is to allow transposition of characters. Allowing transposition of adjacent characters is the Damerau-Levenshtein Distance Algorithm.
Code:
Function Levenshtein(str1 As String, str2 As String) As Integer Dim arrLev, intLen1 As Integer, intLen2 As Integer, i As Integer Dim j As Integer, arrStr1, arrStr2, intMini, As Integer intLen1 = Len(str1) ReDim arrStr1(intLen1 + 1) intLen2 = Len(str2) ReDim arrStr2(intLen2 + 1) ReDim arrLev(intLen1 + 1, intLen2 + 1) arrLev(0, 0) = 0 For i = 1 To intLen1 arrLev(i, 0) = i arrStr1(i) = Mid(str1, i, 1) Next For j = 1 To intLen2 arrLev(0, j) = j arrStr2(j) = Mid(str2, j, 1) Next For j = 1 To intLen2 For i = 1 To intLen1 If arrStr1(i) = arrStr2(j) Then arrLev(i, j) = arrLev(i-1, j-1) Else intMini = arrLev(i-1, j) 'deletion If intMini > arrLev(i, j-1) Then intMini = arrLev(i, j-1) 'insertion If intMini > arrLev(i-1, j-1) Then intMini = arrLev(i-1, j-1) 'deletion arrLev(i, j) = intMini + 1 End If Next Next Levenshtein = arrLev(intLen1, intLen2) End Function
Comment