Dynamic Programming Edit Distance

Edit Distance or Levenshtein Distance

Welcome! This is part eight in a series of articles on Dynamic Programming. You can start at the beginning of this series if you wish by clicking here. Today we are going to talk about “Edit Distance” or “Levenshtein distance” problem which has many applications such as spell checking. Let us get started.

Problem Definition

Given two input strings (S1, S2) of different lengths (m, n), we need to convert the first string into the second string using a minimum number (or minimum cost) of edit operations. Obviously one can replace existing characters in the first string with characters from the second string, delete existing characters or insert new characters into the first string. Each edit operation has a cost for example there is a cost to replace (CR), delete (CD) or insert (CI) a character.

Solution

What about trying a brute force approach to solve this problem? Let us try that. If the two strings are of the same size then we can loop in the first string character by character then compare the current character with the corresponding character in the second string. If the characters are the same we continue otherwise we replace the current character. If the strings are of different sizes then we loop exactly the same as we did before replacing characters when they are different skipping when they are the same. If the first string is the smaller one (m < n) then we need to insert all extra characters from the second string into the first string otherwise if (n < m) then we need to delete all the extra characters from the first string. In either case a single loop is needed but the number of edit operations might not be the optimum that is why Dynamic Programming is a better approach. Take a look at the following example where we need to convert “Format” into “Forest”. For simplicity let us assume that all operations have the same cost of (1) however that might not be the case in practice. The following partial recursive flow demonstrates possible edit operations that we may choose to convert “Format” into “Forest”:

From the flow above, we can easily notice that the optimal conversion can be achieved by two replacements. Replace the (a) with (s) then replace the (m) with (e) however all other paths require more than two edit operations. Note that we assumed all edit operations have the same cost (not always necessarily the case) but in case they do not then there is a chance that the two replacements is not an optimum solution. We do not need to worry much about this issue for now because Dynamic Programming approach will eventually examine all possible paths and produce an optimal solution at the end regardless of the number of operations needed as long as the cost function is provided in advance. On the other hand if the cost of each operation is the same then the optimal solution is directly related to the number of edit operations. Let us describe the conversion algorithm in words before defining that formally. We compare the last character in each string. We have three possible operations that we can do. The first option is to delete the last character from the first string then convert the remaining characters in the first string to the second string. The second option is to insert the last character from the second string into the end of the first string then convert the first string (before the insertion) into the remaining characters of the second string (excluding the last character). The third option is to replace the last character in the first string with the last character from the second string if they are different otherwise no operation is needed. If you think about it for a minute this is indeed recursion because the last character in any given string can be generalized as the last character in a sub-string ending at some position in the string. Note also that sub problems introduced by recursion overlap (that is why Dynamic Programming is a good approach in our case) for example [Form]st and “[Form]est” both repeat in different places in the recursion tree above. In Dynamic Programming problems that repeat are calculated once and looked up in a table if they are already computed. Characterize the structure of an optimal solution

We already characterized the structure of an optimal solution in the example we provided earlier when we described the recursion tree in words so let us ask ourselves now what exactly we are trying to optimize. The answer is minimizing the total number of edit operations to convert one string to another or generally speaking minimizing the total cost to convert a string to another. In the next step we will provide a formal recursive definition for such an optimized cost.

Recursively define the value of an optimal solution

Let M[i, j] be the minimum cost to convert the sub-string ending at position (i) in the first string into a sub-string ending at position (j) in the second string. Obviously the minimum cost to convert the first string in full into the second string is M[m, n] where (m) is the length of the first string and (n) is the length of the second string. Without any further delay here is the recursive formula:

Compute the value of an optimal solution in a bottom up fashion

Note that filling M[i, j] is one part of the solution. The other part is printing out the exact edit operations needed to convert the first string into the second string with minimum cost. In order to do so we will use another two dimensional array (or table) to keep track of the needed edit operations. Through out this series of articles on Dynamic Programming we have learned that these tracking arrays are called back pointers arrays. With that said let us define b[i, j] to be our back pointers array that indicates what operation is needed to convert the sub-string ending at position (i) in the first string to the sub-string ending at position (j) in the other string. Recall that we have four different possibilities which are delete, insert, replace and skip so we can mark a deletion operation by saving “Delete” in b[i,j], an insert operation by saving an “Insert” in b[i, j] and so on. We are now ready to fill M[i, j] and b[i, j] in a bottom up fashion. To simplify the code we will assume each edit operation has a cost of (1). Here is the pseudo code to do that:

Once M[i, j] is completely populated then the minimum cost will be M[m, n].

Construct an optimal solution from the computed information

The last step is to construct the actual solution for the edit distance problem which is printing the exact edit operations needed to convert the first string into the second string with minimum cost. This can be achieved using the back pointers array b[i, j]. The following pseudo code recursive function prints the needed steps.

We are done with part 8. In the next part I will explain the Dynamic Programming problem “Moving on a Checkerboard“.

Please use the comments section below for questions, corrections or feedback. Thanks for reading.

5 Comments

Add a Comment

Your email address will not be published. Required fields are marked *