Command Palette

Search for a command to run...

Problem: Edit Distance

Our third 2D problem introduces weighted operations: transforming one sequence into another with minimum cost.

Problem Statement

Given two strings word1 and word2, return the minimum number of operations required to convert word1 to word2.

You have the following three operations permitted on a word:

  • Insert a character
  • Delete a character
  • Replace a character

Example 1:

Input: word1 = "horse", word2 = "ros"
Output: 3

Explanation:
horse → rorse (replace 'h' with 'r')
rorse → rose  (delete 'r')
rose  → ros   (delete 'e')

Example 2:

Input: word1 = "intention", word2 = "execution"
Output: 5

Explanation:
intention → inention  (delete 't')
inention  → enention  (replace 'i' with 'e')
enention  → exention  (replace 'n' with 'x')
exention  → exection  (replace 'n' with 'c')
exection  → execution (insert 'u')

Example 3:

Input: word1 = "abc", word2 = "abc"
Output: 0

Explanation: Words are identical, no operations needed.

Understanding Edit Operations

Three allowed operations:

1. Insert

word1 = "abc", want to get "abcd"

Operation: Insert 'd' at the end

Cost: 1 operation

2. Delete

word1 = "abcd", want to get "abc"

Operation: Delete 'd'

Cost: 1 operation

3. Replace

word1 = "abc", want to get "abd"

Operation: Replace 'c' with 'd'

Cost: 1 operation

Goal: Find the minimum number of operations to transform word1 into word2.

Why This is Hard (Brute Force)

Many possible sequences of operations!

For "horse" → "ros":

  • Replace h→r, delete o, delete r, delete s, delete e? (5 ops)
  • Delete h, delete o, delete r, delete s, delete e, insert r, insert o, insert s? (8 ops)
  • Replace h→r, delete r, delete e? (3 ops) ✓ Best!

Exponential possibilities, DP finds optimal in O(m × n).

Step 1: Define the State

What should dp[i][j] represent?

Definition: dp[i][j] = minimum edit distance to convert word1[0...i-1] to word2[0...j-1]

In other words: Minimum operations to convert the first i characters of word1 into the first j characters of word2.

Answer: dp[m][n] where m = word1.length, n = word2.length

Step 2: Identify Base Cases

What are the simplest subproblems?

dp[0][j]: Convert Empty String to word2[0...j-1]

Only option: Insert all j characters from word2

Operations: j insertions

Base case: dp[0][j] = j for all j

Example: "" → "ros" requires 3 insertions

dp[i][0]: Convert word1[0...i-1] to Empty String

Only option: Delete all i characters from word1

Operations: i deletions

Base case: dp[i][0] = i for all i

Example: "horse" → "" requires 5 deletions

Base Case Summary

          ""  r  o  s
    ""    0   1  2  3
    h     1   ?  ?  ?
    o     2   ?  ?  ?
    r     3   ?  ?  ?
    s     4   ?  ?  ?
    e     5   ?  ?  ?

First row: 0, 1, 2, 3 (insert each character of word2) First column: 0, 1, 2, 3, 4, 5 (delete each character of word1)

Step 3: Find the Recurrence Relation

For dp[i][j], we're comparing word1[i-1] and word2[j-1].

Case 1: Characters Match

If word1[i-1] == word2[j-1]:

No operation needed for this character! Just use the result from before.

dp[i][j] = dp[i-1][j-1]

Example:

  • word1 = "abc", word2 = "abc"
  • At i=3, j=3: word1[2]='c', word2[2]='c' → Match!
  • dp[3][3] = dp[2][2] (no operation for 'c')

Case 2: Characters Don't Match

If word1[i-1] != word2[j-1]:

We must perform an operation. Three options:

Option 1: Replace

Replace word1[i-1] with word2[j-1].

After replacement: Both characters at positions i-1 and j-1 are now the same.

Remaining problem: Convert word1[0...i-2] to word2[0...j-2]

Cost: 1 + dp[i-1][j-1]

Example:

  • "horse" → "ros"
  • Replace 'h' with 'r': "rorse" → "ros"
  • Now solve: "orse" → "os"

Option 2: Insert

Insert word2[j-1] into word1 at position i.

After insertion: word1 now has word2[j-1] at the right position.

Remaining problem: Convert word1[0...i-1] to word2[0...j-2]

Cost: 1 + dp[i][j-1]

Intuition: If we insert word2[j-1], we've "handled" position j in word2, so move j back.

Example:

  • "horse" → "ros"
  • Insert 's' at end: "horses" → "ros"
  • Now solve: "horses" → "ro"

Option 3: Delete

Delete word1[i-1] from word1.

After deletion: word1 is shorter, position i-1 is gone.

Remaining problem: Convert word1[0...i-2] to word2[0...j-1]

Cost: 1 + dp[i-1][j]

Example:

  • "horse" → "ros"
  • Delete 'e': "hors" → "ros"
  • Now solve: "hors" → "ros"

Choose the Minimum

dp[i][j] = 1 + min(dp[i-1][j-1],  // Replace
                   dp[i][j-1],     // Insert
                   dp[i-1][j])     // Delete

Complete Recurrence

Java
if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
    dp[i][j] = dp[i - 1][j - 1];  // No operation
} else {
    int replace = dp[i - 1][j - 1];
    int insert = dp[i][j - 1];
    int delete = dp[i - 1][j];
    dp[i][j] = 1 + Math.min(replace, Math.min(insert, delete));
}

Step 4: Implementation

Java
public class EditDistance {
    public int minDistance(String word1, String word2) {
        int m = word1.length();
        int n = word2.length();

        // Create DP table
        // dp[i][j] = edit distance between word1[0...i-1] and word2[0...j-1]
        int[][] dp = new int[m + 1][n + 1];

        // Base cases
        // Convert empty string to word2[0...j-1]: j insertions
        for (int j = 0; j <= n; j++) {
            dp[0][j] = j;
        }

        // Convert word1[0...i-1] to empty string: i deletions
        for (int i = 0; i <= m; i++) {
            dp[i][0] = i;
        }

        // Fill table
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    // Characters match: no operation needed
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    // Characters don't match: try all three operations
                    int replace = dp[i - 1][j - 1];
                    int insert = dp[i][j - 1];
                    int delete = dp[i - 1][j];
                    dp[i][j] = 1 + Math.min(replace, Math.min(insert, delete));
                }
            }
        }

        // Return answer
        return dp[m][n];
    }
}

Step 5: Trace Through an Example

Let's trace minDistance("horse", "ros"):

word1 = "horse" (length 5)
word2 = "ros"   (length 3)

Initialize base cases:

          ""  r  o  s
    ""    0   1  2  3
    h     1   ?  ?  ?
    o     2   ?  ?  ?
    r     3   ?  ?  ?
    s     4   ?  ?  ?
    e     5   ?  ?  ?

Fill table:

i=1, j=1: word1[0]='h', word2[0]='r' → No match
  replace = dp[0][0] = 0
  insert  = dp[1][0] = 1
  delete  = dp[0][1] = 1
  dp[1][1] = 1 + min(0, 1, 1) = 1

i=1, j=2: word1[0]='h', word2[1]='o' → No match
  replace = dp[0][1] = 1
  insert  = dp[1][1] = 1
  delete  = dp[0][2] = 2
  dp[1][2] = 1 + min(1, 1, 2) = 2

i=1, j=3: word1[0]='h', word2[2]='s' → No match
  replace = dp[0][2] = 2
  insert  = dp[1][2] = 2
  delete  = dp[0][3] = 3
  dp[1][3] = 1 + min(2, 2, 3) = 3

          ""  r  o  s
    ""    0   1  2  3
    h     1   1  2  3
    o     2   ?  ?  ?
    r     3   ?  ?  ?
    s     4   ?  ?  ?
    e     5   ?  ?  ?

i=2, j=1: word1[1]='o', word2[0]='r' → No match
  replace = dp[1][0] = 1
  insert  = dp[2][0] = 2
  delete  = dp[1][1] = 1
  dp[2][1] = 1 + min(1, 2, 1) = 2

i=2, j=2: word1[1]='o', word2[1]='o' → Match!
  dp[2][2] = dp[1][1] = 1

i=2, j=3: word1[1]='o', word2[2]='s' → No match
  replace = dp[1][2] = 2
  insert  = dp[2][2] = 1
  delete  = dp[1][3] = 3
  dp[2][3] = 1 + min(2, 1, 3) = 2

          ""  r  o  s
    ""    0   1  2  3
    h     1   1  2  3
    o     2   2  1  2
    r     3   ?  ?  ?
    s     4   ?  ?  ?
    e     5   ?  ?  ?

i=3, j=1: word1[2]='r', word2[0]='r' → Match!
  dp[3][1] = dp[2][0] = 2

i=3, j=2: word1[2]='r', word2[1]='o' → No match
  replace = dp[2][1] = 2
  insert  = dp[3][1] = 2
  delete  = dp[2][2] = 1
  dp[3][2] = 1 + min(2, 2, 1) = 2

i=3, j=3: word1[2]='r', word2[2]='s' → No match
  replace = dp[2][2] = 1
  insert  = dp[3][2] = 2
  delete  = dp[2][3] = 2
  dp[3][3] = 1 + min(1, 2, 2) = 2

          ""  r  o  s
    ""    0   1  2  3
    h     1   1  2  3
    o     2   2  1  2
    r     3   2  2  2
    s     4   ?  ?  ?
    e     5   ?  ?  ?

i=4, j=1: word1[3]='s', word2[0]='r' → No match
  replace = dp[3][0] = 3
  insert  = dp[4][0] = 4
  delete  = dp[3][1] = 2
  dp[4][1] = 1 + min(3, 4, 2) = 3

i=4, j=2: word1[3]='s', word2[1]='o' → No match
  replace = dp[3][1] = 2
  insert  = dp[4][1] = 3
  delete  = dp[3][2] = 2
  dp[4][2] = 1 + min(2, 3, 2) = 3

i=4, j=3: word1[3]='s', word2[2]='s' → Match!
  dp[4][3] = dp[3][2] = 2

          ""  r  o  s
    ""    0   1  2  3
    h     1   1  2  3
    o     2   2  1  2
    r     3   2  2  2
    s     4   3  3  2
    e     5   ?  ?  ?

i=5, j=1: word1[4]='e', word2[0]='r' → No match
  replace = dp[4][0] = 4
  insert  = dp[5][0] = 5
  delete  = dp[4][1] = 3
  dp[5][1] = 1 + min(4, 5, 3) = 4

i=5, j=2: word1[4]='e', word2[1]='o' → No match
  replace = dp[4][1] = 3
  insert  = dp[5][1] = 4
  delete  = dp[4][2] = 3
  dp[5][2] = 1 + min(3, 4, 3) = 4

i=5, j=3: word1[4]='e', word2[2]='s' → No match
  replace = dp[4][2] = 3
  insert  = dp[5][2] = 4
  delete  = dp[4][3] = 2
  dp[5][3] = 1 + min(3, 4, 2) = 3

Final table:

          ""  r  o  s
    ""    0   1  2  3
    h     1   1  2  3
    o     2   2  1  2
    r     3   2  2  2
    s     4   3  3  2
    e     5   4  4  3

Return dp[5][3] = 3 ✓

Minimum edit distance is 3!

Reconstructing the Operations

Similar to LCS backtracking:

Java
public void printOperations(String word1, String word2, int[][] dp) {
    int i = word1.length();
    int j = word2.length();

    while (i > 0 && j > 0) {
        if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
            // Characters match, no operation
            i--;
            j--;
        } else {
            int replace = dp[i - 1][j - 1];
            int insert = dp[i][j - 1];
            int delete = dp[i - 1][j];

            if (replace <= insert && replace <= delete) {
                System.out.println("Replace " + word1.charAt(i - 1) + " with " + word2.charAt(j - 1));
                i--;
                j--;
            } else if (delete <= insert) {
                System.out.println("Delete " + word1.charAt(i - 1));
                i--;
            } else {
                System.out.println("Insert " + word2.charAt(j - 1));
                j--;
            }
        }
    }

    // Handle remaining characters
    while (i > 0) {
        System.out.println("Delete " + word1.charAt(i - 1));
        i--;
    }
    while (j > 0) {
        System.out.println("Insert " + word2.charAt(j - 1));
        j--;
    }
}

For "horse" → "ros":

Delete 'e'
Delete 's'
Delete 'r'

Wait, that's 3 deletes but not optimal path shown in example...

Let me trace backwards from dp[5][3]=3:

i=5, j=3: word1[4]='e', word2[2]='s'
  dp[4][3]=2, dp[5][2]=4, dp[4][2]=3
  Delete wins (dp[4][3]=2), so delete 'e'
  Move to i=4, j=3

i=4, j=3: word1[3]='s', word2[2]='s' → Match
  Move to i=3, j=2

i=3, j=2: word1[2]='r', word2[1]='o'
  dp[2][1]=2, dp[3][1]=2, dp[2][2]=1
  Delete wins (dp[2][2]=1), so delete 'r'
  Move to i=2, j=2

i=2, j=2: word1[1]='o', word2[1]='o' → Match
  Move to i=1, j=1

i=1, j=1: word1[0]='h', word2[0]='r'
  dp[0][0]=0, dp[1][0]=1, dp[0][1]=1
  Replace wins (dp[0][0]=0), so replace 'h' with 'r'
  Move to i=0, j=0

Operations (in reverse order):
1. Replace 'h' with 'r': "horse" → "rorse"
2. Delete 'r': "rorse" → "rose"
3. Delete 'e': "rose" → "ros"

Matches the example!

Complexity Analysis

Time Complexity: O(m × n)

  • Two nested loops
  • Each cell: O(1) work
  • Total: O(m × n)

Space Complexity: O(m × n)

  • DP table of size (m+1) × (n+1)
  • Total: O(m × n)

Space Optimization

Same as LCS: We only need the previous row.

Java
public int minDistance(String word1, String word2) {
    int m = word1.length();
    int n = word2.length();

    int[] dp = new int[n + 1];

    // Initialize first row
    for (int j = 0; j <= n; j++) {
        dp[j] = j;
    }

    for (int i = 1; i <= m; i++) {
        int prev = dp[0];  // Saves dp[i-1][j-1]
        dp[0] = i;         // First column: i deletions

        for (int j = 1; j <= n; j++) {
            int temp = dp[j];  // Save dp[i-1][j]
            if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                dp[j] = prev;
            } else {
                dp[j] = 1 + Math.min(prev, Math.min(dp[j], dp[j - 1]));
            }
            prev = temp;
        }
    }

    return dp[n];
}

Space complexity: O(n)

Real-World Applications

Edit distance (Levenshtein distance) is used for:

  1. Spell checking: Find closest correctly-spelled word
  2. DNA sequence alignment: Compare genetic sequences
  3. Plagiarism detection: Measure text similarity
  4. Speech recognition: Match spoken words to dictionary
  5. Autocorrect: Suggest corrections for typos
  6. Machine translation: Evaluate translation quality

Example: Spell check

User types: "recieve"

Distances to dictionary words (Levenshtein: insert/delete/replace):

  • "receive": 2 (two edits; transposition isn't an allowed single operation)
  • "relieve": 1 (substitute 'c'→'l')
  • "believe": 2 (substitute 'r'→'b' and 'c'→'l')

Suggestion: "relieve" (smallest edit distance)

Variations

Variation 1: Weighted Operations

Different costs for different operations:

  • Insert: cost 1
  • Delete: cost 1
  • Replace: cost 2

Modification:

Java
dp[i][j] = Math.min(dp[i-1][j-1] + replaceCost,
                    dp[i][j-1] + insertCost,
                    dp[i-1][j] + deleteCost);

Variation 2: Delete Distance

Only delete operations allowed (no insert, no replace).

Problem: Minimum deletions from both strings to make them equal.

Solution: Use LCS!

Delete distance = (m - LCS) + (n - LCS) = m + n - 2 * LCS

Key Takeaways

Edit Distance demonstrates:

  • Weighted operations: Each choice has a cost
  • Three-way recurrence: More complex than two-way
  • Practical importance: Real-world applications in many fields

Pattern Recognition:

This is a sequence transformation problem where:

  • State is (position in source, position in target)
  • Recurrence considers multiple operations
  • Base cases are converting to/from empty string

This pattern appears in:

  • Minimum ASCII Delete Sum
  • Wildcard Matching
  • Regular Expression Matching

Edit Distance is one of the most important DP problems due to its wide applicability.

What's Next

Our final 2D problem, 0/1 Knapsack, introduces a different pattern: selection with constraints. Instead of comparing sequences, we'll select items to maximize value under a weight constraint.