给定两个字符串,判断它们是否只有一个编辑操作的距离。

12

最近我遇到了这个问题:

Given two strings, return true if they are one edit away from each other,else return false.
An edit is insert/replace/delete a character. 
Ex. {"abc","ab"}->true, {"abc","adc"}->true, {"abc","cab"}->false

解决这个问题的一种方法是使用动态规划找到两个字符串之间的编辑距离,并检查它是否为1。这将需要O(N2)时间。有没有一种线性时间的方法,因为我们只需要检查它们是否相差1次编辑?

我下面写的代码适用于大多数情况,但在 {"m",""} 这样的案例中会失败。

public boolean isOneEditDistance(String s1,String s2){
    if(s1==null || s2==null)
        return true;
    if(s1.equals(s2))
        return false;
    if (Math.abs(s1.length() - s2.length()) > 1)
        return false;
    int i = -1;
    int j = -1;
    while (true)
    {
        i++;
        j++;
        if (i == s1.length())
            return true;
        if (s1.charAt(i) == s2.charAt(j))
            continue;
        if (i != j)
            return false;
        if (s1.length() < s2.length())
            i--;
        else
            j--;
    }
}

“one edit away” 对我来说不是很清楚。像“ab”和“abdc”这样的字符串算作一次编辑吗? - amit dayama
1
这很明显 - 那将是2次删除,因此需要2次编辑。 - BrokenGlass
@amitdayama:不行。要将“ab”转换为“abdc”,您需要进行两次插入(或者反过来进行两次删除)。 - codewarrior
2
这是一个与http://stackoverflow.com/questions/12632825/how-to-efficiently-check-if-the-levenshtein-edit-distance-between-two-string-is?rq=1相同的副本吗? - BrokenGlass
20个回答

13

以下是O(n)时间复杂度找到一次编辑的解决方案。我已经在实现中涵盖了以下情况。

  1. 两个输入字符串的长度差异不应超过1。
  2. 当字符串长度相同时,不同字符的数量不应超过1。
  3. 如果长度差为1,则可以在短字符串中插入一个字符或从长字符串中删除一个字符。考虑到这一点,不同字符的数量不应超过1。
private static boolean isOneEdit(String first, String second) {
    // if the input string are same
    if (first.equals(second))
        return false;

    int len1 = first.length();
    int len2 = second.length();
    // If the length difference of the stings is more than 1, return false.
    if ((len1 - len2) > 1 || (len2 - len1) > 1  ) {
        return false;
    }
    int i = 0, j = 0;
    int diff = 0;
    while (i<len1 && j<len2) {
        char f = first.charAt(i);
        char s = second.charAt(j);
        if (f != s) {
            diff++;
            if (len1 > len2)
                i++;
            if (len2 > len1)
                j++;
            if (len1 == len2)
                i++; j++;
        }
        else{
            i++; j++;
        }
        if (diff > 1) {
            return false;
        }
    }
    // If the length of the string is not same. ex. "abc" and "abde" are not one edit distance.
    if (diff == 1 && len1 != len2 && (i != len1 || j != len2)) {
        return false;
    }
    return true;
}

我已经在中间更新了代码。请再次检查。它返回的是 true。这段代码处理这种情况的部分是 if (len1 > len2) i++; - YoungHobbit
1
代码在 {"abc","afcc"} 情况下失败。我认为只有当两个对应的字符相等时,i 和 j 才应该同时递增。 - codewarrior
1
代码已更新,请检查。感谢您注意到这个问题。 - YoungHobbit
1
如果长度相同,我们需要为i,j都加上增量。已经添加了这部分。例如abcabd,它返回的是false - YoungHobbit
另外,我们可能需要考虑两个字符串完全相同时的情况。 如果(diff == 0) return false; - vasa
@vasa 我已经添加了这个条件,最初我认为使用者可以在调用之前检查它。但是检查它并向调用者返回正确的结果是有意义的。感谢您的反馈。 - YoungHobbit

4
在动态规划方法中,通常会使用矩阵。行对应一个字符串,列对应另一个字符串。关键是要找到从左上角到右下角的最便宜路径。在任何时候,水平或垂直转换代表插入。
您的问题相同,但路径是受限制的。对于 k 次插入/删除,路径必须限制在 k 对角线内。除此之外,经典的 DP 算法应该可行。复杂度是线性的。

你能详细阐述一下你的答案以便更深入地理解吗? - Dilip Kumar
通常情况下,对于这个问题,会构建一个二维矩阵如此。如果最多只允许一个编辑距离,你不需要填满整个矩阵,只需在对角线周围填写一条带状区域(对角线以外的部分都超过一个编辑距离)。这个带状区域的大小与输入尺寸成线性关系。 - Ami Tavory

1
static boolean isEditDistanceOne(String s1, String s2)
    {
        // Find lengths of given strings
        int m = s1.length(), n = s2.length();

        // If difference between lengths is more than
        // 1, then strings can't be at one distance
        if (Math.abs(m - n) > 1)
            return false;

        int count = 0; // Count of edits

        int i = 0, j = 0;
        while (i < m && j < n)
        {
            // If current characters don't match
            if (s1.charAt(i) != s2.charAt(j))
            {
                if (count == 1)return false;

                // If length of one string is
                // more, then only possible edit
                // is to remove a character
                if (m > n) i++;
                else if (m< n) j++;
                else //Iflengths of both strings is same
                {
                    i++;
                    j++;
                }

                // Increment count of edits 
                count++;
            }

            else // If current characters match
            {
                i++;
                j++;
            }
        }

        // If last character is extra in any string
        if (i < m || j < n)
            count++;

        return count == 1;
    }

1

我用C++解决了这个问题,这是 Khalid Habib 在 this 答案中想要表达的正确版本。下面是解决方案(我也将此解决方案添加到了 Github 上,你可以在 这里 关注)。

#include<bits/stdc++.h>
using namespace std;

// checks if there is only one one different in both arrays
bool oneReplaceAway(string s1, string s2){
    bool firstChangeDone = false;
    int l1 = s1.length();
    // l1 == l2 already
    // int l2 = s2.length();
    int l2 = l1;
    int i=0, j=0;

    while(i<l1 && j<l2){
        if(s1[i] != s2[j]){
            // if first change is already checked then return false as there are more than one dissimilarities
            if(firstChangeDone){
                //cout<<"IGI@"<< i<<" "<<j<<"\n";
                return false;   
            }
            firstChangeDone = true;
        }
        i++;
        j++;
    }
    return true;
}


// checks if one word can be added to one string to create the other one
bool oneInsertAway(string s1, string s2){
    bool firstChangeDone = false;
    int l1 = s1.length();
    int l2 = s2.length();

    int i=0, j=0;

    while(i<l1 && j<l2){
        if(s1[i]!=s2[j]){
            // if the chars at i and j are not equal and i!=j, that means one change is already checked, i.e., it is the second change
            if(i!=j)
                return false;
            j++;
        }
        else{
            i++;
            j++;
        }
    }
    return true;
}

// checks of two strings are One Edit Away
bool oneEditAway(string s1, string s2) {
    int l1 = s1.length();
    int l2 = s2.length();

    // cout<<s1<<" - "<<l1<<"\n"<<s2<<" - "<<l2<<"\n"<<(l1==l2)<<"\n";
    if(l1 == l2){
        return oneReplaceAway(s1, s2);
    }
    else if(l1+1 == l2){
        return oneInsertAway(s1, s2);
    }
    else if(l2+1 == l1){
        return oneInsertAway(s2, s1);
    }
    else
        return false;
}


int main(){
    int t;
    cin>>t;

    while(t--){
        string s1,s2;
        cin>>s1>>s2;

        // cout<<oneEditAway(s1, s2)<<"\n";
        string ans = oneEditAway(s1, s2)==1?"One Edit Awway":"Not one Edit Away";
        cout<<ans<<"\n";
    }
    return 0;
}

1

Java版本可能如下所示:

public static boolean oneEdit(String w1, String w2) 
{
    char[] word1= w1.trim().toCharArray();
    char[] word2 = w2.trim().toCharArray();

    int length1=word1.length;
    int length2=word2.length;

    if(Math.abs(length2-length1) > 1) return false;

    Arrays.sort(word1);
    Arrays.sort(word2);

    int length = word1.length >= word2.length? word2.length:word1.length; //take the minimum length

    int falseCounter=0;
    for(int i=0; i < length; i++ ) {
        if(word1[i] != word2[i] && ++falseCounter > 1){
            return false;
        }
    }
    return true;
}

你能解释一下以上代码中使用的排序如何处理示例 abxde 和 axbdef 吗? - vCillusion

1
一个Java版本。
public class Test {

public static void main(String[] args) {

    System.out.println(fun("car", "cart"));
    System.out.println(fun("cat", "bat"));
    System.out.println(fun("balck", "back"));
}

/*
 * Modifications : add, delete, update
 * 
 * i/p Example Add: a = car b = cart
 * 
 * delete : a = balck b = back
 * 
 * update: a = cat b = bat
 * 
 */
public static boolean fun(String a, String b) {
    Boolean isTestPositive = Boolean.FALSE;
    if (a == null || b == null) {
        return isTestPositive;
    }
    if (a.equals(b)) {
        // No Modifications required
        return isTestPositive;
    }
    // Start comparing
    char[] arrayForA = a.toCharArray();
    char[] arrayForB = b.toCharArray();

    return testCase(arrayForA, arrayForB);

}

public static boolean testCase(char[] a, char[] b) {
    int correctionCount = 0;
    int aLen = a.length;
    int bLen = b.length;
    if (Math.abs(aLen - bLen) > 1) {
        return Boolean.FALSE;
    }
    int minLen = Math.min(aLen, bLen);
    for (int i = 0; i < minLen; i++) {
        if (a[i] != b[i]) {
            ++correctionCount;
            if (correctionCount > 1) {
                return Boolean.FALSE;
            }
            // Test Delete case
            if (b.length > i + 1 && a[i] == b[i + 1]) {
                return testDeleteCase(Arrays.copyOfRange(a, i, a.length - 1),
                        Arrays.copyOfRange(b, i + 1, b.length - 1));
            } else if (a.length > i + 1 && b[i] == a[i + 1]) {
                return testDeleteCase(Arrays.copyOfRange(a, i + 1, a.length - 1),
                        Arrays.copyOfRange(b, i, b.length - 1));
            }

        }

    }
    return Boolean.TRUE;
}

public static boolean testDeleteCase(char[] a, char[] b) {
    int aLen = a.length;
    int bLen = b.length;
    if (Math.abs(aLen - bLen) >= 1) {
        return Boolean.FALSE;
    }
    int minLen = Math.min(aLen, bLen);
    for (int i = 0; i < minLen; i++) {
        if (a[i] != b[i]) {

            return Boolean.FALSE;
        }
    }
    return Boolean.TRUE;
}}

0
    boolean oneEditAway(String first, String second) {
    if (first.length() == second.length()) {
        //call a function which replce the character from the string
    } else if (first.length() + 1 == second.length()) {
        //call a function which remove the character from string
    } else if (first.length() - 1 == second.length()) {
        //call a function which insert the character in the string
    }
    return false;
}

0
这是 Ruby 的实现:
def one_away(s1, s2)
  return false if (s1.size - s2.size).abs > 1

  missed_count = 0
  counter = 0
  s1.each_char do |c|
    if !s2[counter].nil? && (s2[counter] != c)
      missed_count += 1
    end
    counter += 1
    return false if missed_count > 1
  end
  true
end

p one_away('pale', 'bake') #=> false

在执行one_away('string', '1string')时出现错误。 - tnaught

0

Swift 编写的答案,逐步解释:

func isOneEdit(str1: String, str2: String) -> Bool {

// check if they are the same
if str1 == str2 {
    return true
}

let difference = abs(str1.count - str2.count)

// check if the difference between then is bigger than 1
if difference > 1 {
    return false
}

// lets iterate over the words

var i = 0
var j = 0
var changes = 0

while i < str1.count && j < str2.count {
    let char1 = str1[str1.index(str1.startIndex, offsetBy: i)]
    let char2 = str2[str1.index(str2.startIndex, offsetBy: j)]

    // if the difference is 1 we need to move just one index (the one from the bigger word)
    // this is just necessary when the char1 and char2 are different
    if difference == 1 && char1 != char2 {
        if str1.count > str2.count {
            i += 1
        } else {
            j += 1
        }
        changes += 1
    } else {
        // if chars are equal (in this step we don't care about the difference)
        // we move both indexes.
        i += 1
        j += 1

        if char1 != char2 {
            changes += 1
        }
    }
}

return changes <= 1
}

0

在这里,您可以找到用Swift编写的解决方案。

func isOneEdit(str1: String, str2: String) -> Bool {

  //The length difference between two input strings should not be more than 1.
  let diff = abs(str1.characters.count - str2.characters.count)
  if diff > 1 {
    return false
  }

  //When the length of the strings is same, the number of different chars should not be more than 1.
  if diff == 0 {
    var numberOfEdits = 0
    for (char1, char2) in zip(str1.characters, str2.characters) {
      if char1 != char2 {
        numberOfEdits++
      }
    }
    return numberOfEdits < 2
  }

  //If the length difference is 1.
  //then either one char can be inserted in the short string or deleted from the longer string. 
  //Considering that, the number of different char should not be more than 1.

  return str1.rangeOfString(str2) != nil || str2.rangeOfString(str1) != nil



  //return str1.isSubstring(str2) || str2.isSubstring(str1)

}

这个函数需要 O(n) 的时间和常数空间。


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接