Finding the shortest path in one-dimensional space

Question

8

A one-dimensional array S may contain any number of elements, each belonging to the set

U:{A,B,C,D,E}

Repetitions are allowed.
Example:

S  = {E,B,D,C,A,D,A,E,E,D,B,B,A,C}

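The question is:

For any given array S, what is the most efficient way to determine the shortest range/path that contains all elements of the set U? Note that the array cannot be sorted.

In the example above, the shortest path is the one connecting the first 5 elements of array S.

Edit:
1) The number of elements in the set U is not constant.

Thanks in advance.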

- Adham Atta

2

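What was your first attempt? Why do you think it is not optimal enough, or where did you get stuck? - Marcin

Also, have you looked into substring search algorithms? - Marcin

To improve your chances of getting an answer, you should at least put in some effort and show where you are so far (which options you considered, why they might or might not work, etc.). - Aleadam

If the first element of S were B instead of E, what would the answer be? {B,D,C,A,D,A,E}? - James

@James, I think it would be EDBBAC. - Dante May Code

@Dante: Oh yes, I didn't see that. That clears it up, though. So this is not a substring problem. - James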

5 Answers

1

If I understand the question correctly, I think what you need to do is this (language-agnostic):

int partLen <- U.length;
do {
    Vector subSets <- S.partition(partLen);
    foreach set I in subSets
        if I.isEqualTo(U) then
            return true;
    // only grow the window once every subset of this length has failed
    partLen <- partLen + 1;
} while (partLen <= S.length);
return false;

partition splits S into (contiguous) subsets of the given length, and isEqualTo compares the sets correctly, ignoring duplicates.
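A minimal Python sketch of this brute-force idea, in case it helps. The function name `shortest_cover_bruteforce` is made up for illustration, `partition` is replaced by plain slicing over every overlapping window, and I assume "isEqualTo" can be relaxed to "the window contains every element of U". It returns the indices of the range instead of just true/false.

```python
def shortest_cover_bruteforce(s, u):
    """Return (start, end), inclusive, of the shortest window of s that
    contains every element of u, or None if no such window exists."""
    required = set(u)
    n = len(s)
    # Try window lengths from the smallest possible one upwards.
    for length in range(len(required), n + 1):
        # Slide over every (overlapping) window of this length.
        for start in range(n - length + 1):
            window = s[start:start + length]
            if required <= set(window):
                return start, start + length - 1
    return None


S = list("EBDCADAEEDBBAC")
print(shortest_cover_bruteforce(S, "ABCDE"))  # (0, 4)
```

Each window is rebuilt as a set from scratch, so this is far from efficient; it is mostly useful as a reference to check faster solutions against.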

- James

Does the partition function produce overlapping subsets? - Adham Atta

Yes, for a set S, if its size is 5, there will be 10 subsets. - James

1

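Here is a simple algorithm that scans the array only once, continually checking whether the range covered right now is shorter than any range seen before.

For simplicity, I assume we can map A, B, C, D and E to the integers 0-4 so that we can use them as array indices. I haven't checked it carefully, so please do a mental or actual run on an example or two to make sure it does what you want.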

//Cell 0 is the last index at which we saw an A, cell 1 " " saw a B, etc.
int[] mostRecent = new int[U.length];
mostRecent.setAllValsTo(POSITIVE_INFINITY);

int shortestRange = POSITIVE_INFINITY; //We are trying to minimize this number.
int minIndex = 0; //The beginning index of the range
int maxIndex = POSITIVE_INFINITY; //The ending index of the range.

for(int i=0; i< S.length; i++) {
    int currentValue = S[i]; //This value will be 0-4, corresponding to A-E

    mostRecent[currentValue] = i;

    int currentMax = mostRecent.findMax(); //end of current range (still infinity until every value has been seen)
    int currentMin = mostRecent.findMin(); //beginning of current range
    int currentRange = currentMax - currentMin;

    if(currentRange < shortestRange) {
        shortestRange = currentRange;
        minIndex = currentMin;
        maxIndex = currentMax;
    }
}

//minIndex and maxIndex now carry the starting and ending indices, use them as you see fit.
return shortestRange;

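This is an O(nk) algorithm, where n=S.length and k=U.length. There is still plenty of room for optimization, but I don't know whether the worst-case time complexity can be lowered.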
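For anyone who wants to run it, here is a rough Python sketch of the same last-seen-index idea. The name `shortest_cover_last_seen` is hypothetical, and instead of POSITIVE_INFINITY it uses `None` for "not seen yet", so nothing is recorded until every element of U has appeared at least once.

```python
def shortest_cover_last_seen(s, u):
    """Return (min_index, max_index), inclusive, of the shortest range of s
    covering every element of u, or None if some element never appears."""
    symbols = list(dict.fromkeys(u))          # drop duplicates, keep order
    slot = {symbol: k for k, symbol in enumerate(symbols)}
    most_recent = [None] * len(symbols)       # last index seen per symbol
    best = None
    for i, symbol in enumerate(s):
        k = slot.get(symbol)
        if k is None:
            continue                          # symbol outside U: ignore it
        most_recent[k] = i
        if None in most_recent:
            continue                          # some symbol not seen yet
        current_min = min(most_recent)        # O(k) scan per element
        current_max = i                       # i is always the newest index
        if best is None or current_max - current_min < best[1] - best[0]:
            best = (current_min, current_max)
    return best


print(shortest_cover_last_seen(list("EBDCADAEEDBBAC"), "ABCDE"))  # (0, 4)
```

The `min` over `most_recent` is what makes this O(nk); the rest of the loop body is constant work per element.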

- Cephron

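Although this algorithm finds the shortest length of the path, it does not find the path itself. - Adham Atta

Oh, that's easy. Just remember the minimum and maximum indices as you go. I'll edit the code. - Cephron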

1

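First find the distinct elements in the array, which is an O(n) operation. Then use a sliding-window approach to find the smallest span that contains all of these elements.

You can see how to find the minimum window here: http://tech-queries.blogspot.com/2010/12/finding-minimum-window-in-array-which.html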
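A short Python sketch of these two steps, written from the description above and not taken from the linked post. The function name `min_window_all_distinct` is my own, and I assume the array elements are hashable.

```python
from collections import Counter


def min_window_all_distinct(s):
    """Return (start, end), inclusive, of the shortest window of s that
    contains every distinct value occurring in s, or None if s is empty."""
    needed = len(set(s))        # step 1: count the distinct elements, O(n)
    counts = Counter()          # occurrences of each value inside the window
    have = 0                    # distinct values currently inside the window
    start = 0
    best = None
    for end, value in enumerate(s):
        counts[value] += 1
        if counts[value] == 1:
            have += 1
        # Step 2: while the window covers everything, record it and
        # shrink it from the left.
        while have == needed:
            if best is None or end - start < best[1] - best[0]:
                best = (start, end)
            counts[s[start]] -= 1
            if counts[s[start]] == 0:
                have -= 1
            start += 1
    return best


print(min_window_all_distinct(list("EBDCADAEEDBBAC")))  # (0, 4)
```

Both `start` and `end` only ever move forward, so the scan is O(n) overall, independent of the number of distinct elements.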

- Akash Agrawal

0

Here is how I would do it (pseudocode):

let counters[] be an array such that 
counters[j] = number of occurrences of character j, 
where j = 0 for 'A', j = 1 for 'B', etc.

build counters[] by scanning the original string s

let positions[j][] be an array listing the positions occupied by 
character j in the original string s; note the size of 
positions[j][] is equal to counters[j]

build positions[j][] by scanning the original string s;

let currentPositions[] be an array such that 
positions[j][currentPositions[j]] gives the position of the next 
occurrence of character j in the original string s

initially currentPositions[j] = 0 for every j = [0 .. u.length]

let bestLength = s.length
let bestMin = 0
let bestMax = 0
for i in [0 .. s.length] {
    for j in [0 .. u.length] {
        if ( 
          positions[j][currentPositions[j]] < i and 
          currentPositions[j] + 1 < counters[j]
        )
          currentPositions[j]++
    }
    let min = s.length
    let max = 0
    for j in [0 .. u.length] {
        curPos = positions[j][currentPositions[j]]
        if (curPos > max) let max = curPos
        if (curPos < min) let min = curPos
    }
    if (max - min + 1 < bestLength) {
        let bestMin = min
        let bestMax = max
        let bestLength = max - min + 1
    }
}

the shortest path is that starting at bestMin, ending at bestMax, 
and having length bestLength

The complexity is O(nk), where n = s.length and k = u.length.

- MarcoS

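Note that Adham said the size of U is not constant - which means your complexity is O(nk), where k is the size of U. (See my solution, which seems similar to yours.) - Cephron

@Cephron: you're right: the complexity is O(nk) (I edited my answer), and our solutions are similar. - MarcoS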


- Dante May Code · Accepted Answer

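Interesting homework, but you still need to code it yourself.

The good news is that you haven't told us which language you are using, so I take that as a sign that you have decided to code it yourself, which is good.

My best attempt:

Keep 2 pointers for the substring (range), one for the start of the range (the smaller index) and one for the end (the larger index). Both initially point to the beginning of the array.

Keep a list that records, for each of ABCDE, how many times it occurs within the range.

Iterate end from left to right.

For each character, increase its counter in the list by 1. If the result (the increased count) is > 1, check whether start points to the same character. If it does, move start forward and decrease that counter by 1, and then, while start points to a character whose counter is > 1, keep moving start forward and decreasing.

If the counters for ABCDE in the list are all >= 1, we have found a candidate range. Compare it with the shortest length so far (if any); if it is smaller, update the shortest length and record the start index of the new shortest range.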
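The steps above could be sketched in Python roughly as follows. This is only one reading of the description: the function name `shortest_range_two_pointers` is invented, I assume every element of S belongs to U (as the question states), and the "does start point to the same character" check is folded into a single loop that trims start while its character occurs more than once in the range, which has the same effect.

```python
def shortest_range_two_pointers(s, u):
    """Return (start, end), inclusive, of the shortest range of s that
    contains every element of u, or None if there is no such range.
    Assumes every element of s is also an element of u."""
    counts = {symbol: 0 for symbol in u}   # occurrences inside the range
    covered = 0                            # symbols with a count >= 1
    start = 0
    best = None
    for end, symbol in enumerate(s):
        counts[symbol] += 1
        if counts[symbol] == 1:
            covered += 1
        # Drop leading characters that also occur later in the range.
        while counts[s[start]] > 1:
            counts[s[start]] -= 1
            start += 1
        # Every symbol present: this range is a candidate.
        if covered == len(counts) and (
            best is None or end - start < best[1] - best[0]
        ):
            best = (start, end)
    return best


print(shortest_range_two_pointers(list("EBDCADAEEDBBAC"), "ABCDE"))  # (0, 4)
```

Because trimming never lets a counter drop to zero, `covered` only grows, and start and end each move forward at most n times, so the whole scan is O(n).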