如何在O(n)时间内找到最小的正连续子序列?

11

我们有一个用于在O(n)时间内查找给定序列中最大正子序列的算法。能否有人提出类似的算法来查找最小正连续子序列。

例如, 如果给定的序列是1,2,3,4,5,则答案应为1。 [5,-4,3,5,4] -> 1是元素[5,-4]的最小正和。


嗯...它看起来有点像子集和问题:https://en.wikipedia.org/wiki/Subset_sum_problem,只是你想要最小值,而不是检查一个子集是否具有特定的总和。我不确定这些算法中的哪一个可以合理地适应,但值得研究一下。 - Amy Teegarden
可能存在一个O(n log n)的算法,但我怀疑这个问题不存在O(n)的算法。 - Juan Lopes
更改符号并找到最大总和,与我们计算最大总和子数组的方式相同(Kadane算法!)。然后更改数组中元素的符号以使其恢复初始状态。 - Shubham Sharma
1
@ShubhamSharma:你的方法会找到一个负数子序列,如果存在的话,而不是最小的正数子序列。 - j_random_hacker
可能是Is it possible to find two numbers whose difference is minimum in O(n) time的重复问题(如果有疑问,请考虑部分和数组)。 - n. m.
显示剩余3条评论
3个回答

4
这个问题不存在这样的算法。此问题的下限是O(n log n)。我将通过将元素不同问题(实际上是它的非负变体)简化为此问题来证明它。
假设我们有一个O(n)算法来解决此问题(最小非负子数组)。
我们想要找出一个数组(例如A=[1, 2, -3, 4, 2])是否只有不同的元素。为了解决这个问题,我可以构造一个由相邻元素之间的差组成的数组(例如A'=[1, -5, 7, -2]),并运行我们拥有的O(n)算法。原始数组仅具有不同的元素,当且仅当最小非负子数组大于0时。
如果我们有一个O(n)算法来解决你的问题,我们将拥有一个O(n)算法来解决元素不同问题,而我们知道在图灵机上不可能实现这一点。

这个缩减能否转化为对原始的变量的缩减?我没有看到明显的方法。这些看似非常相似的问题可能具有不同的复杂度! - j_random_hacker
@stgatilov 如果我们能找到最小的差异,我们不一定能构造出最小的子数组。但我们对相反的断言感兴趣:如果我们能够构造出最小的子数组,那么我们就能找到最小的差异。这当然是正确的。 - n. m.

3
我们可以采用O(n log n)的算法,具体如下:
假设我们有一个数组prefix,其中索引i存储从0到i的数组A的和,因此子数组(i, j)的和为prefix[j] - prefix[i-1]。
因此,为了找到以索引j结尾的最小正子数组,我们需要找到小于prefix[j]且x
伪代码:
int[]prefix = new int[A.length];
prefix[0] = A[0];
for(int i = 1; i < A.length; i++)
    prefix[i] = A[i] + prefix[i - 1];
int result = MAX_VALUE;
BinarySearchTree tree;
for(int i = 0; i < A.length; i++){
    if(A[i] > 0)
       result = min(result, A[i];
    int v = tree.getMaximumElementLessThan(prefix[i]);
    result = min(result, prefix[i] - v);
    tree.add(prefix[i]);
}

1
我相信有一种O(n)算法,见下文。
注意:它有一个比例因子,在实际应用中可能不太吸引人:它取决于要处理的(输入)值,请参阅代码中的备注。
private int GetMinimumPositiveContiguousSubsequenc(List<Int32> values)
    {
      // Note: this method has no precautions against integer over/underflow, which may occur
      // if large (abs) values are present in the input-list.

      // There must be at least 1 item.
      if (values == null || values.Count == 0)
        throw new ArgumentException("There must be at least one item provided to this method.");

      // 1. Scan once to:
      //    a) Get the mimumum positive element;
      //    b) Get the value of the MAX contiguous sequence
      //    c) Get the value of the MIN contiguous sequence - allowing negative values: the mirror of the MAX contiguous sequence.
      //    d) Pinpoint the (index of the) first negative value.

      int minPositive = 0;

      int maxSequence = 0;
      int currentMaxSequence = 0;

      int minSequence = 0;
      int currentMinSequence = 0;

      int indxFirstNegative = -1;

      for (int k = 0; k < values.Count; k++)
      {
        int value = values[k];

        if (value > 0)
          if (minPositive == 0 || value < minPositive)
            minPositive = value;
          else if (indxFirstNegative == -1 && value < 0)
            indxFirstNegative = k;

        currentMaxSequence += value;
        if (currentMaxSequence <= 0)
          currentMaxSequence = 0;
        else if (currentMaxSequence > maxSequence)
          maxSequence = currentMaxSequence;

        currentMinSequence += value;
        if (currentMinSequence >= 0)
          currentMinSequence = 0;
        else if (currentMinSequence < minSequence)
          minSequence = currentMinSequence;
      }

      // 2. We're done if (a) there are no negatives, or (b) the minPositive (single) value is 1 (or 0...).
      if (minSequence == 0 || minPositive <= 1)
        return minPositive;

      // 3. Real work to do.
      // The strategy is as follows, iterating over the input values:
      // a) Keep track of the cumulative value of ALL items - the sequence that starts with the very first item.
      // b) Register each such cumulative value as "existing" in a bool array 'initialSequence' as we go along.
      //    We know already the max/min contiguous sequence values, so we can properly size that array in advance.
      //    Since negative sequence values occur we'll have an offset to match the index in that bool array
      //    with the corresponding value of the initial sequence.
      // c) For each next input value to process scan the "initialSequence" bool array to see whether relevant entries are TRUE.
      //    We don't need to go over the complete array, as we're only interested in entries that would produce a subsequence with
      //    a value that is positive and also smaller than best-so-far.
      //    (As we go along, the range to check will normally shrink as we get better and better results.
      //     Also: initially the range is already limited by the single-minimum-positive value that we have found.)

      // Performance-wise this approach (which is O(n)) is suitable IFF the number of input values is large (or at least: not small) relative to
      // the spread between maxSequence and minSeqence: the latter two define the size of the array in which we will do (partial) linear traversals.
      // If this condition is not met it may be more efficient to replace the bool array by a (binary) search tree.
      // (which will result in O(n logn) performance).
      // Since we know the relevant parameters at this point, we may below have the two strategies both implemented and decide run-time
      // which to choose.
      // The current implementation has only the fixed bool array approach.

      // Initialize a variable to keep track of the best result 'so far'; it will also be the return value.
      int minPositiveSequence = minPositive;

      // The bool array to keep track of which (total) cumulative values (always with the sequence starting at element #0) have occurred so far,
      // and the 'offset' - see remark 3b above.
      int offset = -minSequence;
      bool[] initialSequence = new bool[maxSequence + offset + 1];

      int valueCumulative = 0;

      for (int k = 0; k < indxFirstNegative; k++)
      {
        int value = values[k];
        valueCumulative += value;
        initialSequence[offset + valueCumulative] = true;
      }

      for (int k = indxFirstNegative; k < values.Count; k++)
      {
        int value = values[k];

        valueCumulative += value;
        initialSequence[offset + valueCumulative] = true;

        // Check whether the difference with any previous "cumulative" may improve the optimum-so-far.

        // the index that, if the entry is TRUE, would yield the best possible result.
        int indexHigh = valueCumulative + offset - 1;
        // the last (lowest) index that, if the entry is TRUE, would still yield an improvement over what we have so far.
        int indexLow = Math.Max(0, valueCumulative + offset - minPositiveSequence + 1);

        for (int indx = indexHigh; indx >= indexLow; indx--)
        {
          if (initialSequence[indx])
          {
            minPositiveSequence = valueCumulative - indx + offset;

            if (minPositiveSequence == 1)
              return minPositiveSequence;

            break;
          }
        }
      }

      return minPositiveSequence;
    }
  }

乍一看,这对我来说似乎非常有帮助。我很快会详细阅读它。 - Abhay Kumar

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接