需要明确解释范围更新和范围查询二进制索引树。

12

我已经学习了一些关于二叉索引树范围更新和范围查询的教程,但我完全无法理解它们。我不明白为什么需要建立另一棵树。

请有人用简单易懂的英语解释并提供一个示例吗?

3个回答

10

试图以更加直观的方式解释(我的理解方式)。我将其分为四个步骤:

假设更新介于A和B之间,具有V,并且查询是针对任何索引<=X的前缀查询

第一个范围更新/点查询树(T1)

首先是一个简单的范围更新/点查询树。当您使用V将A更新到B时,在实践中,您将V添加到A位置,因此任何前缀查询X>=A都会受到影响。然后从B + 1中删除V,因此任何查询X >= B + 1都不会看到添加到A的V。这里没有什么意外。

前缀查询到范围更新/点树

T1.sum(X) 是对此第一个树在X处的点查询。我们乐观地假设X之前的每个元素都等于X处的值。这就是为什么我们执行T1.sum(X)* X的原因。显然,这不完全正确,所以我们:

使用修改后的范围更新/点查询树修复结果(T2)

在更新范围时,我们还会更新第二棵树,以告诉我们需要修复第一个T1.sum(X)* X查询的量。此更新包括从任何查询X>= A中删除(A-1)* V。然后我们为X>=B添加回B*V。我们之所以这样做,是因为对第一棵树的查询不会返回X>= B + 1的V(因为T1.add(B + 1,-V)),因此我们需要以某种方式告诉有一个面积为(B-A+1)* V 的矩形适用于任何查询X>= B + 1。我们已经从A中删除了(A-1)* V,我们只需要将B * V添加回B + 1。

总结

update(A, B, V):
    T1.add(A, V)         # add V to any X>=A
    T1.add(B+1, -V)      # cancel previously added V from any X>=B+1

    T2.add(A, (A-1)*V)   # add a fix for (A-1)s V that didn't exist before A
    T2.add(B+1, -B*V)    # remove the fix, and add more (B-A+1)*V to any query 
                         # X>=B+1. This is really -(A-1)*V -(B-A+1)*V, but it 
                         # simplifies to -B*V

sum(X):
    return T1.sum(X)*X - T2.sum(X)

3
我见过的最好的解释。您能否解释一下这个陈述:“我们乐观地假设X之前的每个元素都等于X处的值。” 举个例子可能会更有帮助! - Wasim Thabraze

2
让我试着解释一下。
  1. Why do we need a second tree? I cannot answer this question. Strictly speaking, I cannot prove that it is impossible to solve this problem using only one binary index tree(and I have never seen such a proof anywhere).

  2. How can one come up with this method? Again, I don't know. I'm not the inventor of this algorithm. So I cannot tell why does it look exactly like this. The only thing I will try to explain is why and how this method works.

  3. To understand this algorithm better, the first we should do is to forget about how the binary index tree itself works. Let's treat it as just a black box that supports two operations: update one element and perform a range sum query in O(log n) time. We just want to use one or more such "black boxes" to build a data structure that can perform range updates and queries efficiently.

  4. We will maintain two binary index trees: T1 and T2. I will use the following notation: T.add(pos, delta) for performing a point update in a position pos by delta value and T.get(pos) for a sum [0 ... pos]. I claim that if an update function looks like this:

    void update(left, right, delta)
        T1.add(left, delta)
        T1.add(right + 1, -delta);
        T2.add(left, delta * (left - 1))
        T2.add(right + 1, -delta * right);
    

    and a range query is answered this way(for a prefix [0 ... pos]):

    int getSum(pos)
        return T1.sum(pos) * pos - T2.sum(pos)
    

    then the result is always correct.

  5. To prove its correctness, I will prove the following statement: each update changes the answer appropriately(it gives a proof by induction for all operations, because initially everything is filled with zeros and the correctness is obvious). Let's assume that we had a left, right, DELTA update and now we are performing pos query(that is, 0 ... pos sum). Let's consider 3 cases:
    i) pos < L. The update does not affect this query. The answer is correct(due to the induction hypothesis).
    ii) L <= pos <= R. This update will add DELTA * pos - (left - 1) * pos. It means that DELTA is added pos - L + 1 times. That's exactly how it should be. Thus, this case is also handled correctly.
    iii) pos > R. This update will add 0 + DELTA * right - DELTA * (left - 1). That is, DELTA is added exactly right - left + 1 times. It is correct, too.

    We have just shown the correctness of the induction step. Thus, this algorithm is correct.

  6. I have only shown how to answer [0, pos] sum queries. But answering [left, right] query is easy now: it is just getSum(right) - getSum(left - 1).

就是这样,我已经证明了这个算法的正确性。现在让我们尝试编写它并看看它是否有效(这只是一个草图,所以代码质量可能不是非常好):

#include <bits/stdc++.h>

using namespace std;

// Binary index tree.
struct BIT {
  vector<int> f;

  BIT(int n = 0) {
    f.assign(n, 0);
  }

  int get(int at) {
    int res = 0;
    for (; at >= 0; at = (at & (at + 1)) - 1)
      res += f[at];
    return res;
  }

  void upd(int at, int delta) {
    for (; at < f.size(); at = (at | (at + 1)))
      f[at] += delta;
  }
};

// A tree for range updates and queries.
struct Tree {
  BIT f1;
  BIT f2;

  Tree(int n = 0): f1(n + 1), f2(n + 1) {}

  void upd(int low, int high, int delta) {
    f1.upd(low, delta);
    f1.upd(high + 1, -delta);
    f2.upd(low, delta * (low - 1));
    f2.upd(high + 1, -delta * high);
  }

  int get(int pos) {
    return f1.get(pos) * pos - f2.get(pos);
  }

  int get(int low, int high) {
    return get(high) - (low == 0 ? 0 : get(low - 1));
  }
};

// A naive implementation.
struct DummyTree {
  vector<int> a;

  DummyTree(int n = 0): a(n) {}

  void upd(int low, int high, int delta) {
    for (int i = low; i <= high; i++)
      a[i] += delta;
  }

  int get(int low, int high) {
    int res = 0;
    for (int i = low; i <= high; i++)
      res += a[i];
    return res;
  }
};

int main() {
  ios_base::sync_with_stdio(0);
  int n = 100;
  Tree t1(n);
  DummyTree t2(n);
  for (int i = 0; i < 10000; i++) {
    int l = rand() % n;
    int r = rand() % n;
    int v = rand() % 10;
    if (l > r)
      swap(l, r);
    t1.upd(l, r, v);
    t2.upd(l, r, v);
    for (int low = 0; low < n; low++)
      for (int high = low; high < n; high++)
    assert(t1.get(low, high) == t2.get(low, high));
  }
  return 0;
}

哦,是的。我忘记了时间复杂度分析。但在这里它很简单:我们对二叉索引树进行了恒定数量的查询,因此每个查询的时间复杂度为O(log n)


你能解释一下你是怎么得到这个 T2.add(left, delta * (left - 1)) T2.add(right + 1, -delta * right); 的吗? - user3739818
@user3739818 我是怎么得到它的呢?嗯,构建算法是一个创造性的过程,所以一个公正的答案是:你可以用前缀和画一些图来看出应该是这样的。在你弄清楚了这一点之后,你可以严谨地证明它。 - kraskevich

1

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接