最近点对问题的竖直扫描算法

Question

最近点对问题的竖直扫描算法

4

标准的最近点对扫描线算法被广泛采用，具体描述可以参照这里。该算法使用一条扫描线在点集上水平扫过，只保留当前距离最近的点。

通常情况下，需要按照x坐标进行初始排序，并且边界框（在C++实现中为std::set）需要按y坐标进行排序，如此C++实现所示。

然而，在我的实现中，由于疏忽，我不小心按照y坐标进行了初始排序。出人意料的是，这种方式似乎仍然有效。你可以在这里看到我的实现，它本质上遵循了标准的最近点对扫描线算法的一个略微修改的版本。

#include <iostream>
#include <set>
#include <algorithm>
#include <math.h>
#include <vector>

using namespace std;

#define x second
#define y first

typedef pair<long long, long long> pll;

inline double dist(pll p1, pll p2)
{
    return sqrt((double) (p2.y - p1.y)*(p2.y - p1.y) + (p2.x - p1.x)*(p2.x - p1.x));
}


int main(int argc, const char * argv[])
{
    int numPoints;
    cin >> numPoints;

    vector <pll> points;
    points.resize(numPoints);

    for (int i = 0; i < numPoints; i++)
    {
        cin >> points[i].x >> points[i].y;
    }

    //Sorts the points by y coordinate (because y is first)
    sort(points.begin(), points.end());

    double shortestDistSoFar = INFINITY;
    set <pll> boundingBox; //Bounding box maintained by y-coordinate
    boundingBox.insert(points[0]);

    int left = 0;

    pll best1, best2;

    for (int i = 1; i < numPoints; i++)
    {
        //Maintain only points to the left of the current point whose distance is less than bestDist
        while ((left < i) && (points[i].x - points[left].x > shortestDistSoFar))
        {
            boundingBox.erase(points[left]);
            left++;
        }

        //Consider only points within bestDist of the current point
        for (auto it = boundingBox.lower_bound(pll(points[i].y - shortestDistSoFar, points[i].x - shortestDistSoFar));
             it != boundingBox.end() && it->y <= points[i].y + shortestDistSoFar; it++)
        {
            if (dist(*it, points[i]) < shortestDistSoFar)
            {
                shortestDistSoFar = dist(*it, points[i]);
                best1 = *it;
                best2 = points[i];
            }
        }

        boundingBox.insert(points[i]);
    }

    return 0;
}

按递增y坐标的顺序访问每个点，并对于每个点，从y-bestDist到y+bestDist检查所有点，在发现新的最短距离时更新bestDist，并从集合中删除x坐标距离当前点大于bestDist的点。

这个修改后的算法仍然有效（我只测试了几个案例），运行时间仍为O(N lgN)吗？

- 1110101001

1

从你的距离函数中移除sqrt，它是不必要且昂贵的。 - tclamb

1个回答

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- kraskevich · Accepted Answer

它是正确的(因为只有当点能够安全删除时才会从集合中删除)。然而，它的时间复杂度为O(n ^ 2)，因为并不总是在应该删除它们的时候删除点。

这个简单的生成器(用Python3编写)：

from sys import argv

n = int(argv[1])
dy = 1
dx = -n
print(n)
for i in range(n):
  print(dx * i, dy * i)

创建一个测试用例，使你的代码对于任何n执行O(n ^ 2)次操作。

这个测试用例的思路非常简单：如果按y坐标排序后迭代点（按x坐标递减顺序），则它们永远不会从集合中删除。因此，如果它们在y坐标上彼此靠近而在x坐标上相距较远，则每次都要遍历整个集合。这就是为什么要检查所有点对（正好有n * (n - 1) / 2个点对）。

现在让我们看看它在实践中的效果：首先，我使用“g++ -std=c++11 -O2 closest_pair.cpp”编译了您的代码。之后，我运行了一堆测试：

temp$ python3 gen.py 10000 > input
temp$ time ./a.out < input

real    0m0.805s
user    0m0.797s
sys     0m0.008s

temp$ python3 gen.py 30000 > input
temp$ time ./a.out < input

real    0m7.195s
user    0m7.198s
sys     0m0.004s

temp$ python3 gen.py 50000 > input
temp$ time ./a.out < input

real    0m23.711s
user    0m23.725s
sys     0m0.004s

正如您所看到的，它的工作速度非常缓慢。