This one was quite fun.
case class Coord(x: Double, y: Double) {
  // Euclidean distance to another coordinate
  def dist(c: Coord): Double = Math.sqrt(Math.pow(x - c.x, 2) + Math.pow(y - c.y, 2))
}

// Orders coordinates by their distance to the reference point x
class CoordOrdering(x: Coord) extends Ordering[Coord] {
  def compare(a: Coord, b: Coord): Int = a.dist(x) compare b.dist(x)
}

// Returns the n smallest elements of xs (according to ord), in ascending order
def top[T](xs: Seq[T], n: Int)(implicit ord: Ordering[T]): Seq[T] = {
  // Insert e into the sorted sequence acc, keeping at most n elements
  def insert(acc: Seq[T], e: T): Seq[T] = {
    val (l, r) = acc.span(x => ord.lt(x, e))
    (l ++ (e +: r)).take(n)
  }
  xs.drop(n).foldLeft(xs.take(n).sorted)(insert)
}
It has not been tested thoroughly. Call it like this:
val grid = (1 to 250000).map { _ => Coord(Math.random * 5, Math.random * 5) }
val x = Coord(Math.random * 5, Math.random * 5)
top(grid, 3)(new CoordOrdering(x))
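Since the code above is described as not thoroughly tested, here is a minimal sanity check. It repeats the definitions from the answer so that it stands alone, and compares `top` against simply sorting the whole grid. The wrapper object `TopCheck`, the fixed seed, the grid size of 1000 and the target point are all assumptions made for this sketch.

```scala
object TopCheck {
  case class Coord(x: Double, y: Double) {
    def dist(c: Coord): Double = Math.sqrt(Math.pow(x - c.x, 2) + Math.pow(y - c.y, 2))
  }

  class CoordOrdering(x: Coord) extends Ordering[Coord] {
    def compare(a: Coord, b: Coord): Int = java.lang.Double.compare(a.dist(x), b.dist(x))
  }

  def top[T](xs: Seq[T], n: Int)(implicit ord: Ordering[T]): Seq[T] = {
    def insert(acc: Seq[T], e: T): Seq[T] = {
      val (l, r) = acc.span(x => ord.lt(x, e))
      (l ++ (e +: r)).take(n)
    }
    xs.drop(n).foldLeft(xs.take(n).sorted)(insert)
  }

  // Fixed seed so the check is reproducible (the seed value is arbitrary).
  val rnd = new scala.util.Random(42)
  val grid: Seq[Coord] = Vector.fill(1000)(Coord(rnd.nextDouble() * 5, rnd.nextDouble() * 5))
  val target: Coord = Coord(2.5, 2.5)
  val ord = new CoordOrdering(target)

  // Result of the incremental top-n versus a full sort of the grid.
  val viaTop: Seq[Coord] = top(grid, 3)(ord)
  val viaSort: Seq[Coord] = grid.sorted(ord).take(3)

  // Compare distances rather than points, so that ties cannot cause a spurious mismatch.
  val agrees: Boolean = viaTop.map(_.dist(target)) == viaSort.map(_.dist(target))
}
```

If `TopCheck.agrees` is true, the three nearest distances found by `top` match those found by the full sort for this particular grid. That is a spot check, not a proof.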
Edit: extending this to (pre)compute the distances is trivial.
val zippedGrid = grid map { _.dist(x) } zip grid

object ZippedCoordOrdering extends Ordering[(Double, Coord)] {
  def compare(a: (Double, Coord), b: (Double, Coord)) = a._1 compare b._1
}

top(zippedGrid, 3)(ZippedCoordOrdering).unzip._2
You could use
def distSquare(c: Coord) = Math.pow(x - c.x, 2) + Math.pow(y - c.y, 2)
as the metric. That basically avoids computing .sqrt every time. - artur grzesiak

The top method does what you want. Perhaps you could repurpose its source code? https://spark.apache.org/docs/0.8.1/api/core/org/apache/spark/rdd/RDD.html - The Archetypal Paul
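The two comments can be combined in one rough sketch: rank by squared distance (which orders points exactly as the true distance does, since sqrt is monotonic) and keep only the n best candidates in a bounded max-heap instead of re-inserting into a sorted Seq. This is not Spark's code, and whether the linked `top` is implemented this way is not verified here. The names `NearestSketch`, `nearest` and `byDist` are hypothetical.

```scala
import scala.collection.mutable

object NearestSketch {
  case class Coord(x: Double, y: Double) {
    // Squared Euclidean distance: same ranking as dist, but without the sqrt.
    def distSquare(c: Coord): Double = {
      val dx = x - c.x
      val dy = y - c.y
      dx * dx + dy * dy
    }
  }

  // Orders (squaredDistance, point) pairs by the precomputed squared distance.
  val byDist: Ordering[(Double, Coord)] = new Ordering[(Double, Coord)] {
    def compare(a: (Double, Coord), b: (Double, Coord)): Int =
      java.lang.Double.compare(a._1, b._1)
  }

  // Hypothetical helper: the n points of xs nearest to target, nearest first.
  def nearest(xs: Iterable[Coord], target: Coord, n: Int): Seq[Coord] = {
    // PriorityQueue.head is the largest element, i.e. the worst candidate kept so far.
    val heap = mutable.PriorityQueue.empty[(Double, Coord)](byDist)
    xs.foreach { c =>
      val d = c.distSquare(target)
      if (heap.size < n) {
        heap.enqueue((d, c))
      } else if (n > 0 && d < heap.head._1) {
        // The new point beats the worst kept candidate, so replace it.
        heap.dequeue()
        heap.enqueue((d, c))
      }
    }
    // The heap's iteration order is unspecified, so sort the few survivors explicitly.
    heap.toList.sorted(byDist).map(_._2)
  }
}
```

With the grid from the answer, a call would look like `NearestSketch.nearest(grid, x, 3)`, assuming `grid` and `x` are built from this sketch's `Coord`. Each element costs at most one heap update of size n, rather than a linear scan of the kept sequence.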