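I am evaluating which data structure is best for representing sparse vectors in Scala. Each sparse vector holds a list of indices and one value per index. I wrote a small benchmark, and it seems to show that Array[(Long, Double)] takes far less space than two parallel arrays. Is that correct? Is my benchmark correct? (I would not be surprised if I did something wrong somewhere.)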
    import java.lang.management.ManagementFactory
    import java.text.NumberFormat

    object TestSize {

      val N = 100000000
      val formatter: NumberFormat = java.text.NumberFormat.getIntegerInstance

      def twoParallelArrays(): Unit = {
        val Z1 = Array.ofDim[Long](N)
        val Z2 = Array.ofDim[Double](N)
        Z1(N-1) = 1
        Z2(N-1) = 1.0D
        println(Z2(N-1) - Z1(N-1))
        val z1 = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage.getUsed
        val z2 = ManagementFactory.getMemoryMXBean.getNonHeapMemoryUsage.getUsed
        println(s"${formatter.format(z1)} ${formatter.format(z2)}")
      }

      def arrayOfTuples(): Unit = {
        val Z = Array.ofDim[(Long, Double)](N)
        Z(N-1) = (1, 1.0D)
        println(Z(N-1)._2 - Z(N-1)._1)
        val z1 = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage.getUsed
        val z2 = ManagementFactory.getMemoryMXBean.getNonHeapMemoryUsage.getUsed
        println(s"${formatter.format(z1)} ${formatter.format(z2)}")
      }

      def main(args: Array[String]): Unit = {
        // Comment one or the other to look at the results
        //arrayOfTuples()
        twoParallelArrays()
      }
    }
Names like Z1 can confuse the compiler (and the casual reader). With few exceptions, follow the Java naming conventions. - Bob Dalgleish
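One thing that may be worth checking in the benchmark above: Array.ofDim on a reference type such as (Long, Double) allocates only the array of references, and every slot starts out as null. Since arrayOfTuples assigns just one element, its measurement probably reflects the reference array plus a single tuple, not N tuples. The following is a minimal sketch of that behaviour, not a corrected benchmark; the object name NullSlotsSketch and the small size n are made up for illustration.

```scala
object NullSlotsSketch {
  // Hypothetical small size so the sketch runs instantly;
  // the benchmark in the question uses 100,000,000.
  val n = 1000

  // Array.ofDim on a reference element type allocates the array only;
  // no Tuple2 objects are created and every slot is null.
  val unfilled: Array[(Long, Double)] = Array.ofDim[(Long, Double)](n)

  // Array.fill evaluates its by-name argument once per slot,
  // so every slot refers to its own Tuple2 object.
  val filled: Array[(Long, Double)] = Array.fill(n)((1L, 1.0))

  def main(args: Array[String]): Unit = {
    println(s"null slots in unfilled: ${unfilled.count(_ == null)}")
    println(s"null slots in filled:   ${filled.count(_ == null)}")
  }
}
```

If the intent is to compare the two layouts fairly, the tuple array would presumably need to be populated in every slot (for example with Array.fill as above) before heap usage is read.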