在Erlang中实现高效的二分查找

Question

在Erlang中实现高效的二分查找

4

提示：这是我在 Stack Overflow 上的第一篇帖子，如果我做错了什么，请让我知道，我会尝试修复。我已经花了一整天时间来修复这个算法实现，但是我已经无能为力了！

我正在上一门课程，要求我们用三种截然不同的语言实现相同的二分查找算法。对于其中一个，我决定走弯路，尝试使用 Erlang。下面是我的代码：

-export(main/1).

main(_) ->
    io:format("Running trials... ~n", []),
    N = [2 bsl (2 * X + 6) || X <- lists:seq(0, 7)],
    lists:foreach(fun(X) -> io:format("~p: ~p ~n", [X, time_search(X)]) end, N).

time_search(Size) ->
    A = list_to_tuple(create_list(Size)),
    Iterations = 2000000,
    time_search(A, Size + 1, 0, Iterations) / Iterations.

time_search(_, _, Sum, 0) -> Sum;
time_search(A, Key, Sum, IterationNum) ->
    Before = now(),
    bin_search(A, Key),
    time_search(A, Key, Sum + timer:now_diff(now(), Before), IterationNum - 1).

create_list(Size) -> lists:seq(1, Size).

bin_search(A, Key) -> bin_search(A, Key, 1, tuple_size(A)).

bin_search(_, _, Lower, Upper) when Lower > Upper -> false;
bin_search(A, Key, Lower, Upper) ->
    Mid = (Upper + Lower) div 2,
    Item = element(Mid, A),
    if
        Key > Item -> bin_search(A, Key, Mid + 1, Upper);
        Key < Item -> bin_search(A, Key, Lower, Mid - 1);
        true -> true
    end.

所以这是输出结果：

128: 250.8094435 µs
512: 308.7264845 µs
2048: 368.5770285 µs
8192: 425.748134 µs
32768: 477.6267655 µs
131072: 533.340876 µs
524288: 601.023178 µs
2097152: 661.099404 µs

与我的ruby实现相比，它显然远远低于O（lg n）

编辑：好吧，也许不完全是orders of magnitude ... 它似乎相当对数，但仍然感觉不对...

Ruby：

128: 10.4756 µs
512: 11.8172 µs
2048: 13.5328 µs
8192: 15.3139 µs
32768: 17.0915 µs
131072: 18.8721 µs
524288: 21.5237 µs
2097152: 21.7187 µs

之前我在使用Lists，但了解到它们没有O(1)的检索时间，因此我转而使用Tuples。是什么导致我的Erlang运行效率如此低下？

为了避免疑惑，这里提供一下我的Ruby实现:

def p5_BinarySearch
  n = Array.new(8){ |i| 2 << (2 * i + 6) }
  n.each { |x|
    time = timeSearch x
    puts "#{x}: #{time.round 4} µs"
  }
end

def timeSearch(size)
  values = createArray size
  iterations = 2e7.to_int
  totalTime = 0
  iterations.times{
    before = Time.now
    binarySearch(values, size + 1)
    totalTime += Time.now - before
  }
  totalTime * 1e7 / iterations
end

def createArray(size)
  (1..size).to_a
end

def binarySearch(values, key, low = 0, high = values.length - 1)
  return false if low > high
  mid = (low + high) / 2
  return binarySearch(values, key, mid + 1, high) if key > values[mid]
  return binarySearch(values, key, low, mid - 1) if key < values[mid]
  true
end

if __FILE__ == $0
  puts "Running trials..."
  p5_BinarySearch
end

- Connor Clark

2

你是如何运行Erlang测试的？作为一个escript吗？ - Lukas

是的，我只是在命令行中使用escript <文件名>来运行它。 - Connor Clark

2

Escript解释代码。将代码保存在.erl文件中，编译并从shell中运行。不要忘记添加模块声明。导出最好使用-export([main/1])。.erl文件必须与模块名称相同。 - rvirding

1个回答

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Vinod · Accepted Answer

算法看起来不错，但在这种情况下使用escript解释器是个问题。使用escript -c运行escript，这样脚本在运行之前会被编译而不是解释，或者可以像rvirding建议的那样创建一个Erlang模块。通过这种方式，您可以看到它执行得非常快。

另外最好使用os:timestamp()而不是使用now()来获取准确的时间值。更准确的方法是使用timer:tc函数。

{Time,_} = timer:tc(fun bin_search/2, [A, Key]),
time_search(A, Key, Sum + Time, IterationNum - 1).

在我的机器上，结果大致如此。（没有 -c 大约需要 450 微秒）

128: 1.105
512: 1.2005
2048: 1.4015
8192: 1.5145
32768: 1.7225
131072: 1.8515
524288: 2.024
2097152: 2.188