.NET optimization: F# Sieve of Eratosthenes

Question

Tags: .net, optimization, f#, sieve-of-eratosthenes, f#-interactive

3

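I am experimenting with F# using the FSI REPL. I noticed a huge difference in efficiency between two slightly different versions of my naive beginner's Sieve of Eratosthenes. The first version has an extra if statement: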

let rec sieve max current pList =
    match current with
    | 2 -> sieve max (current + 1) (current::pList)
    | 3 -> sieve max (current + 2) (current::pList)
    | n when n < max ->
        if (n % 5 = 0) || (n % 3 = 0) then
            sieve max (current + 2) (current::pList)
        else if (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
            sieve max (current + 2) (pList)
        else
            sieve max (current + 2) (current::pList)
    | n when n >= max
        -> pList
    | _
        ->  printfn "Error: list length: %A, current: %A" (List.length pList) current
            [-1]

The second version does not have it:

let rec sieve max current pList =
    match current with
    | 2 -> sieve max (current + 1) (current::pList)
    | 3 -> sieve max (current + 2) (current::pList)
    | n when n < max ->
        if (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
            sieve max (current + 2) (pList)
        else
            sieve max (current + 2) (current::pList)
    | n when n >= max
        -> pList
    | _
        ->  printfn "Error: list length: %A, current: %A" (List.length pList) current
            [-1]

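The implementation with the extra if branch is actually slower, even though it looks like it should be faster. I timed both implementations in the REPL with the following commands: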

#time

sieve 200000 2 []

#time

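On my machine, the implementation with the extra if statement takes about 2 minutes, while the one without it takes about 1 minute per run. How is this possible? Adding an if statement that handles multiples of 3 or 5 makes it slower than simply mapping over the whole list and then checking whether any number in the list of primes is a divisor. Why? Is it just that F# is optimized for working with lists?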
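The #time directive used above is specific to F# Interactive. For a compiled program, a comparable measurement could be sketched with System.Diagnostics.Stopwatch. The helper name timeIt and the sample workload are assumptions made for illustration; they are not part of the question.

```fsharp
open System.Diagnostics

// Hypothetical helper (not from the question): runs f once and
// returns its result together with the elapsed milliseconds.
let timeIt (f: unit -> 'a) =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    result, sw.ElapsedMilliseconds

// Sample workload chosen only to demonstrate the helper.
let total, ms = timeIt (fun () -> List.sum [ 1 .. 1000 ])
printfn "sum = %d, elapsed = %d ms" total ms
```

With the question's function in scope, the same helper could wrap the call as timeIt (fun () -> sieve 200000 2 []).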

- LSM07

3 Answers

2

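The extra if performs an extra computation but does not interrupt the flow of execution; the program carries on to the second if. So in effect you have only added some useless computation to the function. No wonder it takes longer now! You may be thinking of code like this: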

if (a)
    return b;
if (x)
    return y;
else 
    return z;

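This does not work in F# the way it does in C#, Java, or whatever other language you have in mind. F# has no "early return". There are no "statements": everything is an expression, and everything has a result.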
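As a small illustration of that point, the sketch below treats a whole if/elif/else chain as one expression whose value becomes the function's result. The function name describe and its strings are made up for this example.

```fsharp
// Hypothetical function for illustration only.
// The if/elif/else chain is a single expression; exactly one branch
// is evaluated and its value is the result of describe.
let describe n =
    if n % 15 = 0 then "multiple of 3 and 5"
    elif n % 3 = 0 then "multiple of 3"
    elif n % 5 = 0 then "multiple of 5"
    else "neither"

printfn "%s" (describe 9)   // prints "multiple of 3"
```

There is no return keyword involved: each branch simply evaluates to a string, and all branches must have the same type.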

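Adding a useless computation like this actually produces a warning. If you pay attention to warnings, you should notice one that says something like "this value is being discarded". The compiler is trying to help you by pointing at the useless function call.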

To fix this, replace the second if with elif:

if (n % 5 = 0) || (n % 3 = 0) then
    sieve max (current + 2) (current::pList)
elif (pList |> (List.map (fun n -> current % n = 0)) |> List.contains true) then
    sieve max (current + 2) (pList)
else
    sieve max (current + 2) (current::pList)

This makes the second branch execute only when the first one fails.

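P.S. Come to think of it, an if without an else like this should not even compile, because its result type cannot be determined. I am not sure what is going on there.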

P.P.S. List.map f |> List.contains true is better expressed as List.exists f. It is shorter and more efficient.
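A minimal sketch of that equivalence, using sample values (knownPrimes and candidate) that are assumptions for illustration. List.map builds a complete intermediate list of booleans before List.contains scans it, whereas List.exists stops at the first element that satisfies the predicate.

```fsharp
// Sample data chosen for illustration; not taken from the question.
let knownPrimes = [ 7; 5; 3; 2 ]
let candidate = 21

// Original formulation: allocates a list of booleans, then searches it.
let viaMap =
    knownPrimes
    |> List.map (fun p -> candidate % p = 0)
    |> List.contains true

// Suggested formulation: no intermediate list, stops at the first divisor found.
let viaExists =
    knownPrimes |> List.exists (fun p -> candidate % p = 0)

printfn "%b %b" viaMap viaExists   // prints "true true"
```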

- Fyodor Soikin

Actually, that was a copy-paste error. I have corrected it to show both versions. - LSM07

Oh, OK... :-) - Fyodor Soikin

0

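Of course, lists are not necessarily efficient. I once wrote a function that creates a Boolean array in which every prime index is true and every non-prime index is false: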

let sieveArray limit =
    let a = Array.create limit true
    let rec setArray l h s x =
        if l <= h then
            a.[l] <- x
            setArray (l + s) h s x
    a.[0] <- false; a.[1] <- false
    for i = 0 to a.Length - 1 do
        if a.[i]
        then setArray (i + i) (a.Length - 1) i false
    a

To get the actual list of primes, you can map the indices and filter the resulting array:

sieveArray limit
|> Seq.mapi (fun i x -> i, x)
|> Seq.filter snd
|> Seq.map fst
|> Seq.toList
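For comparison, here is a self-contained sketch of the same array-based idea that also returns the list of primes directly. It is a variant, not the answer's code: the name primesBelow is an assumption, and it starts crossing off at i * i and stops the outer loop once i * i reaches the limit, which is a common refinement of the sieve.

```fsharp
// Hypothetical variant of the array sieve above (not the answer's code).
// Returns all primes strictly below limit, in ascending order.
let primesBelow limit =
    if limit < 2 then
        []
    else
        let isPrime = Array.create limit true
        isPrime.[0] <- false
        isPrime.[1] <- false
        let mutable i = 2
        while i * i < limit do
            if isPrime.[i] then
                // Smaller multiples of i were already crossed off by smaller primes.
                let mutable j = i * i
                while j < limit do
                    isPrime.[j] <- false
                    j <- j + i
            i <- i + 1
        [ for n in 0 .. limit - 1 do
              if isPrime.[n] then yield n ]

printfn "%A" (primesBelow 20)
```

Like sieveArray, this treats limit as exclusive, so primesBelow 20 covers the numbers 0 through 19.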

- dumetrulo

Content provided by Stack Overflow.

- Alex Netkachov · Accepted Answer

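The extra if statement in the first sieve may have been intended as a shortcut, but it actually changes the behavior. Instead of removing the numbers divisible by 3 and 5, it adds them to the result. This is easy to see by comparing the outputs: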

1st sieve: [19; 17; 15; 13; 11; 9; 7; 5; 3; 2]
2nd sieve: [19; 17; 13; 11; 7; 5; 3; 2]

I assume you wanted something like this:

if (n % 5 = 0) || (n % 3 = 0) then
    sieve max (current + 2) (pList)

However, in that case the result will not include 5 (obviously). So the correct code is:

if (n <> 5 && n % 5 = 0) || (n <> 3 && n % 3 = 0) then
    sieve max (current + 2) (pList)

Please check the performance of the code above; it should be fine.
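Putting the corrected condition into the full function, together with the elif and List.exists suggestions from the other answer, might look like the sketch below. It is assembled from pieces in this thread rather than posted by any of the authors, and the catch-all branch that simply returns the list is a simplification made here.

```fsharp
// Sketch combining the fixes discussed in this thread.
// Primes are accumulated in descending order, as in the question.
let rec sieve max current pList =
    match current with
    | 2 -> sieve max (current + 1) (current :: pList)
    | 3 -> sieve max (current + 2) (current :: pList)
    | n when n < max ->
        if (n <> 5 && n % 5 = 0) || (n <> 3 && n % 3 = 0) then
            // Multiple of 3 or 5 (other than 3 and 5 themselves): skip it.
            sieve max (current + 2) pList
        elif pList |> List.exists (fun p -> current % p = 0) then
            // Divisible by a prime found earlier: skip it.
            sieve max (current + 2) pList
        else
            sieve max (current + 2) (current :: pList)
    | _ -> pList // current >= max: done

printfn "%A" (sieve 20 2 [])   // prints [19; 17; 13; 11; 7; 5; 3; 2]
```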