欧拉计划问题14（考拉兹猜想）

Question

欧拉计划问题14（考拉兹猜想）

calgorithmcollatz

14

以下是针对正整数集合定义的迭代序列：

n -> n/2 (n为偶数) n -> 3n + 1 (n为奇数)

使用上述规则，从13开始生成以下序列：

13 40 20 10 5 16 8 4 2 1

可以看到，这个序列（从13开始到1结束）包含10个项。虽然它还没有被证明（Collatz问题），但人们认为所有起始数字最终都以1结尾。

在一百万以内，哪个起始数字产生的链最长？

注意：一旦链开始，允许项超过一百万。

我尝试使用暴力方法在C中编写了解决方案。然而，当尝试计算113383时，我的程序似乎停滞不前。请指教 :)

#include <stdio.h>
#define LIMIT 1000000

int iteration(int value)
{
 if(value%2==0)
  return (value/2);
 else
  return (3*value+1);
}

int count_iterations(int value)
{
 int count=1;
 //printf("%d\n", value);
 while(value!=1)
 {
  value=iteration(value);
  //printf("%d\n", value);
  count++;
 }
 return count;
}

int main()
{
 int iteration_count=0, max=0;
 int i,count;


 for (i=1; i<LIMIT; i++)
 {
  printf("Current iteration : %d\n", i);
  iteration_count=count_iterations(i);
  if (iteration_count>max)
   {
   max=iteration_count;
   count=i;
   }

 }

 //iteration_count=count_iterations(113383); 
 printf("Count = %d\ni = %d\n",max,count);

}

- paradox

暴力破解对我有效。为什么？ - Pratik Deoghare

请查看：https://dev59.com/l3E95IYBdhLWcg3wQ7qc - Jonathan Leffler

9

113383需要247步才能完成，它生成的最大数是2482111348。这个数字超出了有符号32位整数的范围（因此会得到负数和其他问题）。 - Jonathan Leffler

生成的以1百万为起始值的最大数字仅略低于570亿。步骤最多的次数是524;虽然它与最大数字无关。我知道起始数字是什么，但你也应该能找到。（使用“bc”，生成值列表“start = X，max = Y，count = Z”不到10分钟，确保没有多余的输出。原始C应该可以轻松完成这个时间。但是，“bc”将毫不费力地处理超级荒谬的大数字；编写处理大于64位整数的C代码并不容易。） - Jonathan Leffler

在2015年IOCCC比赛中，有一项获胜作品可以计算具有任意大值（也称为bignums）的Collatz序列。请搜索schweikhardt。多么无耻的广告 :-) - Jens

8个回答

9

请注意，您的暴力解决方案经常会重复计算相同的子问题。例如，如果您从10开始，您会得到5 16 8 4 2 1;但如果您从20开始，您会得到20 10 5 16 8 4 2 1。如果您在计算10时缓存该值，则不必再次计算。

（这被称为动态规划。）

- Jesse Beder

我理解这个概念，但不太确定如何将其转化为代码。也许需要一个查找表？ - paradox

@Nick 我不同意“正确术语是”的说法。请看维基百科文章。记忆化只是可以在DP中使用的一种方法。 - user166390

1

这不是他挂掉的原因，原因是 int 不够大（请参见下面得票较少的答案）。 - Motti

@pst，DP利用递归和记忆化技术来寻找最优解。 - Nick Dandoulakis

@Nick，我在术语上采取了更广泛的方法。是的，缓存被称为记忆化，但它们都属于动态规划的范畴，这应该是他应该学习的。（当然，我完全错过了32位int业务，那才是实际的问题。）这也引出了一个问题，你是否应该发布答案然后立即去睡觉 :) - Jesse Beder

1

我一段时间前解决了这个问题，幸运的是我还保留着我的代码。如果你不想看到剧透，请不要阅读代码：

#include <stdio.h>

int lookup[1000000] = { 0 };

unsigned int NextNumber(unsigned int value) {
  if ((value % 2) == 0) value >>= 1;
  else value = (value * 3) + 1;
  return value;
}

int main() {
  int i = 0;
  int chainlength = 0;
  int longest = 0;
  int longestchain = 0;
  unsigned int value = 0;
  for (i = 1; i < 1000000; ++i) {
    chainlength = 0;
    value = i;
    while (value != 1) {
      ++chainlength;
      value = NextNumber(value);
      if (value >= 1000000) continue;
      if (lookup[value] != 0) {
        chainlength += lookup[value];
        break;
      }
    }

    lookup[i] = chainlength;

    if (longestchain < chainlength) {
      longest = i;
      longestchain = chainlength;
    }
  }
  printf("\n%d: %d\n", longest, longestchain);
}

time ./a.out

[don't be lazy, run it yourself]: [same here]

real    0m0.106s
user    0m0.094s
sys     0m0.012s

- user253984

值 >>= 1 这是左移一位并赋值吗？ - paradox

= 1 是除以2的操作。只写 /= 2 就足够了，编译器会自动进行优化，这是我过早优化的简单表现。

- user253984

我注意到你的代码中有一个错误。Chainlength 应该从1开始而不是0。 - paradox

谢谢你指出来。无论如何，由于问题要求最高的数字，而不是最多的步骤，结果仍然相同，只是步骤数可能会少一些（而真正的程序员始终从0开始计数;-)）。 - user253984

我尝试了一个类似的程序，但最初使用哈希而不是数组。结果...它没有起作用。只是一直在运行。 - Tim Potapov

1

在C#中刚刚测试过，似乎113383是第一个值，32位的int类型变得太小，无法存储链中的每个步骤。

尝试使用unsigned long来处理这些大数值吧 ;)

- Thibault Falise

真的，但请注意在C语言中long可能与int相同，应该使用long long。 - Motti

@Motti：uintmax_t 似乎是一个合理的选择。 - Jonathan Leffler

@Jonathan，我对uintmax_t不太熟悉。我看到它是C99的一部分，这可能是原因，因为我使用的是不定义它的C++。 - Motti

1

@Motti：说得好 - 但是C++标准也没有定义long long :) - Jonathan Leffler

0

正如已经提到的，最简单的方法是使用一些记忆化技术，避免重新计算尚未计算过的内容。你可能会感兴趣地知道，如果你从一个小于一百万的数字开始（目前还没有发现循环，并且人们已经探索了更大的数字），那么就不会出现循环。

要将其转化为代码，你可以考虑使用Python的方式：

MEMOIZER = dict()

def memo(x, func):
  global MEMOIZER
  if x in MEMOIZER: return MEMOIZER[x]

  r = func(x)

  MEMOIZER[x] = r

  return r

记忆化是一种非常通用的方案。

对于Collatz猜想来说，由于数字可以变得非常大，因此您可能会受到一些困扰，并且可能会耗尽可用内存。

传统上使用缓存处理这个问题，您只需要缓存最后的n个结果（量身定制占用给定数量的内存），当您已经缓存了n个项目并希望添加新的项目时，您就会丢弃旧的那个。

对于这个猜想可能还有另一种可用的策略，虽然实现起来有点困难。基本思想是您只有两种方法可以到达给定数字x：

从2*x
从(x-1)/3

因此，如果您缓存2*x和(x-1)/3的结果，则没有必要再缓存x了，因为它不会再被调用（除非您希望在最后打印序列...但这仅会发生一次）。我让您自己利用这一点，使您的缓存不会太大 :)

- Matthieu M.

0

我在C#中的努力，运行时间<1秒，使用LinqPad:

var cache = new Dictionary<long, long>();
long highestcount = 0;
long highestvalue = 0;
for (long a = 1; a < 1000000; a++)
{
    long count = 0;
    long i = a;
    while (i != 1)
    {
        long cachedCount = 0;
        if (cache.TryGetValue(i, out cachedCount)) //See if current value has already had number of steps counted & stored in cache
        {
            count += cachedCount; //Current value found, return cached count for this value plus number of steps counted in current loop
            break;
        }
        if (i % 2 == 0)
            i = i / 2;
        else
            i = (3 * i) + 1;
        count++;
    }
    cache.Add(a, count); //Store number of steps counted for current value
    if (count > highestcount)
    {
        highestvalue = a;
        highestcount = count;
    }
}
Console.WriteLine("Starting number:" + highestvalue.ToString() + ", terms:" + highestcount.ToString());

- Paul

0

解决原问题中的 unsigned int 问题。

添加用于存储预计算值的数组。

include <stdio.h>                                                                                                                                     
#define LIMIT 1000000

unsigned int dp_array[LIMIT+1];

unsigned int iteration(unsigned int value)
{
 if(value%2==0)
  return (value/2);
 else
  return (3*value+1);
}

unsigned int count_iterations(unsigned int value)
{
 int count=1;
 while(value!=1)
 {
 if ((value<=LIMIT) && (dp_array[value]!=0)){
   count+= (dp_array[value] -1);
   break;
  } else {
   value=iteration(value);
   count++;
  }
 }
 return count;
}

int main()
{
 int iteration_count=0, max=0;
 int i,count;
 for(i=0;i<=LIMIT;i++){
  dp_array[i]=0;
 }

 for (i=1; i<LIMIT; i++)
 {
//  printf("Current iteration : %d \t", i);
  iteration_count=count_iterations(i);
  dp_array[i]=iteration_count;
//  printf(" %d \t", iteration_count);
  if (iteration_count>max)
   {
   max=iteration_count;
   count=i;
   }
//  printf(" %d \n", max);

 }

 printf("Count = %d\ni = %d\n",max,count);

}

输出： Count = 525 i = 837799

- Sayak Rana

-1

Haskell解决方案，运行时间2秒。

thomashartman@yucca:~/collatz>ghc -O3 -fforce-recomp --make collatz.hs
[1 of 1] Compiling Main             ( collatz.hs, collatz.o )
Linking collatz ...
thomashartman@yucca:~/collatz>time ./collatz
SPOILER REDACTED
real    0m2.881s

-- 或许如果使用哈希表而不是映射，我可能会更快地得到它。

import qualified Data.Map as M
import Control.Monad.State.Strict
import Data.List (maximumBy)
import Data.Function (on)

nextCollatz :: Integer -> Integer
nextCollatz n | even n = n `div` 2
               | otherwise = 3 * n + 1

newtype CollatzLength = CollatzLength Integer
  deriving (Read,Show,Eq,Ord)

main = print longestCollatzSequenceUnderAMill 
longestCollatzSequenceUnderAMill = longestCollatzLength [1..1000000]
-- sanity checks
tCollatzLengthNaive = CollatzLength 10 == collatzLengthNaive 13 
tCollatzLengthMemoized = (CollatzLength 10) == evalState (collatzLengthMemoized 13) M.empty

-- theoretically could be nonterminating. Since we're not in Agda, we'll not worry about it.
collatzLengthNaive :: Integer -> CollatzLength
collatzLengthNaive 1 = CollatzLength 1
collatzLengthNaive n = let CollatzLength nextLength = collatzLengthNaive (nextCollatz n)
                       in  CollatzLength $ 1 + nextLength

-- maybe it would be better to use hash here?
type CollatzLengthDb = M.Map Integer CollatzLength
type CollatzLengthState = State CollatzLengthDb 

-- handy for testing
cLM :: Integer -> CollatzLength
cLM n = flip evalState M.empty $ (collatzLengthMemoized n) 

collatzLengthMemoized :: Integer -> CollatzLengthState CollatzLength
collatzLengthMemoized 1 = return $ CollatzLength 1
collatzLengthMemoized n = do
  lengthsdb <- get
  case M.lookup n lengthsdb of 
    Nothing -> do let n' = nextCollatz n 
                  CollatzLength lengthN' <- collatzLengthMemoized n'
                  put $ M.insert n' (CollatzLength lengthN') lengthsdb
                  return $ CollatzLength $ lengthN' + 1
    Just lengthN -> return lengthN

longestCollatzLength :: [Integer] -> (Integer,CollatzLength)
longestCollatzLength xs = flip evalState M.empty $ do 
  foldM f (1,CollatzLength 1) xs
  where f maxSoFar@(maxN,lengthMaxN) nextN = do
          lengthNextN <- collatzLengthMemoized nextN
          let newMaxCandidate = (nextN,lengthNextN)
          return $ maximumBy (compare `on` snd) [maxSoFar, newMaxCandidate]

================================================================================

这里有另一个使用monad-memo包的Haskell解决方案。不幸的是，这个解决方案存在堆栈空间错误，但不会影响上面手动编写的记忆化程序。

./collatzMemo +RTS -K83886080 -RTS # 这将产生答案，但最好消除空间泄漏

{-# Language GADTs, TypeOperators #-} 

import Control.Monad.Memo
import Data.List (maximumBy)
import Data.Function (on)

nextCollatz :: Integer -> Integer
nextCollatz n | even n = n `div` 2
               | otherwise = 3 * n + 1

newtype CollatzLength = CollatzLength Integer
  deriving (Read,Show,Eq,Ord)

main = print longestCollatzSequenceUnderAMill 
longestCollatzSequenceUnderAMill = longestCollatzLength [1..1000000]

collatzLengthMemoized :: Integer -> Memo Integer CollatzLength CollatzLength
collatzLengthMemoized 1 = return $ CollatzLength 1
collatzLengthMemoized n = do
  CollatzLength nextLength <- memo collatzLengthMemoized (nextCollatz n)
  return $ CollatzLength $ 1 + nextLength 
{- Stack space error
./collatzMemo
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

Stack error does not effect rolled-my-own memoizer at
https://dev59.com/SnE85IYBdhLWcg3wtV_1
-}
longestCollatzLength :: [Integer] -> (Integer,CollatzLength)
longestCollatzLength xs = startEvalMemo $ do
  foldM f (1,CollatzLength 1) xs
  where f maxSoFar nextN = do
          lengthNextN <- collatzLengthMemoized nextN
          let newMaxCandidate = (nextN,lengthNextN)
          return $ maximumBy (compare `on` snd) [maxSoFar, newMaxCandidate]

{-
-- sanity checks
tCollatzLengthNaive = CollatzLength 10 == collatzLengthNaive 13 
tCollatzLengthMemoized = (CollatzLength 10) ==startEvalMemo (collatzLengthMemoized 13) 

-- theoretically could be nonterminating. Since we're not in Agda, we'll not worry about it.
collatzLengthNaive :: Integer -> CollatzLength
collatzLengthNaive 1 = CollatzLength 1
collatzLengthNaive n = let CollatzLength nextLength = collatzLengthNaive (nextCollatz n)
                       in  CollatzLength $ 1 + nextLength
-}

==================================================

另一个，因式分解得更好。速度不如之前快，但仍然在一分钟以内。

import qualified Data.Map as M
import Control.Monad.State
import Data.List (maximumBy, nubBy)
import Data.Function (on)

nextCollatz :: Integer -> Integer
nextCollatz n | even n = n `div` 2
               | otherwise = 3 * n + 1

newtype CollatzLength = CollatzLength Integer
  deriving (Read,Show,Eq,Ord)

main = print longestCollatzSequenceUnderAMillStreamy -- AllAtOnce                                                                                                                                                                                                         

collatzes = evalState collatzesM M.empty
longestCollatzSequenceUnderAMillAllAtOnce = winners . takeWhile ((<=1000000) .fst) $ collatzes
longestCollatzSequenceUnderAMillStreamy = takeWhile ((<=1000000) .fst) . winners  $ collatzes


-- sanity checks                                                                                                                                                                                                                                                          
tCollatzLengthNaive = CollatzLength 10 == collatzLengthNaive 13
tCollatzLengthMemoized = (CollatzLength 10) == evalState (collatzLengthMemoized 13) M.empty

-- maybe it would be better to use hash here?                                                                                                                                                                                                                             
type CollatzLengthDb = M.Map Integer CollatzLength
type CollatzLengthState = State CollatzLengthDb

collatzLengthMemoized :: Integer -> CollatzLengthState CollatzLength
collatzLengthMemoized 1 = return $ CollatzLength 1
collatzLengthMemoized n = do
  lengthsdb <- get
  case M.lookup n lengthsdb of
    Nothing -> do let n' = nextCollatz n
                  CollatzLength lengthN' <- collatzLengthMemoized n'
                  put $ M.insert n' (CollatzLength lengthN') lengthsdb
                  return $ CollatzLength $ lengthN' + 1
    Just lengthN -> return lengthN

collatzesM :: CollatzLengthState [(Integer,CollatzLength)]
collatzesM = mapM (\x -> do (CollatzLength l) <- collatzLengthMemoized x
                            return (x,(CollatzLength l)) ) [1..]

winners :: Ord b => [(a, b)] -> [(a, b)]
winners xs = (nubBy ( (==) `on` snd )) $ scanl1 (maxBy snd) xs

maxBy :: Ord b => (a -> b) -> a -> a -> a
maxBy f x y = if f x > f y then x else y

- tphyahoo

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Motti · Accepted Answer

你卡住的原因是因为你经过了大于2^31-1（又称INT_MAX）的数字；请尝试使用unsigned long long而不是int。

我最近在博客中写到这个问题；请注意，在C语言中，朴素迭代方法已经足够快了。对于动态语言，您可能需要通过记忆化来进行优化，以遵守“一分钟规则”（但在这里并非如此）。

哎呀，我又犯了同样的错误（这次使用C++进一步研究可能的优化方法）。