How do I run the code from Real World Haskell?

3

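First of all, apologies: this is my first time compiling Haskell code. I am compiling some code from chapter 24 of Real World Haskell. The code uses a MapReduce engine, implemented in another source file, to count the number of lines in a file. Here is the code: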

module Main where

import Control.Monad (forM_)
import Data.Int (Int64)
import qualified Data.ByteString.Lazy.Char8 as LB
import System.Environment (getArgs)

import LineChunks (chunkedReadWith)
import MapReduce (mapReduce, rdeepseq)

lineCount :: [LB.ByteString] -> Int64
lineCount = mapReduce rdeepseq (LB.count '\n')
                      rdeepseq sum

main :: IO ()
main = do
  args <- getArgs
  forM_ args $ \path -> do
    numLines <- chunkedReadWith lineCount path
    putStrLn $ path ++ ": " ++ show numLines

This code compiles, and I get a LineCount.exe.

Now, how do I use it to count the lines in a file? I have a file named "test" that contains some test text. But when I run the following:

LineCount test

on the command line, I get this result:
Exception: test: hGetBufSome: illegal operation (handle is closed)

What could be going wrong?

Here is more code, from the other file:

module LineChunks
    (
      chunkedReadWith
    ) where

import Control.Exception (bracket, finally)
import Control.Monad (forM, liftM)
import Control.Parallel.Strategies (NFData, rdeepseq)
import Data.Int (Int64)
import qualified Data.ByteString.Lazy.Char8 as LB
import GHC.Conc (numCapabilities)
import System.IO

data ChunkSpec = CS {
      chunkOffset :: !Int64
    , chunkLength :: !Int64
    } deriving (Eq, Show)

withChunks :: (NFData a) =>
              (FilePath -> IO [ChunkSpec])
           -> ([LB.ByteString] -> a)
           -> FilePath
           -> IO a
withChunks chunkFunc process path = do
  (chunks, handles) <- chunkedRead chunkFunc path
  let r = process chunks
  (rdeepseq r `seq` return r) `finally` mapM_ hClose handles

chunkedReadWith :: (NFData a) =>
                   ([LB.ByteString] -> a) -> FilePath -> IO a
chunkedReadWith func path =
    withChunks (lineChunks (numCapabilities * 4)) func path
{-- /snippet withChunks --}

{-- snippet chunkedRead --}
chunkedRead :: (FilePath -> IO [ChunkSpec])
            -> FilePath
            -> IO ([LB.ByteString], [Handle])
chunkedRead chunkFunc path = do
  chunks <- chunkFunc path
  liftM unzip . forM chunks $ \spec -> do
    h <- openFile path ReadMode
    hSeek h AbsoluteSeek (fromIntegral (chunkOffset spec))
    chunk <- LB.take (chunkLength spec) `liftM` LB.hGetContents h
    return (chunk, h)
{-- /snippet chunkedRead --}

{-- snippet lineChunks --}
lineChunks :: Int -> FilePath -> IO [ChunkSpec]
lineChunks numChunks path = do
  bracket (openFile path ReadMode) hClose $ \h -> do
    totalSize <- fromIntegral `liftM` hFileSize h
    let chunkSize = totalSize `div` fromIntegral numChunks
        findChunks offset = do
          let newOffset = offset + chunkSize
          hSeek h AbsoluteSeek (fromIntegral newOffset)
          let findNewline off = do
                eof <- hIsEOF h
                if eof
                  then return [CS offset (totalSize - offset)]
                  else do
                    bytes <- LB.hGet h 4096
                    case LB.elemIndex '\n' bytes of
                      Just n -> do
                        chunks@(c:_) <- findChunks (off + n + 1)
                        let coff = chunkOffset c
                        return (CS offset (coff - offset):chunks)
                      Nothing -> findNewline (off + LB.length bytes)
          findNewline newOffset
    findChunks 0
{-- /snippet lineChunks --}

-- Ensure that a series of ChunkSpecs is contiguous and
-- non-overlapping.
prop_contig (CS o l:cs@(CS o' _:_)) | o + l == o' = prop_contig cs
                                    | otherwise = False
prop_contig _ = True

Sounds like a lazy IO problem. - Gabriella Gonzalez
If you don't provide chunkedReadWith, you won't get much useful advice. - Thomas M. DuBuisson
Edited the question to provide more information... - Velvet Ghost
3 Answers
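
As a rough illustration of the lazy IO problem suggested in the comments above, the sketch below reads a file lazily, closes the handle, and only then forces the result. The file name lazy-demo.txt and the function readAfterClose are made up for this example and are not part of the book's code; the exact exception text may differ between GHC and bytestring versions.

```haskell
import Control.Exception (IOException, evaluate, try)
import Data.Int (Int64)
import qualified Data.ByteString.Lazy.Char8 as LB
import System.IO

-- Hypothetical helper: count newlines, but force the count only
-- after the handle has already been closed.
readAfterClose :: FilePath -> IO (Either IOException Int64)
readAfterClose path = do
  h <- openFile path ReadMode
  contents <- LB.hGetContents h   -- lazy: no bytes are read yet
  hClose h                        -- the handle is closed before any read
  -- Forcing the count now triggers the deferred read on a closed handle.
  try (evaluate (LB.count '\n' contents))

main :: IO ()
main = do
  writeFile "lazy-demo.txt" "one\ntwo\n"
  result <- readAfterClose "lazy-demo.txt"
  case result of
    Left err -> putStrLn ("read failed: " ++ show err)
    Right n  -> putStrLn ("read succeeded: " ++ show n)
```

The point of the sketch is only the ordering: the result of a lazy read has to be forced while the handle is still open.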

3

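Instead of

LineCount < test

use

LineCount test

Explanation: the call to getArgs in main reads the arguments from the command line. Using "<" instead feeds the file to standard input, so no argument reaches the program.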
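
To make the difference concrete, here is a small sketch, separate from the book's code, of a program that handles both invocations. The helper countLines and the stdin fallback are assumptions added for illustration; the LineCount program in the question only looks at getArgs.

```haskell
import System.Environment (getArgs)

-- Hypothetical helper: number of lines in a String.
countLines :: String -> Int
countLines = length . lines

main :: IO ()
main = do
  -- args is [] when run as `prog < test` and ["test"] when run as `prog test`.
  args <- getArgs
  case args of
    [] -> do
      input <- getContents          -- read whatever was redirected to stdin
      putStrLn ("<stdin>: " ++ show (countLines input))
    paths -> mapM_ report paths
  where
    report path = do
      contents <- readFile path
      putStrLn (path ++ ": " ++ show (countLines contents))
```

With the redirect, the shell opens the file itself and the program never sees the name "test", which is why a program that only iterates over getArgs prints nothing in that case.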


Sorry, I was running "LineCount test" in the first place. I have edited the post to reflect this. - Velvet Ghost

2

Go into the "ch24" directory of the Real World Haskell companion code, make the changes below, and run

ghc -O2 --make -threaded LineCount && ./LineCount LineCount.hs

It should then print:

LineCount.hs: 22

Here are the necessary changes:

diff --git a/ch24/LineChunks.hs b/ch24/LineChunks.hs
index 0e82805..bda104d 100644
--- a/ch24/LineChunks.hs
+++ b/ch24/LineChunks.hs
@@ -6,7 +6,7 @@ module LineChunks

 import Control.Exception (bracket, finally)
 import Control.Monad (forM, liftM)
-import Control.Parallel.Strategies (NFData, rnf)
+import Control.DeepSeq(NFData,rnf)
 import Data.Int (Int64)
 import qualified Data.ByteString.Lazy.Char8 as LB
 import GHC.Conc (numCapabilities)
diff --git a/ch24/LineCount.hs b/ch24/LineCount.hs
index c6dd40b..46218e3 100644
--- a/ch24/LineCount.hs
+++ b/ch24/LineCount.hs
@@ -7,11 +7,11 @@ import qualified Data.ByteString.Lazy.Char8 as LB
 import System.Environment (getArgs)

 import LineChunks (chunkedReadWith)
-import MapReduce (mapReduce, rnf)
+import MapReduce (mapReduce, rdeepseq)

 lineCount :: [LB.ByteString] -> Int64
-lineCount = mapReduce rnf (LB.count '\n')
-                      rnf sum
+lineCount = mapReduce rdeepseq (LB.count '\n')
+                      rdeepseq sum

 main :: IO ()
 main = do
diff --git a/ch24/MapReduce.hs b/ch24/MapReduce.hs
index d0ff90b..87c79aa 100644
--- a/ch24/MapReduce.hs
+++ b/ch24/MapReduce.hs
@@ -3,7 +3,7 @@ module MapReduce
       mapReduce
     , simpleMapReduce
     -- exported for convenience
-    , rnf
+    , rdeepseq
     , rwhnf
     ) where

See the previous version of this answer for an explanation of why you got the error.

1
This worked for me:

module Main where

import Control.Monad (forM_)
import Data.Int (Int64)
import qualified Data.ByteString.Lazy.Char8 as LB
import System.Environment (getArgs)

import LineChunks (chunkedReadWith)
import Control.Parallel.Strategies(rdeepseq)
import MapReduce (mapReduce)

lineCount :: [LB.ByteString] -> Int64
lineCount = mapReduce rdeepseq (LB.count '\n')
                      rdeepseq sum

lineCountFile :: FilePath -> IO Int64
lineCountFile path =   chunkedReadWith lineCount path

I changed rnf to rdeepseq because rnf no longer seems to be in the "parallel" package.

Here is the book's companion code: http://examples.oreilly.com/9780596514983/rwh-examples2.zip


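Sorry, I just noticed that you had already solved your problem in your previous question... please disregard this. - mnish
Actually, I haven't. Sorry for the confusion: I posted the wrong code here. What I mean is that I still have this problem (the "handle is closed" error appears after changing rnf to rdeepseq). That is, the previous question came before this problem, not after it. I have edited the code to reflect this. - Velvet Ghost
No, you should not change rnf to rdeepseq in the function withChunks. Doing so may change the order of IO. The confusion arises because rnf and rdeepseq have different types and different purposes. Try looking up the definition of rnf on Hoogle. - mnish
Just add import Control.DeepSeq (rnf) to your (buggy) LineChunks.hs and replace rdeepseq with rnf in withChunks. - mnish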
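
The pattern mnish describes can be sketched outside the book's code roughly as follows. This is a minimal illustration, assuming the deepseq package is available; withFileContents, countLines and the file name rnf-demo.txt are hypothetical names introduced here, and the sketch uses a plain String read instead of the book's chunked ByteString reads.

```haskell
import Control.DeepSeq (NFData, rnf)
import Control.Exception (finally)
import System.IO

-- Hypothetical helper: apply a pure function to lazily read file
-- contents and force the result fully before the handle is closed.
withFileContents :: NFData a => (String -> a) -> FilePath -> IO a
withFileContents process path = do
  h <- openFile path ReadMode
  contents <- hGetContents h      -- lazy read
  let r = process contents
  -- rnf r :: (), so seq on it evaluates r completely before
  -- `return r` runs and before `finally` closes the handle.
  (rnf r `seq` return r) `finally` hClose h

-- Hypothetical helper: number of lines in a String.
countLines :: String -> Int
countLines = length . lines

main :: IO ()
main = do
  writeFile "rnf-demo.txt" "one\ntwo\nthree\n"
  n <- withFileContents countLines "rnf-demo.txt"
  print n   -- prints 3
```

The types explain why the two functions are not interchangeable here: rnf has type NFData a => a -> (), while rdeepseq has type NFData a => Strategy a, that is, a -> Eval a. Applying seq to rdeepseq r only evaluates the Eval wrapper, and whether that also forces r depends on how the installed version of the parallel package represents Eval.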
