Handling IO and pure code in Haskell

Question

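I'm writing a shell script that lists a directory, gets the size of each file, does some string manipulation (pure code), and then renames some files. I'm not sure what I'm doing wrong, so I have two questions: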

How should the code for a program like this be arranged?
I have a specific problem: I get the following error message. What am I doing wrong?

error:
    Couldn't match expected type `[FilePath]'
           against inferred type `IO [FilePath]'
    In the second argument of `mapM', namely `fileNames'
    In a stmt of a 'do' expression:
        files <- (mapM getFileNameAndSize fileNames)
    In the expression:
        do { fileNames <- getDirectoryContents;
             files <- (mapM getFileNameAndSize fileNames);
             sortBy cmpFilesBySize files }

Code:

getFileNameAndSize fname = do (fname,  (withFile fname ReadMode hFileSize))

getFilesWithSizes = do
  fileNames <- getDirectoryContents
  files <- (mapM getFileNameAndSize fileNames)
  sortBy cmpFilesBySize files

- Drakosha

2 Answers

getDirectoryContents is a function. You need to supply it with an argument, e.g.

fileNames <- getDirectoryContents "/usr/bin"

Also, as ghci shows, the type of getFileNameAndSize is FilePath -> (FilePath, IO Integer):

Prelude> :m + System.IO
Prelude System.IO> let getFileNameAndSize fname = do (fname, (withFile fname ReadMode hFileSize))
Prelude System.IO> :t getFileNameAndSize
getFileNameAndSize :: FilePath -> (FilePath, IO Integer)

But mapM requires the input function to return a monadic value, here an IO action:

Prelude System.IO> :t mapM
mapM :: (Monad m) => (a -> m b) -> [a] -> m [b]
-- #                  ^^^^^^^^

You should change its type to FilePath -> IO (FilePath, Integer) so that the types match:

getFileNameAndSize fname = do
  fsize <- withFile fname ReadMode hFileSize
  return (fname, fsize)
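
To see the shape mapM expects without touching the file system, here is a small sketch that uses Maybe as the monad instead of IO. The helper halve is made up purely for illustration and is not part of the question's code.

```haskell
-- mapM needs a function of shape (a -> m b); here m is Maybe.
-- `halve` is a hypothetical helper used only for this illustration.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

main :: IO ()
main = do
  print (mapM halve [2, 4, 6])  -- every call succeeds: Just [1,2,3]
  print (mapM halve [2, 3, 6])  -- one call fails, so the result is Nothing
```

The same rule applies with IO: each call must return an IO action, and mapM runs them in order and collects the results inside a single IO action.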

- kennytm

- Antal Spector-Zabusky · Accepted Answer

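Your second, specific problem has to do with the types of your functions. However, your first issue (which is not really a type issue) is the do statement in getFileNameAndSize. While do is used with monads, it is not a monadic panacea; it is implemented by a few simple translation rules. Roughly (this is not exactly right, because of some details involving error handling, but it is close enough):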

do a ≡ a
do a ; b ; c ... ≡ a >> do b ; c ...
do x <- a ; b ; c ... ≡ a >>= \x -> do b ; c ...
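
To make the rules above concrete, here is a minimal sketch that applies them by hand. It uses Maybe and the Prelude's lookup so it can be checked without any IO; the names table, sumDo, sumDesugared and pair are invented for this example.

```haskell
-- A small association list to look things up in (illustrative data).
table :: [(String, Int)]
table = [("a", 1), ("b", 2)]

-- Written with do-notation.
sumDo :: Maybe Int
sumDo = do
  x <- lookup "a" table
  y <- lookup "b" table
  return (x + y)

-- The same computation after applying the third rule twice.
sumDesugared :: Maybe Int
sumDesugared =
  lookup "a" table >>= \x ->
  lookup "b" table >>= \y ->
  return (x + y)

-- First rule: a do block holding a single expression is just that
-- expression, so this is an ordinary tuple with nothing monadic about it.
pair :: (Int, String)
pair = do (1, "one")
```

Both sumDo and sumDesugared evaluate to Just 3, and pair is simply (1, "one").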

In other words, getFileNameAndSize is equivalent to the version without the do block, so you can drop the do. That leaves:

getFileNameAndSize fname = (fname, withFile fname ReadMode hFileSize)

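We can work out the type of this function: since fname is the first argument to withFile, it has type FilePath; and hFileSize returns an IO Integer, so withFile ... also has type IO Integer. Thus we have getFileNameAndSize :: FilePath -> (FilePath, IO Integer). This may or may not be what you want; you probably want FilePath -> IO (FilePath,Integer) instead. To change it, you can write any of the following: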

getFileNameAndSize_do    fname = do size <- withFile fname ReadMode hFileSize
                                    return (fname, size)
getFileNameAndSize_fmap  fname = fmap ((,) fname) $
                                      withFile fname ReadMode hFileSize
-- With `import Control.Applicative ((<$>))`, which is a synonym for fmap.
getFileNameAndSize_fmap2 fname =     ((,) fname)
                                 <$> withFile fname ReadMode hFileSize
-- With {-# LANGUAGE TupleSections #-} at the top of the file
getFileNameAndSize_ts    fname = (fname,) <$> withFile fname ReadMode hFileSize

Next, as KennyTM points out, you have fileNames <- getDirectoryContents; since getDirectoryContents has type FilePath -> IO [FilePath], you need to give it an argument (e.g. getFilesWithSizes dir = do fileNames <- getDirectoryContents dir ...). This is probably just a simple oversight.

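Next, we come to the heart of your error: files <- (mapM getFileNameAndSize fileNames). I'm not sure why it gives you that exact error message, but I can tell you what is wrong. Remember what we know about getFileNameAndSize. In your code, it returns a (FilePath, IO Integer). However, mapM has type Monad m => (a -> m b) -> [a] -> m [b], so mapM getFileNameAndSize does not have the type you need. What you want is getFileNameAndSize :: FilePath -> IO (FilePath,Integer), as I implemented above.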
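
As an aside that is not part of the original answer, one plausible explanation for the exact wording of the error is this: getDirectoryContents without its argument is a function, and functions ((->) r) form a monad. The type checker may therefore read the do block in that monad, in which case fileNames is bound to the function's result type, IO [FilePath], rather than to [FilePath]. The sketch below shows the same mechanism with plain numeric functions; the name both is made up for illustration.

```haskell
-- In the function monad, `x <- f` binds x to the result of applying f
-- to the argument shared by the whole do block.
both :: Int -> (Int, Int)
both = do
  a <- (+ 1)   -- a is n + 1
  b <- (* 2)   -- b is n * 2
  return (a, b)

main :: IO ()
main = print (both 3)  -- prints (4,6)
```

Under that reading, supplying the directory argument puts the block back in IO and binds fileNames to a plain [FilePath].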

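Finally, we need to fix your last line. First, although you did not give it to us, cmpFilesBySize is presumably a function of type (FilePath, Integer) -> (FilePath, Integer) -> Ordering that compares on the second element. This is simple to write: using Data.Ord.comparing :: Ord a => (b -> a) -> b -> b -> Ordering, you can write it as comparing snd, which has type Ord b => (a, b) -> (a, b) -> Ordering. Second, you need to return the result wrapped in the IO monad rather than as a plain list; the function return :: Monad m => a -> m a will do the trick.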
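
As a quick illustration of those two points, here is a sketch with made-up sample data standing in for the (file name, size) pairs; sample, sortedBySize and sortedInIO are names invented for this example.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Made-up sample data standing in for (file name, size) pairs.
sample :: [(FilePath, Integer)]
sample = [("b.txt", 300), ("a.txt", 20), ("c.txt", 100)]

-- Pure sorting step: smallest file first.
sortedBySize :: [(FilePath, Integer)]
sortedBySize = sortBy (comparing snd) sample

-- The same list lifted into IO with return, which is the type the last
-- line of a do block in IO must have.
sortedInIO :: IO [(FilePath, Integer)]
sortedInIO = return sortedBySize
```

The sort itself stays pure; return only wraps the finished list so it can be the result of an IO action.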

Thus, putting all of this together, you get:

import System.IO           (FilePath, withFile, IOMode(ReadMode), hFileSize)
import System.Directory    (getDirectoryContents)
import Control.Applicative ((<$>))
import Data.List           (sortBy)
import Data.Ord            (comparing)

getFileNameAndSize :: FilePath -> IO (FilePath, Integer)
getFileNameAndSize fname = ((,) fname) <$> withFile fname ReadMode hFileSize

getFilesWithSizes :: FilePath -> IO [(FilePath,Integer)]
getFilesWithSizes dir = do fileNames <- getDirectoryContents dir
                           files     <- mapM getFileNameAndSize fileNames
                           return $ sortBy (comparing snd) files

This is all well and good, and it works. However, I would probably write it slightly differently. My version would look something like this:

{-# LANGUAGE TupleSections #-}
import System.IO           (FilePath, withFile, IOMode(ReadMode), hFileSize)
import System.Directory    (getDirectoryContents)
import Control.Applicative ((<$>))
import Control.Monad       ((<=<))
import Data.List           (sortBy)
import Data.Ord            (comparing)

preservingF :: Functor f => (a -> f b) -> a -> f (a,b)
preservingF f x = (x,) <$> f x
-- Or liftM2 (<$>) (,), but I am not entirely sure why.

fileSize :: FilePath -> IO Integer
fileSize fname = withFile fname ReadMode hFileSize

getFilesWithSizes :: FilePath -> IO [(FilePath,Integer)]
getFilesWithSizes = return .   sortBy (comparing snd)
                           <=< mapM (preservingF fileSize)
                           <=< getDirectoryContents

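(<=< is the monadic counterpart of ., the function composition operator.) First off: yes, my version is longer. However, I would probably already have preservingF defined somewhere, which makes the two versions equal in length.* (I might even inline fileSize if it were not used elsewhere.) Second, I like this version better because it chains together simpler functions we have already written. Your version is similar, but I feel mine is more streamlined and makes this aspect of things clearer.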

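So this is a bit of an answer to your first question about how to structure these things. I personally tend to lock my IO down into as few functions as possible: only functions that need to touch the outside world directly (e.g. main and anything that interacts with a file) get an IO. Everything else is an ordinary pure function (and is only monadic if it is monadic for general reasons, along the lines of preservingF). I then arrange things so that main and friends are just compositions and chains of pure functions: main gets some values from IO-land; then it calls pure functions to fold, spindle, and mutilate the data; then it gets more IO values; then it operates on them some more; and so on. The idea is to separate the two domains as much as possible, so that the more compositional non-IO code is always free, and the black-box IO is done only precisely where it is necessary.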
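
A minimal sketch of that layout for the task in the question might look like the following. It is not the original author's code: the renaming rule in newName is invented for the example, and it assumes a version of the directory package recent enough to provide listDirectory and getFileSize. Unlike getDirectoryContents, listDirectory leaves out the "." and ".." entries, and the sketch also skips anything that is not a regular file.

```haskell
import Control.Monad (filterM, forM_)
import Data.Char (toLower)
import Data.List (sortBy)
import Data.Ord (comparing)
import System.Directory (doesFileExist, getFileSize, listDirectory)
import System.FilePath ((</>))

-- Pure: decide the new name for one file. The rule (prefix the size and
-- lower-case the name) is invented for this sketch.
newName :: (FilePath, Integer) -> FilePath
newName (name, size) = show size ++ "-" ++ map toLower name

-- Pure: turn the listing into (old, new) pairs, smallest file first.
planRenames :: [(FilePath, Integer)] -> [(FilePath, FilePath)]
planRenames files =
  [ (name, newName f) | f@(name, _) <- sortBy (comparing snd) files ]

-- IO: gather names and sizes for the regular files in a directory.
filesWithSizes :: FilePath -> IO [(FilePath, Integer)]
filesWithSizes dir = do
  names <- listDirectory dir
  files <- filterM (doesFileExist . (dir </>)) names
  sizes <- mapM (getFileSize . (dir </>)) files
  return (zip files sizes)

-- IO: a thin shell around the pure plan. This is a dry run that only
-- prints the plan; calling renameFile here instead would perform it.
main :: IO ()
main = do
  files <- filesWithSizes "."
  forM_ (planRenames files) $ \(old, new) ->
    putStrLn (old ++ " -> " ++ new)
```

All of the decisions live in newName and planRenames, which can be tested on plain lists without touching the disk; only filesWithSizes and main carry an IO type.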

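Operators like <=< really help with writing code in this style, because they let you work with functions that interact with monadic values (such as the IO-world) just as you would work with normal functions. You should also look at Control.Applicative's function <$> liftedArg1 <*> liftedArg2 <*> ... notation, which lets you apply ordinary functions to any number of monadic (really Applicative) arguments. This is really nice for getting rid of spurious <- bindings and simply chaining pure functions over monadic code.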
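
Here is a small sketch of both operators, using Maybe rather than IO so the results are easy to check. The functions safeRecip, safeSqrt, recipThenSqrt and sumOfRoots are made up for this example.

```haskell
import Control.Monad ((<=<))

-- Two small functions that can fail; both are invented for this sketch.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Kleisli composition runs right to left, like (.):
-- first safeRecip, then safeSqrt on its result.
recipThenSqrt :: Double -> Maybe Double
recipThenSqrt = safeSqrt <=< safeRecip

-- Applicative style: apply the pure function (+) to two wrapped values.
sumOfRoots :: Double -> Double -> Maybe Double
sumOfRoots a b = (+) <$> safeSqrt a <*> safeSqrt b
```

For example, recipThenSqrt 0.25 gives Just 2.0 and sumOfRoots 4 9 gives Just 5.0, while a failing step anywhere in either expression yields Nothing. The same code shapes work unchanged when the functions return IO actions.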

*: I feel like preservingF, or at least its sibling preserving :: (a -> b) -> a -> (a,b), should be in a package somewhere, but I have been unable to find either.
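
For reference, here is a sketch of a few ways these helpers could be written with functions from base alone. The primed and suffixed names (preserving', preservingT, preservingL) are invented here to keep the variants apart. The last one is the liftM2 form from the code comment above, read in the function monad: liftM2 h a b applies both a and b to the shared argument x and combines the results with h, giving (,) x <$> f x.

```haskell
{-# LANGUAGE TupleSections #-}
import Control.Arrow ((&&&))
import Control.Monad (liftM2)

-- Pair a value with the result of applying a pure function to it.
preserving :: (a -> b) -> a -> (a, b)
preserving f x = (x, f x)

-- The same thing via Control.Arrow's fan-out operator.
preserving' :: (a -> b) -> a -> (a, b)
preserving' f = id &&& f

-- The functorial version used in the answer.
preservingF :: Functor f => (a -> f b) -> a -> f (a, b)
preservingF f x = (x,) <$> f x

-- traverse over a pair acts on the second component and keeps the first.
preservingT :: Applicative f => (a -> f b) -> a -> f (a, b)
preservingT f x = traverse f (x, x)

-- The liftM2 spelling, using the Monad instance for functions.
preservingL :: Functor f => (a -> f b) -> a -> f (a, b)
preservingL = liftM2 (<$>) (,)
```

All three functorial variants should agree; for instance, each maps (\n -> Just (n + 1)) and 1 to Just (1,2).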