如何处理非穷尽模式匹配的误报问题?

3

假设您有一套复杂的函数,并且通过了解您的设计,您可以知道某些函数和参数的组合永远不会出现。如果编译器愿意,它实际上可以推断出这一点。

为了更清晰地说明,我们以这个例子为例(不要告诉我使用map,这只是一个例子):

processAll :: [Int] -> [Int]
processAll [] = []
processAll a = let (x, xs) = processOne a in x:processAll xs
    where
        processOne (x:xs) = (x+1,xs)

在这个例子中,非常明显processOne永远不可能使用一个空列表进行调用。通过ghc编译并添加-Wall会发出警告:
Pattern match(es) are non-exhaustive
In an equation for `processOne': Patterns not matched: []

当然,我并不希望一般情况下禁用这样的警告,因为我可能确实错过了其他地方的模式匹配。但是,我原以为ghc可以推断出该模式列表实际上在其域中是穷尽的。

禁用警告的替代解决方案是:

processAll :: [Int] -> [Int]
processAll [] = []
processAll a = let (x, xs) = processOne a in x:processAll xs
    where
        processOne (x:xs) = (x+1,xs)
        processOne _ = error "processor overheat - explosion imminent"

这段代码是多余的(因为processOne []本来就会导致错误),并且很繁琐。

一般情况下,应该如何处理这种情况?是否应该在每个不可能的情况下添加error信息?


在这个特定的例子中,我知道有更好的处理方法,例如通过让调用者匹配模式。所以如果您想要的话,这里有另一个非常简化的从我正在编写的词法分析器中提取出来的例子,您也可以运行它:

import Data.Char (isNumber, isAlpha)
import Control.Monad

data TokenType = ParenOpen          -- (
                | ParenClose        -- )
                | Plus              -- +
                | Number String     -- A number
                | Variable String   -- Anything else
                | End               -- End of the stream
               deriving (Show, Eq)

-- content is the content of a file from a line and column on
type Content = (String, Int, Int)

-- a token is a token and its position as matched by the lexer
type Token = (TokenType, Int, Int)

lexer :: String -> [Token]
lexer = lexAll . (\s -> (s, 1, 1))
    where
        -- make a maybe value based on a Bool
        makeMaybe :: Bool -> a -> Maybe a
        makeMaybe p x = if p then return x else Nothing

        -- advance the content by one, taking care of line and column numbers
        advance :: Content -> Content
        advance (x:xs, l, c) = (xs, l', c')
                            where
                                l' = if x == '\n' then l + 1 else l
                                c' = if x == '\n' then 1 else c + 1

        -- advance the content by n
        advance' n content = iterate advance content !! n

        -- match a single character
        matchExact :: Char -> Content -> Maybe Content
        matchExact y content@(x:_, _, _) = makeMaybe (x == y) $ advance content

        -- match while pattern holds for characters
        matchPattern :: (Char -> Bool) -> Content -> Maybe (String, Content)
        matchPattern p content@(xs, _, _) = makeMaybe (len > 0) (pfx, advance' len content)
                                    where
                                        pfx = takeWhile p xs
                                        len = length pfx

        matchParenOpen = matchExact '(' >=> (\c -> return (ParenOpen, c))
        matchParenClose = matchExact ')' >=> (\c -> return (ParenClose, c))
        matchPlus = matchExact '+' >=> (\c -> return (Plus, c))
        matchNumber = matchPattern isNumber >=> (\(s, c) -> return (Number s, c))
        matchVariable = matchPattern isAlpha >=> (\(s, c) -> return (Variable s, c))

        lexOne :: Content -> (Token, Content)
        lexOne cur@([], l, c) = ((End, l, c), cur)
        lexOne cur@(_, l, c) = let tokenMatchers = [matchParenOpen,
                                                      matchParenClose,
                                                      matchPlus,
                                                      matchNumber,
                                                      matchVariable
                                                     ] in
                                case msum $ map ($ cur) tokenMatchers of
                                    -- if nothing could be matched, generate an error and skip the character
                                    Nothing -> lexOne $ advance cur
                                    -- otherwise, this is an interesting token
                                    Just (t, cnt) -> ((t, l, c), cnt)

        lexAll :: Content -> [Token]
        lexAll ([], _, _) = []
        lexAll content = token:lexAll rest
                    where
                        (token, rest) = lexOne content

main :: IO ()
main = getContents >>= putStrLn . unlines . map (\(t, l, c) -> show l ++ ":" ++ show c ++ ": " ++ show t) . lexer

在上面的示例中,lexOne 确保没有任何 match* 函数,因此也就没有任何 advance* 函数被传递一个空字符串的 Contentghc 警告:
Pattern match(es) are non-exhaustive
In an equation for `advance': Patterns not matched: ([], _, _)

Pattern match(es) are non-exhaustive
In an equation for `matchExact': Patterns not matched: _ ([], _, _)

我可以确定这种情况永远不会发生。处理这个问题的正确方法是什么?

3
将cons构造器展开,使其再次成为total函数。 processOne x xs = (x+1,xs) 通常应尽可能使所有函数在它们的类型上是total函数——即使它们的传统定义域与形式定义域相同。 - luqui
@luqui,总的来说,那是很好的建议,我同意。但是如果你快速查看我的实际例子,你会发现这可能会让事情有些尴尬。很多只需要一个“Content”的函数,然后就必须要带上“head tail(line,column)”。基本上,我定义了“Content”,以便代码更加清晰明了。 - Shahbaz
2个回答

3
为什么不为NonEmptyContent添加一个类型?
module SO24967745 where
import Control.Monad
import Data.Char

data TokenType = ParenOpen          -- (
                | ParenClose        -- )
                | Plus              -- +
                | Number String     -- A number
                | Variable String   -- Anything else
                | End               -- End of the stream
               deriving (Show, Eq)

-- content is the content of a file from a line and column on
type Content = (String, Int, Int)
type NonEmptyContent = (Char, String, Int, Int)

-- a token is a token and its position as matched by the lexer
type Token = (TokenType, Int, Int)

lexer :: String -> [Token]
lexer = lexAll . (\s -> (s, 1, 1))
    where
        -- make a maybe value based on a Bool
        makeMaybe :: Bool -> a -> Maybe a
        makeMaybe p x = if p then return x else Nothing

        toNonEmptyContent :: Content -> Maybe NonEmptyContent
        toNonEmptyContent ([], _, _) = Nothing
        toNonEmptyContent (x:xs,l,c) = Just (x,xs,l,c)

        toContent :: NonEmptyContent -> Content
        toContent (x, xs, l, c) = (x:xs, l, c)

        -- advance the content by one, taking care of line and column numbers
        advance :: NonEmptyContent -> Content
        advance (x, xs, l, c) = (xs, l', c')
                            where
                                l' = if x == '\n' then l + 1 else l
                                c' = if x == '\n' then 1 else c + 1

        -- advance the content by n
        advance' :: Int -> NonEmptyContent -> Maybe Content
        advance' n = foldr (>=>) Just (replicate n (fmap advance . toNonEmptyContent)) . toContent

        -- match a single character
        matchExact :: Char -> NonEmptyContent -> Maybe Content
        matchExact y content@(x,_, _, _) = makeMaybe (x == y) $ advance content

        -- match while pattern holds for characters
        matchPattern :: (Char -> Bool) -> NonEmptyContent -> Maybe (String, Content)
        matchPattern p content@(x,xs, _, _) = do
          let pfx = takeWhile p (x:xs)
              len = length pfx
          guard (len > 0) 
          content' <- advance' len content
          return (pfx, content')

        matchParenOpen = matchExact '(' >=> (\c -> return (ParenOpen, c))
        matchParenClose = matchExact ')' >=> (\c -> return (ParenClose, c))
        matchPlus = matchExact '+' >=> (\c -> return (Plus, c))
        matchNumber = matchPattern isNumber >=> (\(s, c) -> return (Number s, c))
        matchVariable = matchPattern isAlpha >=> (\(s, c) -> return (Variable s, c))

        lexOne :: Content -> (Token, Content)
        lexOne cur@([], l, c) = ((End, l, c), cur)
        lexOne (x:xs, l, c)   = let cur = (x,xs,l,c)
                                    tokenMatchers = [matchParenOpen,
                                                      matchParenClose,
                                                      matchPlus,
                                                      matchNumber,
                                                      matchVariable
                                                     ] in
                                case msum $ map ($ cur) tokenMatchers of
                                    -- if nothing could be matched, generate an error and skip the character
                                    Nothing -> lexOne $ advance cur
                                    -- otherwise, this is an interesting token
                                    Just (t, cnt) -> ((t, l, c), cnt)

        lexAll :: Content -> [Token]
        lexAll ([], _, _) = []
        lexAll content = token:lexAll rest
                    where
                        (token, rest) = lexOne content

main :: IO ()
main = getContents >>= putStrLn . unlines . map (\(t, l, c) -> show l ++ ":" ++ show c ++ ": " ++ show t) . lexer

这看起来很好,也融合了luqui的建议。谢谢 :) - Shahbaz

2

即使警告确实是误报,您也可以将其视为提示,表明您的代码并不完全清晰,并将其作为机会编写更加清晰的代码。例如:

processAll :: [Int] -> [Int]
processAll [] = []
processAll (a:as) = let (x, xs) = processOne a as in x:processAll xs
where
    processOne x xs = (x+1,xs)

好处:外部函数中有一个规范的、完整的列表模式集合。而内部函数反映了至少需要一个类型为a的值。

从类型上看,内部函数的类型现在是

 a -> b -> (a,b)

替代

 [a] -> (a, [a])

显然,仅有这种类型就表明你的之前版本不完全。

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接