Haskell type and pattern matching question: extracting fields from a data type

8

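I'm a Haskell newbie working through the Write Yourself a Scheme in 48 Hours project to get familiar with the language. At one point I wanted to get the underlying type out of a data type, and I'm not sure how to do that without writing a conversion for every variant of the type.

For example, given the following data type: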

data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             | Double Double

I'd like to write something like this (I know this doesn't work):

extractLispVal :: LispVal -> a
extractLispVal (a val) = val

or even
extractLispVal :: LispVal -> a
extractLispVal (Double val) = val
extractLispVal (Bool val) = val

Is this possible? Basically I want to be able to convert back from a LispVal to the basic type it wraps.

Thanks! Simon

3 Answers

9
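Unfortunately, this sort of generic matching on constructors isn't possible directly, and even if it were, yours wouldn't work: the extractLispVal function doesn't have a well-defined type, because the type of the result depends on the value of the input. There are various kinds of advanced type-system nonsense that can do things sort of like this, but they're not really what you want here.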
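In your case, if you're only interested in extracting particular kinds of values, or if you can convert them all to a single type, you can write a function like extractStringsAndAtoms :: LispVal -> Maybe String, for instance.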
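As a minimal sketch of that idea, assuming the LispVal declaration from the question, extractStringsAndAtoms could be written as below. Only the name and type come from the text above; the choice to return Nothing for every other constructor is an assumption.

```haskell
-- LispVal as declared in the question.
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             | Double Double

-- Only the two constructors that carry a String yield a result;
-- every other constructor is reported as Nothing (an assumption).
extractStringsAndAtoms :: LispVal -> Maybe String
extractStringsAndAtoms (Atom s)   = Just s
extractStringsAndAtoms (String s) = Just s
extractStringsAndAtoms _          = Nothing
```

The result type is fixed at Maybe String no matter which constructor comes in, which is what makes the signature writable at all.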
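The only way to return one of several possible types is to combine them into a data type and pattern match on that. The generic form of this is Either a b, which is either an a or a b, distinguished by constructors. You could create a data type that would permit extracting all the possible types... and it would be pretty much the same as LispVal itself, so that's not helpful.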
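To make the Either point concrete, here is a small sketch restricted to the two numeric constructors. The names extractNumeric and describeNumeric are hypothetical and chosen only for illustration.

```haskell
-- LispVal as declared in the question.
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             | Double Double

-- Either tags the result so the caller can tell which type came back.
extractNumeric :: LispVal -> Maybe (Either Integer Double)
extractNumeric (Number n) = Just (Left n)
extractNumeric (Double d) = Just (Right d)
extractNumeric _          = Nothing

-- The caller still has to pattern match before using the value.
describeNumeric :: LispVal -> String
describeNumeric v = case extractNumeric v of
  Just (Left n)  -> "integer " ++ show n
  Just (Right d) -> "double " ++ show d
  Nothing        -> "not a number"
```

Extending this to all seven constructors would need a sum type with seven alternatives, which is LispVal again under another name.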
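If you really want to work with the various types outside of a LispVal, you could also look at the Data.Data module, which provides some means of reflecting on data types. I doubt that's really what you want, though.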
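For the curious, a rough sketch of what the Data.Data route could look like follows. It assumes LispVal derives Data, and the helpers constructorName and extractField are hypothetical names, not part of any library. toConstr, showConstr, gmapQ and cast are the library functions from Data.Data.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, Typeable, cast, gmapQ, showConstr, toConstr)
import Data.Maybe (catMaybes, listToMaybe)

-- LispVal from the question, with a derived Data instance (an assumption).
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             | Double Double
             deriving (Show, Data)

-- Name of the constructor that built a value, obtained by reflection.
constructorName :: LispVal -> String
constructorName = showConstr . toConstr

-- Try to cast each immediate field to the requested type and
-- return the first one that fits, if any.
extractField :: Typeable a => LispVal -> Maybe a
extractField = listToMaybe . catMaybes . gmapQ cast
```

The caller picks the type with an annotation, e.g. `extractField v :: Maybe Double`. Note that this loses constructor information: Atom and String both answer a request for a String, so it is not a drop-in replacement for pattern matching.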
EDIT: To expand on this a bit, here are a few examples of extraction functions you can write:
  • Create single-constructor extraction functions, as in Don's first example, that assume you already know which constructor was used:

    extractAtom :: LispVal -> String
    extractAtom (Atom a) = a
    

    This will produce runtime errors if applied to something other than the Atom constructor, so be cautious with that. In many cases, though, you know by virtue of being at some point in an algorithm what you've got, so this can be used safely. A simple example would be if you've got a list of LispVals that you've filtered every other constructor out of.

  • Create safe single-constructor extraction functions, which serve as both a "do I have this constructor?" predicate and an "if so, give me the contents" extractor:

    extractAtom :: LispVal -> Maybe String
    extractAtom (Atom a) = Just a
    extractAtom _ = Nothing
    

    Note that this is more flexible than the above, even if you're confident of what constructor you have. For example, it makes defining these easy:

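    -- isJust comes from Data.Maybe: import Data.Maybe (isJust)
    isAtom :: LispVal -> Bool
    isAtom = isJust . extractAtom
    
    -- The error message uses show, so LispVal needs a Show instance
    assumeAtom :: LispVal -> String
    assumeAtom x = case extractAtom x of 
                       Just a  -> a
                       Nothing -> error $ "assumeAtom applied to " ++ show x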
    
  • Use record syntax when defining the type, as in Don's second example. This is a bit of language magic that, for the most part, defines a bunch of partial functions like the first extractAtom above and gives you a fancy syntax for constructing values. You can also reuse field names across constructors if the field has the same type, e.g. for Atom and String.

    That said, the fancy syntax is more useful for records with many fields, not types with many single-field constructors, and the safe extraction functions above are generally better than ones that produce errors.

  • Getting more abstract, sometimes the most convenient way is actually to have a single, all-purpose deconstruction function:

    extractLispVal :: (String -> r) -> ([LispVal] -> r) -> ([LispVal] -> LispVal -> r) 
                   -> (Integer -> r) -> (String -> r) -> (Bool -> r) -> (Double -> r)
                   -> LispVal -> r
    extractLispVal f _ _ _ _ _ _ (Atom x) = f x
    extractLispVal _ f _ _ _ _ _ (List xs) = f xs
    ...
    

    Yeah, it looks horrendous, I know. Examples of this (on simpler data types) in the standard libraries are the functions maybe and either, which deconstruct the types of the same names. Essentially, this is a function that reifies the pattern matching and lets you work with that more directly. It may be ugly, but you only have to write it once, and it can be useful in some situations. For instance, here's one thing you could do with the above function:

    exprToString :: ([String] -> String) -> ([String] -> String -> String) 
                 -> LispVal -> String
    exprToString f g = extractLispVal id (f . map recur) 
                                      (\xs x -> g (map recur xs) $ recur x)
                                      show show show show
      where recur = exprToString f g
    

    ...i.e., a simple recursive pretty-printing function, parameterized by how to combine the elements of a list. You can also write isAtom and the like easily:

    isAtom = extractLispVal (const True) no (const no) no no no no
      where no = const False
    
  • On the other hand, sometimes what you want to do is match one or two constructors, with nested pattern matches, and a catch-all case for the constructors you don't care about. This is exactly what pattern matching is best at, and all the above techniques would just make things far more complicated. So don't tie yourself to just one approach!
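Returning to the all-purpose deconstruction function from the list above: a complete version might look like the sketch below. The first two clauses are the ones shown in the answer; the remaining five are filled in by following the same pattern, which is an assumption about what the elided lines contain. The render helper and its two combiners are hypothetical, added only to show exprToString in use.

```haskell
-- LispVal as declared in the question.
data LispVal = Atom String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             | Double Double

-- One handler per constructor, in declaration order.
extractLispVal :: (String -> r) -> ([LispVal] -> r) -> ([LispVal] -> LispVal -> r)
               -> (Integer -> r) -> (String -> r) -> (Bool -> r) -> (Double -> r)
               -> LispVal -> r
extractLispVal f _ _ _ _ _ _ (Atom x)          = f x
extractLispVal _ f _ _ _ _ _ (List xs)         = f xs
extractLispVal _ _ f _ _ _ _ (DottedList xs x) = f xs x
extractLispVal _ _ _ f _ _ _ (Number n)        = f n
extractLispVal _ _ _ _ f _ _ (String s)        = f s
extractLispVal _ _ _ _ _ f _ (Bool b)          = f b
extractLispVal _ _ _ _ _ _ f (Double d)        = f d

-- The pretty-printer from the answer, unchanged in behavior.
exprToString :: ([String] -> String) -> ([String] -> String -> String)
             -> LispVal -> String
exprToString f g = extractLispVal id (f . map recur)
                                  (\xs x -> g (map recur xs) (recur x))
                                  show show show show
  where recur = exprToString f g

-- Hypothetical combiners: space-separated lists in parentheses.
render :: LispVal -> String
render = exprToString (\xs -> "(" ++ unwords xs ++ ")")
                      (\xs x -> "(" ++ unwords xs ++ " . " ++ x ++ ")")
```

With these combiners, `render (List [Atom "+", Number 3, String "foo"])` evaluates to the string `(+ 3 "foo")`.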


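Thanks, that's what I figured, I just wanted to make sure. Haskell's type system is still a bit mysterious to me, so I thought there might be some kind of magic that could happen here :) - Simon
@Simon: Well, there might be. If you find your code getting tedious or clumsy and wish you had a function like this to simplify it, I'd say there's a good chance some magic exists that would make it smoother. In my experience, the biggest hurdle in getting used to Haskell isn't doing the magic, it's figuring out where to put it. - C. A. McCann
@Simon: I've added some more detailed examples in an edit, to give a better idea of the options available. - C. A. McCann
Wow, thanks for the extra explanation. I'm going to steer clear of the deconstruction function for now, since it seems far more complex than I need, but the safe extractor functions look like a better version of what I was planning to do anyway. - Simon
@Simon: It looks more complicated than it is, and it's more complicated to write than to use. But yes, apart from direct pattern matching, the "safe extractors" using Maybe are the best default approach. - C. A. McCann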

6

You can always extract the fields from your data type, either by pattern matching on individual constructors:

extractLispValDouble (Double val) = val

or by using record selectors:

data LispVal = Atom { getAtom :: String }
             ...          
             | String { getString :: String }
             | Bool   { getBool :: Bool }
             | Double { getDouble :: Double }

However, you can't naively write a function that returns a String, a Bool, a Double or whatever else, because you can't write a type for it.
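One caveat with record selectors is that they are partial functions. The sketch below shows this under a few assumptions: the field name getNumber and the helpers tryAny and selectorSucceeds are hypothetical, and Atom and String deliberately share the field name getString to illustrate the name reuse mentioned in the first answer (the declaration above uses getAtom instead).

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Record fields on the single-field constructors only.
data LispVal = Atom   { getString :: String }  -- shares its field name with String
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number { getNumber :: Integer }
             | String { getString :: String }
             | Bool   { getBool   :: Bool }
             | Double { getDouble :: Double }

-- try, specialised to catch any exception.
tryAny :: IO a -> IO (Either SomeException a)
tryAny = try

-- True if applying the selector succeeds, False if it throws.
selectorSucceeds :: (LispVal -> a) -> LispVal -> IO Bool
selectorSucceeds sel v =
  either (const False) (const True) <$> tryAny (evaluate (sel v))
```

Here getString works on both Atom and String values, while `selectorSucceeds getBool (Atom "x")` yields False because getBool has no case for Atom and throws when evaluated.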


Thanks - I'll just write the individual conversion functions. - Simon

2
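You can get more or less what you want by using GADTs. It looks a bit scary, but it does work. I have strong doubts about how far this approach will take you, though.
Here's something I whipped up quickly, including a not-quite-right (a few too many spaces) printLispVal function, which I wrote to see whether you can actually use my construction. Note the boilerplate for extracting a basic type in the extractShowableLispVal function. I think this approach will quickly run into trouble once you start doing more complicated things, such as trying to do arithmetic.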
{-# LANGUAGE GADTs #-}
data Unknown = Unknown

data LispList where
    Nil :: LispList
    Cons :: LispVal a -> LispList -> LispList

data LispVal t where
    Atom :: String -> LispVal Unknown
    List :: LispList -> LispVal Unknown
    DottedList :: LispList -> LispVal b -> LispVal Unknown
    Number :: Integer -> LispVal Integer
    String :: String -> LispVal String
    Bool   :: Bool -> LispVal Bool
    Double :: Double -> LispVal Double

data Showable s where
    Showable :: Show s => s -> Showable s

extractShowableLispVal :: LispVal a -> Maybe (Showable a)
extractShowableLispVal (Number x) = Just (Showable x)
extractShowableLispVal (String x) = Just (Showable x)
extractShowableLispVal (Bool x) = Just (Showable x)
extractShowableLispVal (Double x) = Just (Showable x)
extractShowableLispVal _ = Nothing

extractBasicLispVal :: LispVal a -> Maybe a
extractBasicLispVal x = case extractShowableLispVal x of
    Just (Showable s) -> Just s
    Nothing -> Nothing

printLispVal :: LispVal a -> IO ()
printLispVal x = case extractShowableLispVal x of    
    Just (Showable s) -> putStr (show s)
    Nothing -> case x of
        Atom a -> putStr a
        List l -> putChar '(' >> printLispListNoOpen (return ()) l
        DottedList l x -> putChar '(' >> printLispListNoOpen (putChar '.' >> printLispVal x) l

printLispListNoOpen finish = worker where
    worker Nil = finish >> putChar ')'
    worker (Cons car cdr) = printLispVal car >> putChar ' ' >> worker cdr

test = List . Cons (Atom "+") . Cons (Number 3) . Cons (String "foo") $ Nil
test2 = DottedList (Cons (Atom "+") . Cons (Number 3) . Cons (String "foo") $ Nil) test
-- printLispVal test prints out (+ 3 "foo" )
-- printLispVal test2 prints out (+ 3 "foo" .(+ 3 "foo" ))
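As a rough illustration of where arithmetic gets awkward with this encoding, consider the sketch below. It repeats the GADT declarations so that it stands alone; addNumbers and sumList are hypothetical helpers, not part of the answer.

```haskell
{-# LANGUAGE GADTs #-}

data Unknown = Unknown

data LispList where
    Nil  :: LispList
    Cons :: LispVal a -> LispList -> LispList

data LispVal t where
    Atom       :: String -> LispVal Unknown
    List       :: LispList -> LispVal Unknown
    DottedList :: LispList -> LispVal b -> LispVal Unknown
    Number     :: Integer -> LispVal Integer
    String     :: String -> LispVal String
    Bool       :: Bool -> LispVal Bool
    Double     :: Double -> LispVal Double

-- When the index is known statically, Number is the only possible
-- constructor, so no Maybe and no catch-all clause is needed.
addNumbers :: LispVal Integer -> LispVal Integer -> LispVal Integer
addNumbers (Number x) (Number y) = Number (x + y)

-- Elements of a LispList hide their index behind Cons, so the check
-- has to happen at run time again, just as with the plain data type.
sumList :: LispList -> Maybe Integer
sumList Nil                    = Just 0
sumList (Cons (Number n) rest) = fmap (n +) (sumList rest)
sumList (Cons _ _)             = Nothing
```

Since Lisp programs are mostly lists, most values an evaluator sees arrive through Cons, and the static guarantee that addNumbers enjoys is rarely available.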

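Wow, this is probably more complex than I need for this particular application, but it got me to look up GADTs, which I didn't know about before. Quite interesting, thanks. - Simon
@Simon - Yes, it does feel like swatting a fly with a cannon! Another keyword to search for here is "HOAS", although I'm not entirely sure what exactly it means; the basic idea is to do some of the type checking at the type level instead of at run time. Lisp may not be a good fit for that approach, though. - yatima2975
As an addendum for posterity: this is one kind of the "advanced type-system nonsense" I mentioned in my answer and didn't elaborate on because... well, just look at it. It is a useful and powerful approach in some cases, though, so I'm glad someone picked it up and demonstrated it! - C. A. McCann
@camccann It's not _that_ advanced, is it? :-) LispVal is basically the reason GADTs were invented, and LispList is just a list of existentially quantified LispVals. OK, it's not something you'd see in a first course on Haskell, but it barely registers on the Oleg scale! - yatima2975
At least the last time I looked, there was very little introductory material on actually using GADTs, and without a solid grasp of the theoretical side (enough to explain how (forall a. a -> r) -> r represents an existential type) it's easy to get bogged down in confusing and obscure type errors. Your example only merits a milli-Oleg, but the background knowledge involved is worth at least 5 centi-Olegs. I say this from experience, having written simple lambda-calculus evaluators with both GADTs and type classes plus fundeps. - C. A. McCann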
