递归方案实现的表达式展开

3

我有一个数据类型表示算术表达式:

data E = Add E E | Mul E E | Var String

我想编写一个扩展函数,将表达式转换为变量积的和(一种括号扩展)。当然要使用递归方案。

我只能考虑到一种“进展和保护”的算法。每一步算法都构造出完全展开的术语,因此无需重新检查。

处理 Mul 让我感到很疯狂,所以我使用了同构类型 [[String]] 并利用已经为我实现的 concatconcatMap

type Poly = [Mono]
type Mono = [String]

mulMonoBy :: Mono -> Poly -> Poly
mulMonoBy x = map (x ++)

mulPoly :: Poly -> Poly -> Poly
mulPoly x = concatMap (flip mulMonoBy x)

然后我只需要使用cata

expandList :: E -> Poly
expandList = cata $ \case
   Var x -> [[x]]
   Add e1 e2 = e1 ++ e2
   Mul e1 e2 = mulPoly e1 e2

并将其转换回去:

fromPoly :: Poly -> Expr
fromPoly = foldr1 Add . map fromMono where
   fromMono = foldr1 Mul . map Var

有更好的方法吗?
更新:有一些混淆。
1.解决方案确实允许多行变量名。"Add (Val "foo" (Mul (Val "foo) (Var "bar")))"是"foo + foo * bar"的表示。我不使用"Val" "xyz"之类的东西来表示"x*y*z"。请注意,由于没有重复变量的标量,因此像"foo * foo * quux"这样的变量是完全允许的。
2.所谓的积和总是指“柯里化”的n元积和。积和总的简明定义是我想要一个没有任何括号的表达式,所有的括号都用关联性和优先级表示。
因此,"(foo * bar + bar) + (foo * bar + bar)"不是总和积,因为中间的"+"是总和。
"(foo * bar + (bar + (foo * bar + bar)))"或对应的左结合版本是正确的答案,尽管我们必须保证结合性始终位于左边或右边。因此,正确的类型为右结合解决方案是:
data Poly = Sum Mono Poly
          | Product Mono

这个与非空列表同构: NonEmpty Poly (注意是 Sum Mono Poly 而不是 Sum Poly Poly)。如果我们允许空和或积,那么我们就得到了我使用的列表列表表示。

  1. 如果你不关心性能,那么乘法似乎只是 liftA2 (++)

我已经在答案中添加了一个额外的部分,以回应您更新中的第二点。 - duplode
我对我的答案进行了进一步的编辑,这次添加了一个摘要,其中包括一个更简单的非空列表解决方案。 - duplode
2个回答

1
我不是递归方案的专家,但由于您似乎正在尝试练习它们,希望您将不会觉得将使用手动递归的解决方案转换为使用递归方案的解决方案太过繁琐。首先我将用混合散文和代码编写,最后再提供完整代码以便更简单地复制/粘贴。仅使用分配律和一些递归代数即可完成,这并不太困难。不过,在开始之前,让我们定义一个更好的结果类型,保证我们只能表示积和的总和:
data Poly term = Sum (Poly term) (Poly term)
               | Product (Mono term) 
               deriving Show

data Mono term = Term term
               | MonoMul (Mono term) (Mono term)
               deriving Show

这样我们就不可能搞砸并意外地产生错误的结果,例如:
(Mul (Var "x") (Add (Var "y") (Var "z")))

现在,让我们编写我们的函数。
expand :: E -> Poly String

首先,对于一个基本情况:扩展一个变量很简单,因为它已经处于乘积和形式。但我们必须稍微转换一下,以适应我们的Poly结果类型:
expand (Var x) = Product (Term x)

接下来,需要注意的是扩展加法很容易:只需扩展两个子表达式,然后将它们相加即可。

expand (Add x y) = Sum (expand x) (expand y)

那么乘法呢?这有点复杂,因为

Product (expand x) (expand y)

类型不匹配:我们不能将多项式相乘,只能将单项式相乘。但是我们知道如何进行代数运算,通过分配律将多项式的乘法转化为单项式的乘法之和。就像您的问题一样,我们需要一个名为mulPoly的函数。但让我们假设它已经存在,并稍后实现它。

expand (Mul x y) = mulPoly (expand x) (expand y)

这样就处理了所有情况,现在剩下的就是通过在两个多项式项之间分配乘法来实现mulPoly。我们只需逐项拆分其中一个多项式,将该项乘以另一个多项式中的每一项,然后将结果相加。

mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)

mulMonoBy :: Mono String -> Poly -> Poly
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
  where x' = Product x

最终,我们可以测试它是否按照预期工作:
expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
{- results in: Sum (Sum (Product (MonoMul (Term "y") (Term "a"))) 
                        (Product (MonoMul (Term "z") (Term "a")))) 
                   (Sum (Product (MonoMul (Term "y") (Term "b"))) 
                        (Product (MonoMul (Term "z") (Term "b"))))
-}

或者,
(a + b)(y * z) = ay + az + by + bz

我会尽力为您翻译。这段文字的意思是:我们知道这是正确的。下面是完整的解决方案,如上所述。
data E = Add E E | Mul E E | Var String

data Poly term = Sum (Poly term) (Poly term)
               | Product (Mono term) 
               deriving Show

data Mono term = Term term
               | MonoMul (Mono term) (Mono term)
               deriving Show

expand :: E -> Poly String
expand (Var x) = Product (Term x)
expand (Add x y) = Sum (expand x) (expand y)
expand (Mul x y) = mulPoly (expand x) (expand y)

mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)

mulMonoBy :: Mono String -> Poly String -> Poly String
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
  where x' = Product x

main = print $ expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))

1
这个答案有三个部分。第一部分是摘要,我将介绍我的两个最喜欢的解决方案,这是最重要的部分。第二部分包含类型和导入,以及对解决方案的详细评论。第三部分则关注于重新组合表达式的任务,这是原始版本的答案(即第二部分)没有给予足够重视的地方。
最终,我得出了两个值得讨论的解决方案。第一个是expandDirect(参见第三部分):
expandDirect :: E a -> E a
expandDirect = cata alg
    where
    alg = \case
        Var' s -> Var s
        Add' x y -> apo coalgAdd (Add x y)
        Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
    coalgAdd = \case
        Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
        x -> Left <$> project x
    coalgAdd' = \case
        Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
        Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
        x -> Left <$> project x
    coalgMul = \case
        Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
        Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
        x -> Left <$> project x

使用它,我们从底部(cata)重新构建树。在每个分支上,如果我们发现有无效的内容,我们就会回溯并重写子树(apo),根据需要重新分配和重新关联,直到所有直接子项都正确排列(apo使得不必将所有内容重写到最底部就能实现这一点)。
第二种解决方案expandMeta是第三节中expandFlat的大大简化版本。
expandMeta :: E a -> E a
expandMeta = apo coalg . cata alg
    where
    alg = \case
        Var' s -> pure (Var s)
        Add' x y -> x <> y
        Mul' x y -> Mul <$> x <*> y
    coalg = \case
        x :| [] -> Left <$> project x
        x :| (y:ys) -> Add' (Left x) (Right (y :| ys))

expandMeta 是一种变形;也就是说,它是一个折叠再加上一个展开(虽然我们在这里也使用了 apo,但是 apomorphism 只不过是一种花哨的展开,所以我想术语仍然适用)。折叠将树转换为非空列表——这隐式地处理了 Add 的重新关联——并使用列表应用程序来分配乘法(就像您建议的那样)。而代数结构则相当简单地将非空列表转换回具有适当形状的树。


感谢您的提问 - 我很开心能回答它!准备工作:

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.Functor.Foldable
import qualified Data.List.NonEmpty as N
import Data.List.NonEmpty (NonEmpty(..))
import Data.Semigroup
import Data.Foldable (toList)
import Data.List (nub)
import qualified Data.Map as M
import Data.Map (Map, (!))
import Test.QuickCheck

data E a = Var a | Add (E a) (E a) | Mul (E a) (E a)
    deriving (Eq, Show, Functor, Foldable)

data EF a b = Var' a | Add' b b | Mul' b b
    deriving (Eq, Show, Functor)

type instance Base (E a) = EF a

instance Recursive (E a) where
    project = \case
        Var x -> Var' x
        Add x y -> Add' x y
        Mul x y -> Mul' x y

instance Corecursive (E a) where
    embed = \case
        Var' x -> Var x
        Add' x y -> Add x y
        Mul' x y -> Mul x y

首先,我的第一个工作尝试(虽然有缺陷),使用(非空)列表的适用实例进行分配:
expandTooClever :: E a -> E a
expandTooClever = cata $ \case
    Var' s -> Var s
    Add' x y -> Add x y
    Mul' x y -> foldr1 Add (Mul <$> flatten x <*> flatten y)
    where
    flatten :: E a -> NonEmpty (E a)
    flatten = cata $ \case
        Var' s -> pure (Var s)
        Add' x y -> x <> y
        Mul' x y -> pure (foldr1 Mul (x <> y))

expandTooClever存在一个相对严重的问题:每当它遇到Mul时,它都会调用flatten,即针对两个子树进行全面折叠,这使得它在一系列Mul中具有可怕的渐近复杂度。

暴力、最简单的解决方案是使用递归调用自身的代数运算:

expandBrute :: E a -> E a
expandBrute = cata alg
    where
    alg = \case
        Var' s -> Var s
        Add' x y -> Add x y
        Mul' (Add x x') y -> Add (alg (Mul' x y)) (alg (Mul' x' y))
        Mul' x (Add y y') -> Add (alg (Mul' x y)) (alg (Mul' x y'))
        Mul' x y -> Mul x y

递归调用是必要的,因为分配可能在 Mul 下引入新的 Add 发生。稍微优雅一些的 expandBrute 变体,将递归调用分解为单独的函数。
expandNotSoBrute :: E a -> E a
expandNotSoBrute = cata alg
    where
    alg = \case
        Var' s -> Var s
        Add' x y -> Add x y
        Mul' x y -> dis x y
    dis (Add x x') y = Add (dis x y) (dis x' y)
    dis x (Add y y') = Add (dis x y) (dis x y')
    dis x y = Mul x y

一个驯服的expandNotSoBrute,将dis转化为一个同态。这种表述方式很好地表达了正在发生的大局:如果你只有VarAdd,那么你可以毫不费力地自下而上地重建树;但是,如果你遇到Mul,你就必须返回并重新构建整个子树以执行分配(我想知道是否有一种专门捕捉这种模式的递归方案)。
expandEvert :: E a -> E a
expandEvert = cata alg
    where
    alg :: EF a (E a) -> E a
    alg = \case
        Var' s -> Var s
        Add' x y -> Add x y
        Mul' x y -> apo coalg (x, y)
    coalg :: (E a, E a) -> EF a (Either (E a) (E a, E a))
    coalg (Add x x', y) = Add' (Right (x, y)) (Right (x', y))
    coalg (x, Add y y') = Add' (Right (x, y)) (Right (x, y'))
    coalg (x, y) = Mul' (Left x) (Left y)
apo是必要的,因为我们希望在没有其他分配内容时预测最终结果。(有一种使用ana编写它的方法;但是,这需要在没有更改的情况下浪费地重建Mul树,这会导致与expandTooClever相同的渐近问题。)最后但并非最不重要的,这个解决方案既成功地实现了我尝试使用expandTooClever所做的尝试,又是我对 amalloy的回答的解释。 BT是一棵普通的二叉树,其叶子节点上有值。一个乘积由BT a表示,而乘积和则是树形结构的树。
expandSOP :: E a -> E a
expandSOP = cata algS . fmap (cata algP) . cata algSOP
    where
    algSOP :: EF a (BT (BT a)) -> BT (BT a)
    algSOP = \case
        Var' s -> pure (pure s)
        Add' x y -> x <> y
        Mul' x y -> (<>) <$> x <*> y
    algP :: BTF a (E a) -> E a
    algP = \case
        Leaf' s -> Var s
        Branch' x y -> Mul x y
    algS :: BTF (E a) (E a) -> E a
    algS = \case
        Leaf' x -> x
        Branch' x y -> Add x y

"

BT和它的实例:

"
data BT a = Leaf a | Branch (BT a) (BT a)
    deriving (Eq, Show)

data BTF a b = Leaf' a | Branch' b b
    deriving (Eq, Show, Functor)

type instance Base (BT a) = BTF a

instance Recursive (BT a) where
    project (Leaf s) = Leaf' s
    project (Branch l r) = Branch' l r

instance Corecursive (BT a) where
    embed (Leaf' s) = Leaf s
    embed (Branch' l r) = Branch l r

instance Semigroup (BT a) where
    l <> r = Branch l r

-- Writing this, as opposed to deriving it, for the sake of illustration.
instance Functor BT where
    fmap f = cata $ \case
        Leaf' x -> Leaf (f x)
        Branch' l r -> Branch l r

instance Applicative BT where
    pure x = Leaf x
    u <*> v = ana coalg (u, v)
        where
        coalg = \case
            (Leaf f, Leaf x) -> Leaf' (f x)
            (Leaf f, Branch xl xr) -> Branch' (Leaf f, xl) (Leaf f, xr)
            (Branch fl fr, v) -> Branch' (fl, v) (fr, v)

总的来说,一个测试套件:

newtype TestE = TestE { getTestE :: E Char }
    deriving (Eq, Show)

instance Arbitrary TestE where
    arbitrary = TestE <$> sized genExpr
        where
        genVar = Var <$> choose ('a', 'z')
        genAdd n = Add <$> genSub n <*> genSub n
        genMul n = Mul <$> genSub n <*> genSub n
        genSub n = genExpr (n `div` 2)
        genExpr = \case
            0 -> genVar
            n -> oneof [genVar, genAdd n, genMul n]

data TestRig b = TestRig (Map Char b) (E Char)
    deriving (Show)

instance Arbitrary b => Arbitrary (TestRig b) where
    arbitrary = do
        e <- genExpr
        d <- genDict e
        return (TestRig d e)
        where
        genExpr = getTestE <$> arbitrary
        genDict x = M.fromList . zip (keys x) <$> (infiniteListOf arbitrary)
        keys = nub . toList

unsafeSubst :: Ord a => Map a b -> E a -> E b
unsafeSubst dict = fmap (dict !)

eval :: Num a => E a -> a
eval = cata $ \case
    Var' x -> x
    Add' x y -> x + y
    Mul' x y -> x * y

evalRig :: (E Char -> E Char) -> TestRig Integer -> Integer
evalRig f (TestRig d e) = eval (unsafeSubst d (f e))

mkPropEval :: (E Char -> E Char) -> TestRig Integer -> Bool
mkPropEval f = (==) <$> evalRig id <*> evalRig f

isDistributed :: E a -> Bool
isDistributed = para $ \case
    Add' (_, x) (_, y) -> x && y
    Mul' (Add _ _, _) _ -> False
    Mul' _ (Add _ _, _) -> False
    Mul' (_, x) (_, y) -> x && y
    _ -> True

mkPropDist :: (E Char -> E Char) -> TestE -> Bool
mkPropDist f = isDistributed . f . getTestE

main = mapM_ test
    [ ("expandTooClever" , expandTooClever)
    , ("expandBrute"     , expandBrute)
    , ("expandNotSoBrute", expandNotSoBrute)
    , ("expandEvert"     , expandEvert)
    , ("expandSOP"       , expandSOP)
    ]
    where
    test (header, func) = do
        putStrLn $ "Testing: " ++ header
        putStr "Evaluation test:   "
        quickCheck $ mkPropEval func
        putStr "Distribution test: "
        quickCheck $ mkPropDist func

我所说的“乘积和”是指类似于“柯里化”的n元乘积和。乘积和的简洁定义是,我想要一个没有任何括号的表达式,并且所有括号都由结合律和优先级表示。
我们可以调整上面的解决方案,使得求和重新结合。最简单的方法是将expandSOP中的外部BT替换为NonEmpty。考虑到那里的乘法,就像你建议的那样,是liftA2(<>),这样就可以直接工作了。
expandFlat :: E a -> E a
expandFlat = cata algS . fmap (cata algP) . cata algSOP
    where
    algSOP :: EF a (NonEmpty (BT a)) -> NonEmpty (BT a)
    algSOP = \case
        Var' s -> pure (Leaf s)
        Add' x y -> x <> y
        Mul' x y -> (<>) <$> x <*> y
    algP :: BTF a (E a) -> E a
    algP = \case
        Leaf' s -> Var s
        Branch' x y -> Mul x y
    algS :: NonEmptyF (E a) (E a) -> E a
    algS = \case
        NonEmptyF x Nothing -> x
        NonEmptyF x (Just y) -> Add x y

另一个选择是使用其他任何解决方案,并在单独的步骤中重新关联分布式树中的总和。
flattenSum :: E a -> E a
flattenSum = cata alg
    where
    alg = \case
        Add' x y -> apo coalg (x, y)
        x -> embed x
    coalg = \case
        (Add x x', y) -> Add' (Left x) (Right (x', y))
        (x, y) -> Add' (Left x) (Left y)

我们还可以将flattenSumexpandEvert合并为一个函数。请注意,当求和余代数得到分配余代数的结果时,需要额外考虑一种情况。这是因为当余代数从顶部向底部进行时,我们无法确定它生成的子树是否正确关联。保留HTML标签。
-- This is written in a slightly different style than the previous functions.
expandDirect :: E a -> E a
expandDirect = cata alg
    where
    alg = \case
        Var' s -> Var s
        Add' x y -> apo coalgAdd (Add x y)
        Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
    coalgAdd = \case
        Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
        x -> Left <$> project x
    coalgAdd' = \case
        Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
        Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
        x -> Left <$> project x
    coalgMul = \case
        Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
        Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
        x -> Left <$> project x

也许有更聪明的编写expandDirect的方法,但我还没有想出来。

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接