如果你手头唯一的工具是锤子,那么你可能会把所有东西都当成钉子。
亚伯拉罕·马斯洛。
那么我们来考虑一些不同的东西——一个非 Monad
类型的独特供应。恰好情况如此,您的原始类型签名已经接近了:
compileStatement :: Statement -> UniqueIDState -> [AbstractInstruction]
如果唯一的要求是每个标签都是唯一的 - 不需要计算使用了多少个,只要在相同情况下给出相同的标识符即可 - 那么您可以使用一种不太侵入性的技术。
在
无IO的可分割供应品 中,Luke Palmer 展示了如何将
价值供应品 进行封装。
runSupply :: (forall a . Eq a => Supply a -> b) -> b
这样可以避免单子类型
IO
污染使用它们的程序的大部分:
不错! 但这并不是唯一的问题 - 根据它们如何定义,你需要正确地使用这些供应品。例如,假设:
data Statement =
... | If Statement Statement Statement | ...
然后如果:
compileStatement (If c t e) s =
case split s of
s1 : s2 : s3 : _ -> buildCondJump (compileStatement c s1)
(compileStatement t s2)
(compileStatement e s3)
被错误地更改为:
compileStatement (If c t e) s =
case split s of
s1 : s2 : s3 : _ -> buildCondJump (compileStatement c s)
(compileStatement t s)
(compileStatement e s)
不仅UniqueSupply
和Unique
的值被错误地重复使用,如果compileStatement
的任何递归调用密集使用了supply,则存在空间泄漏的风险。
与
Clean不同,Haskell没有标准的方法将类型标记为单态(monousal)。这就只能在运行时进行检查:绝对是一个抽象数据类型的工作!
有个想法——如果该ADT也是可分割的(splittable),我们就可以使用它来定义一个替代的值供应类型。如果一切顺利,这种新类型的值将具有两个属性:可分割(splittable)和单态(monousal)。
查看
Data.Supply
可以发现使用了二叉树类型,该模块和定义似乎基于功能珠玑 [On generating unique names] 中的以下示例,由Lennart Augustsson、Mikael Rittri和Dan Synek编写-第4页共7页。
module HideGensym(
Name, NameSupply, initialNameSupply, getNameDeplete, splitNameSupply)
where
gensym :: a -> Int
data Name = MkName Int deriving (Eq)
data NameSupply = MkNameSupply Name NameSupply NameSupply
initialNameSupply = gen ()
where gen x = MkNameSupply (MkName (gensym x)) (gen x) (gen x)
getNameDeplete (MkNameSupply n s1 _) = (n, s1)
splitNameSupply (MkNameSupply _ s1 s2) = (s1, s2)
...gensym
:我们暂且不管它。让我们看看如何将新的 ADT 插入到 NameSupply
中……在我们处理更加平凡的事情之前:- 两个相同的对 gen
的调用 - MkNameSupply ... (gen x) (gen x)
- 对于一个优化的 Haskell 实现,它们是相同的值:
initialNameSupply = gen ()
where gen x = let s = gen x in
MkNameSupply (MkName (gensym x)) s s
也许我们可以一举解决这两个问题:
initialNameSupply = runUO gen
where gen u = let !(u1, u2) = splitUO2 u in
MkNameSupply (MkName (gensym ())) (gen u1) (gen u2)
这里 UO
将成为我们的新的一次性拆分准备好的抽象数据类型:
module UO(
UO, initUO, splitUO, splitUO2, ...
) where
data UO s ...
runUO :: (forall s . UO s -> a) -> a
splitUO :: UO s -> [UO s]
splitUO2 :: UO s -> (UO s, UO s)
⋮
...也可以被封装。
(毫无疑问,在英语中一定有比 spiltable splittable 更好的词语...)
现在让我们来谈谈关于“gensym”问题——让我们从第4页的“HideGensym”模块开始,先发出这样一个警告:
必须使用汇编语言编写gensym
[物品],可能还要编写gen
函数。
否则,那个单独调用gensym
的命令可能会全部移除:还记得gen()
吗?
initialNameSupply = runUO gen
where gen u = let !(u1, u2) = splitUO2 u in
MkNameSupply (MkName x) (gen u1) (gen u2)
x = gensym ()
由于gensym
(据说!)接受任何类型的输入:
gensym :: a -> Int
这不应该破坏任何东西:
initialNameSupply = runUO gen
where gen u = let !(u1:u2:u3:_) = splitUO u in
MkNameSupply (MkName (gensym u1)) (gen u2) (gen u3)
作为额外的奖励,我们可以制作一个稍微更通用的
initialNameSupply
版本:
initialNameSupply = initialSupply gensym
initialSupply :: (UO s -> Int) -> NameSupply
initialSupply g = runUO gen
where gen u = let !(u1:u2:u3:_) = splitUO u in
MkNameSupply (MkName (g u1)) (gen u2) (gen u3)
好的,所以gensym
还在那里-至少现在它被隔离了。
到目前为止,您可能已经注意到另一个示例模块
OneTimeSupplies
,它有自己的警告语:
仅当每个供应品最多使用一次时,它才是引用透明的。
此外,在第3页的后面:
如果程序的编译时分析可以保证每个名称提供只被使用一次,无论是执行getNameDeplete
还是splitNameSupply
,则树就变得不必要了[...]
由于我们依赖
UO
来提供相同的保证,那么我们是否也可以在实现中去掉树并节省一些工作呢?
为了做到这一点,
supplyValue
和
split
将需要升级:
the simplest option for supplyValue
is to provide it with the generator (g
in initialSupply
).
data NameSupply = forall s . Supply (UO s -> Int) ...
supplyValue :: NameSupply -> Name
supplyValue (Supply g ...) = MkName (g ...)
as for the split
s, they require an UO
value so they can obtain the new UO
values needed by the new supplies:
data NameSupply = forall s . Supply (UO s) ...
split :: NameSupply -> [NameSupply]
split (Supply u ...) = [ Supply v ... | v <- splitUO u ]
split2 :: NameSupply -> (NameSupply, NameSupply)
split2 (Supply u ...) = let !(u1, u2) = splitUO2 u in
(Supply u1 ..., Supply u2 ...)
那明显表明:
data NameSupply = forall s . Supply (UO s -> Int) (UO s)
supplyValue (Supply g u) = MkName (g u)
split (Supply g u) = [ Supply g v | v <- splitUO u ]
split2 (Supply g u) = let !(u1, u2) = splitUO2 u in
(Supply g u1, Supply g u2)
但是它也适用于initialNameSupply
吗?
initialNameSupply = initialSupply gensym
initialSupply :: (UO s -> Int) -> NameSupply
initialSupply = runUO . Supply
事情会变得更好:
type NameSupply = Supply Name
data Name = MkName Int deriving (Eq)
initialNameSupply = initialSupply (MkName . gensym)
data Supply a = forall s . Supply (UO s -> a) (UO s)
instance Functor Supply where
fmap f (Supply g u) = Supply (f . g) u
supplyValue :: Supply a -> a
supplyValue (Supply g u) = g u
split :: Supply a -> [Supply a]
split (Supply g u) = [ Supply g v | v <- splitUO u ]
split2 :: Supply a -> (Supply a, Supply a)
split2 (Supply g u) = let !(u1, u2) = splitUO2 u in
(Supply g u1, Supply g u2)
initialSupply :: (UO s -> a) -> NameSupply
initialSupply = runUO . Supply
这非常有前途,如果可以按照预期定义UO
和其关联对象...
如果您已经阅读了Luke Palmer的
帖子,那么您已经知道他使用一个
丑陋的
不安全的
实体来定义
runSupply
。现在(2022年1月),
runST
以类似的方式
定义:
runST :: (forall s. ST s a) -> a
runST (ST st_rep) = case runRW# st_rep of (# _, a #) -> a
在哪里:
newtype ST s a = ST (STRep s a)
type STRep s a = State# s -> (# State# s, a #
runRW# :: STRep RealWorld a -> (# State# RealWorld, a #)
不使用这些方法能定义 UO
吗?这可能值得单独回答 - 现在,我们只能容忍这种丑陋:
{-# LANGUAGE BangPatterns, RankNTypes, UnboxedTuples, MagicHash #-}
module UO(
UO, runUO, splitUO, splitUO2,
useUO, asUO,
) where
import Prelude (String, Eq(..))
import Prelude ((.), ($), (++), error, all)
import Data.Char (isSpace)
import GHC.Base (State#, MutVar#)
import GHC.Base (runRW#, newMutVar#, noDuplicate#)
import GHC.Exts (atomicModifyMutVar#)
import GHC.ST (ST(..), STRep)
data UO s = UO (UO# s)
runUO :: (forall s . UO s -> a) -> a
runUO g = let (# _, r #) = runRW# (useUO# (g . UO)) in r
splitUO :: UO s -> [UO s]
splitUO u = let !(u1, u2) = splitUO2 u in u1 : splitUO u
splitUO2 :: UO s -> (UO s, UO s)
splitUO2 (UO h) = let (# h1, h2 #) = splitUO2# h in (UO h1, UO h2)
useUO :: (UO s -> a) -> ST s a
useUO g = ST (\s -> useUO# (g . UO) s)
asUO :: Eq a => String -> ST s a -> UO s -> a
asUO name (ST act) (UO h)
= asUO# name act h
type UO# s = String -> State# s
splitUO2# :: UO# s -> (# UO# s, UO# s #)
splitUO2# h = let !s = h "splitUO2"
(# s', h1 #) = dispense# s
(# _, h2 #) = dispense# s'
in (# h1, h2 #)
useUO# :: (UO# s -> a) -> STRep s a
useUO# g s = let (# s', h #) = dispense# s
!r = g h
in (# s', r #)
dispense# :: STRep s (UO# s)
dispense# s = let (# s', r #) = newMutVar# () s
in (# s', expire# s' r #)
expire# :: State# s -> MutVar# s () -> String -> State# s
expire# s r name = let (# s', () #) = atomicModifyMutVar# r use s
in s'
where
use x = (error nowUsed, x)
nowUsed = name' ++ ": already expired"
name' = if all isSpace name then "(unknown)"
else name
asUO# :: Eq a => String -> STRep s a -> UO# s -> a
asUO# name act h = let (# _, t #) = act (noDuplicate# (h name)) in t
这可能比严格必要的复杂一点(例如,基本的重用错误报告),但作为交换,UO
基于定义现在可以操作本地状态...
还有一个在Data.Supply
中需要实现的定义:
newSupply :: a -> (a -> a) -> IO (Supply a)
newSupply start next = gen =<< newIORef start
where gen r = unsafeInterleaveIO
$ do v <- unsafeInterleaveIO (atomicModifyIORef r upd)
ls <- gen r
rs <- gen r
return (Node v ls rs)
upd a = let b = next a in seq b (b, a)
这将结束对 gensym
的需求。它与 initialSupply
有些相似 - 能否让这更明显?
gen
in the original initialNameSupply
doesn't have a reference parameter r
:
newSupply start next = do r <- newIORef start
let gen = unsafeInterleaveIO $
do v <- unsafeInterleaveIO (atomicModifyIORef r upd)
ls <- gen
rs <- gen
return (Node v ls rs)
gen
where upd a = let b = next a in seq b (b, a)
the value-action unsafeInterleaveIO (atomicModifyIORef r upd)
performs the role of gensym
in the original initialNameSupply
:
newSupply start next = do r <- newIORef start
let gen = unsafeInterleaveIO $
do v <- genval
ls <- gen
rs <- gen
return (Node v ls rs)
genval = unsafeInterleaveIO (atomicModifyIORef r upd)
gen
where upd a = let b = next a in seq b (b, a)
gen
in the original initialNameSupply
had no need of do
-notation:
newSupply start next = do r <- newIORef start
let gen = unsafeInterleaveIO (liftM3 Node genval gen gen)
genval = unsafeInterleaveIO (atomicModifyIORef r upd)
gen
where upd a = let b = next a in seq b (b, a)
does genval
have to be in that let
-binding?
newSupply start next = do r <- newIORef start
let gen = unsafeInterleaveIO (liftM3 Node (genval r) gen gen)
gen
where genval r = unsafeInterleaveIO (atomicModifyIORef r upd)
upd a = let b = next a in seq b (b, a)
upd
is only used in genval
:
newSupply start next = do r <- newIORef start
let gen = unsafeInterleaveIO (liftM3 Node (genval r) gen gen)
gen
where genval r = let upd a = let b = next a in seq b (b, a)
in unsafeInterleaveIO (atomicModifyIORef r upd)
can some content in genval
be moved to a separate definition?
newSupply start next = do r <- newIORef start
let gen = unsafeInterleaveIO (liftM3 Node (genval r) gen gen)
gen
where genval r = unsafeInterleaveIO (nextValue r next)
nextValue :: IORef a -> (a -> a) -> IO a
nextValue r next = let upd a = let b = next a in seq b (b, a)
in atomicModifyIORef r upd
现在它更清晰地类似于原始的initialNameSupply
,使用我们的新Supply
类型重新实现newSupply
相对简单 - 首先,需要改变单子类型:
newSupply start next = do r <- newSTRef start
let gen = unsafeInterleaveST (liftM3 Node (genval r) gen gen)
gen
where genval r = unsafeInterleaveST (nextValue r next)
nextValue :: STRef s a -> (a -> a) -> ST s a
nextValue r next = let upd a = let b = next a in seq b (b, a)
in atomicModifyST r upd
nextValue
不需要进行其他更改。至于 newSupply
:
newSupply :: Eq a => a -> (a -> a) -> ST s (Supply a)
newSupply start next = do r <- newSTRef start
let g = asUO "genval" (genval r)
useUO (Supply g)
where genval r = nextValue r next
我们可以使用这个函数来定义我们的版本 runSupply
:
runSupply :: (forall a . Eq a => Supply a -> b) -> b
runSupply f = f (runST (newSupply (0 :: Int) succ))
我们现在能否从NameSupply
类型中最终删除gensym
?
initialNameSupply :: NameSupply
initialNameSupply = fmap MkName (initialSupply 0 succ)
initialSupply :: Eq a => a -> (a -> a) -> Supply a
initialSupply start next = runST (newSupply start next)
是的。
这里是所有相关定义,按模块排列:
ExpelGensym
, the replacement for HideGensym
on page 4 of 7:
{-# LANGUAGE BangPatterns #-}
module ExpelGensym(
Name, NameSupply, initialNameSupply, getNameDeplete, splitNameSupply
) where
import Control.Monad (liftM)
import Control.Monad.ST (runST)
import Supply (Supply, newSupply, supplyValue, split2)
data Name = MkName Int deriving (Eq)
type NameSupply = Supply Name
initialNameSupply :: Supply Name
initialNameSupply = fmap MkName (initialSupply 0 succ)
getNameDeplete :: NameSupply -> (Name, NameSupply)
getNameDeplete s = let !(s1, s2) = split2 s in (supplyValue s1, s2)
splitNameSupply :: NameSupply -> (NameSupply, NameSupply)
splitNameSupply = split2
initialSupply :: Eq a => a -> (a -> a) -> Supply a
initialSupply start next = runST (newSupply start next)
Supply
, our miniature implementation of Data.Supply
:
{-# LANGUAGE BangPatterns, ExistentialQuantification, RankNTypes #-}
module Supply(
Supply, newSupply, runSupply, supplyValue, split, split2
) where
import Control.Monad.ST
import Data.STRef
import UO
data Supply a = forall s . Supply (UO s -> a) (UO s)
instance Functor Supply where
fmap f (Supply g u) = Supply (f . g) u
newSupply :: Eq a => a -> (a -> a) -> ST s (Supply a)
newSupply start next = do r <- newSTRef start
let g = asUO "genval" (genval r)
useUO (Supply g)
where genval r = nextValue r next
runSupply :: (forall a . Eq a => Supply a -> b) -> b
runSupply f = f (runST (newSupply (0 :: Int) succ))
supplyValue :: Supply a -> a
supplyValue (Supply g u) = g u
split :: Supply a -> [Supply a]
split (Supply g u) = [ Supply g v | v <- splitUO u ]
split2 :: Supply a -> (Supply a, Supply a)
split2 (Supply g u) = let !(u1, u2) = splitUO2 u in
(Supply g u1, Supply g u2)
nextValue :: STRef s a -> (a -> a) -> ST s a
nextValue r next = let upd a = let b = next a in seq b (b, a)
in atomicModifySTRef r upd
UO
, that use-once split-ready abstract-data type:
{-# LANGUAGE BangPatterns, RankNTypes, UnboxedTuples, MagicHash #-}
module UO(
UO, runUO, splitUO, splitUO2,
useUO, asUO,
) where
import Prelude (String, Eq(..))
import Prelude ((.), ($), (++), error, all)
import Data.Char (isSpace)
import GHC.Base (State#, MutVar#)
import GHC.Base (runRW#, newMutVar#, noDuplicate#)
import GHC.Exts (atomicModifyMutVar#)
import GHC.ST (ST(..), STRep)
data UO s = UO (UO# s)
runUO :: (forall s . UO s -> a) -> a
runUO g = let (# _, r #) = runRW# (useUO# (g . UO)) in r
splitUO :: UO s -> [UO s]
splitUO u = let !(u1, u2) = splitUO2 u in u1 : splitUO u
splitUO2 :: UO s -> (UO s, UO s)
splitUO2 (UO h) = let (# h1, h2 #) = splitUO2# h in (UO h1, UO h2)
useUO :: (UO s -> a) -> ST s a
useUO g = ST (\s -> useUO# (g . UO) s)
asUO :: Eq a => String -> ST s a -> UO s -> a
asUO name (ST act) (UO h)
= asUO# name act h
type UO# s = String -> State# s
splitUO2# :: UO# s -> (# UO# s, UO# s #)
splitUO2# h = let !s = h "splitUO2"
(# s', h1 #) = dispense# s
(# _, h2 #) = dispense# s'
in (# h1, h2 #)
useUO# :: (UO# s -> a) -> STRep s a
useUO# g s = let (# s', h #) = dispense# s
!r = g h
in (# s', r #)
dispense# :: STRep s (UO# s)
dispense# s = let (# s', r #) = newMutVar# () s
in (# s', expire# s' r #)
expire# :: State# s -> MutVar# s () -> String -> State# s
expire# s r name = let (# s', () #) = atomicModifyMutVar# r use s
in s'
where
use x = (error nowUsed, x)
nowUsed = name' ++ ": already expired"
name' = if all isSpace name then "(unknown)"
else name
asUO# :: Eq a => String -> STRep s a -> UO# s -> a
asUO# name act h = let (# _, t #) = act (noDuplicate# (h name)) in t