Created
January 6, 2022 05:19
-
-
Save gelisam/176181aacda7f41fa643b50965ff2424 to your computer and use it in GitHub Desktop.
Defining a recursive datatype by its differences with another recursive datatype
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- | In response to https://twitter.com/_gilmi/status/1478846678409035779 | |
-- | |
-- The challenge is three-fold: | |
-- | |
-- 1. define the type 'Expr2' as "the same as 'Expr1' but with this constructor | |
-- instead of that one", without having to repeat all the other constructors. | |
-- 2. convert from 'Expr1' to 'Expr2' by only providing a translation for the | |
-- one constructor which is different. | |
-- 3. write a polymorphic function which works with both 'Expr1' and 'Expr2' | |
-- because it doesn't touch the constructor which differs. | |
-- | |
-- As you can guess from the list of language extensions, this is going to be a | |
-- wild ride. | |
{-# LANGUAGE AllowAmbiguousTypes #-} | |
{-# LANGUAGE DataKinds #-} | |
{-# LANGUAGE DeriveFunctor #-} | |
{-# LANGUAGE FlexibleContexts #-} | |
{-# LANGUAGE FlexibleInstances #-} | |
{-# LANGUAGE GADTs #-} | |
{-# LANGUAGE MultiParamTypeClasses #-} | |
{-# LANGUAGE PolyKinds #-} | |
{-# LANGUAGE ScopedTypeVariables #-} | |
{-# LANGUAGE TypeApplications #-} | |
{-# LANGUAGE TypeFamilies #-} | |
{-# LANGUAGE TypeOperators #-} | |
{-# LANGUAGE UndecidableInstances #-} | |
-- Before I dive into the dark magic which makes this possible, let me show the | |
-- end result. | |
-- 1. define the type 'Expr2' as "the same as 'Expr1' but with this constructor | |
-- instead of that one", without having to repeat all the other constructors. | |
data SingleLamF r = SingleLam String r deriving Functor | |
data MultiLamF r = MultiLam [String] r deriving Functor | |
data AppF r = App r [r] deriving Functor | |
type Expr1 = Fix '[SingleLamF, AppF] | |
type Expr2 = ReplaceCtor SingleLamF MultiLamF Expr1 | |
-- Obviously, this is a very unconventional way to define 'Expr1' and 'Expr2', | |
-- but the result is morally equivalent to this: | |
-- | |
-- > data Expr1 | |
-- > = SingleLam String Expr1 | |
-- > | App Expr1 [Expr1] | |
-- > | |
-- > data Expr2 | |
-- > = MultiLam [String] Expr2 | |
-- > | App Expr2 [Expr2] | |
-- 2. convert from 'Expr1' to 'Expr2' by only providing a translation for the | |
-- one constructor which is different. | |
oneToTwo | |
:: Expr1 -> Expr2 | |
oneToTwo | |
= replaceCtor @SingleLamF @MultiLamF $ \(SingleLam x e) | |
-> MultiLam [x] e | |
-- Pretty self-explanatory: we convert from 'Expr1' to 'Expr2' by converting | |
-- the 'SingleLam' constructor to the 'MultiLam' constructor. | |
-- 3. write a polymorphic function which works with both 'Expr1' and 'Expr2' | |
-- because it doesn't touch the constructor which differs. | |
reverseArgs | |
:: HasCtor fix AppF | |
=> fix -> fix | |
reverseArgs | |
= editCtor @AppF $ \(App e args) | |
-> App e (reverse args) | |
-- Pretty self-explanatory: this polymorphic function works with any of our | |
-- fancily-defined @Fix '[...]@ datatypes, as long as that list contains the | |
-- constructor 'AppF'. | |
-- Let's demonstrate that it works with both 'Expr1' and 'Expr2': | |
reverseArgs1 | |
:: Expr1 -> Expr1 | |
reverseArgs1 | |
= reverseArgs | |
reverseArgs2 | |
:: Expr2 -> Expr2 | |
reverseArgs2 | |
= reverseArgs | |
-- All right, now that we're suitably impressed, let's take a look under the | |
-- hood. | |
-- First, note that even though the 'App' constructor is ostensibly unchanged, | |
-- @App Expr1 [Expr1]@ is actually different from @App Expr2 [Expr2]@ in that | |
-- the recursive type is different. We thus need to parameterize our | |
-- constructors by the type 'r' of the recursive datatype. This is the same | |
-- idea as a "base functor" in recursion-schemes. | |
-- | |
-- > data AppF r = App r [r] deriving Functor | |
-- Just like in recursion-schemes, we can then turn a base functor into a | |
-- recursive datatype by defining a 'Fix' datatype which replaces that 'r' with | |
-- itself. | |
-- | |
-- > newtype Fix f = Fix | |
-- > { unFix :: f (Fix f) } | |
-- But wait! We don't want a recursive datatype which only has 'App' as its | |
-- sole constructor, we want to allow any constructor from a given list. We | |
-- thus use an extensible sum, 'CoRec', to turn our list of functors into a | |
-- single functor. | |
newtype Fix ctors = Fix | |
{ unFix :: CoRec ctors (Fix ctors) | |
} | |
-- @CoRec '[f, g, h] a@ is isomorphic to @Either (f a) (Either (g a) (h a))@. | |
data CoRec ctors r where | |
Here | |
:: ctor r -> CoRec (ctor ': ctors) r | |
There | |
:: CoRec ctors r -> CoRec (ctor ': ctors) r | |
-- Now that our types are defined via a type-level list of constructors, | |
-- defining one type in terms of another type is pretty straightforward: just | |
-- write a type family which transforms one list into a different list. | |
type family ReplaceCtor needle replacement fixCtors where | |
ReplaceCtor needle replacement (Fix ctors) | |
= Fix (Replace (Find needle ctors) replacement ctors) | |
type family Find (needle :: k) | |
(xs :: [k]) | |
:: Nat | |
where | |
Find needle (needle ': xs) | |
= 'Z | |
Find needle (x ': xs) | |
= 'S (Find needle xs) | |
type family Replace (i :: Nat) | |
(replacement :: k) | |
(xs :: [k]) | |
:: [k] | |
where | |
Replace 'Z replacement (_ ': xs) | |
= replacement ': xs | |
Replace ('S i) replacement (x ': xs) | |
= x ': Replace i replacement xs | |
-- Note that we're using an inductively-defined 'Nat', we're not using the | |
-- builtin type-level 'Nat' from "GHC.TypeLits". This is because I'll write | |
-- instances for @Z@ and @S i@ in a moment, and writing instances for @0@ and | |
-- @i + 1@ is more difficult because @i + 1@ is an expression. | |
data Nat | |
= Z | |
| S Nat | |
-- The instances in question are for 'replaceCtor', which has a pretty long | |
-- type I'll explain in a moment. | |
replaceCtor | |
:: forall needle replacement fix1 fix2 ctors1 i | |
. ( fix1 ~ Fix ctors1 | |
, fix2 ~ ReplaceCtor needle replacement fix1 | |
, i ~ Find needle ctors1 | |
, ReplaceCtorImpl i needle replacement ctors1 fix1 fix2 | |
) | |
=> (needle fix2 -> replacement fix2) | |
-> fix1 -> fix2 | |
replaceCtor fNeedle | |
= fFix | |
where | |
fFix :: fix1 -> fix2 | |
fFix (Fix corec1) | |
= Fix (replaceCtorImpl @i @needle @replacement fFix fNeedle corec1) | |
-- Recall that we called 'replaceCtor' as | |
-- | |
-- > replaceCtor @SingleLamF @MultiLamF ... | |
-- | |
-- Thus the first two type variables at the beginning of the 'forall', 'needle' | |
-- and 'replacement', are respectively the constructor we're replacing and the | |
-- constructor we're replacing it with. If we knew that the only other | |
-- constructor was 'AppF', the type of 'replaceCtor' would be a lot simpler: | |
-- | |
-- > replaceCtor | |
-- > :: forall needle replacement | |
-- > . (needle (Fix '[needle, App]) -> replacement (Fix '[replacement, App])) | |
-- > -> Fix '[needle, App] -> Fix '[replacement, App] | |
-- | |
-- which can be further simplified by introducing temporary type synonyms: | |
-- | |
-- > replaceCtor | |
-- > :: forall needle replacement fix1 fix2 | |
-- > . ( fix1 ~ Fix '[needle, App] | |
-- > , fix2 ~ Fix '[replacement, App] | |
-- > ) | |
-- > => (needle fix1 -> replacement fix2) | |
-- > -> fix1 -> fix2 | |
-- | |
-- Of course, we want to support a lot more constructor lists than that; | |
-- namely, any two lists 'ctors1' and 'ctors2' which differ only at one | |
-- location 'i', where 'ctors1' contains 'needle' at that index and 'ctors2' | |
-- contains 'replacement' instead. We can already express this using our | |
-- existing type families, without even having to name 'ctors2'. | |
-- | |
-- > replaceCtor | |
-- > :: forall needle replacement fix1 fix2 ctors1 | |
-- > . ( fix1 ~ Fix ctors1 | |
-- > , fix2 ~ ReplaceCtor needle replacement fix1 | |
-- > ) | |
-- > => (needle fix2 -> replacement fix2) | |
-- > -> fix1 -> fix2 | |
-- | |
-- Finally, the last two constraints are needed by the implementation. | |
-- | |
-- > replaceCtor | |
-- > :: forall needle replacement fix1 fix2 ctors1 i | |
-- > . ( ... | |
-- > , i ~ Find needle ctors1 | |
-- > , ReplaceCtorImpl i needle replacement ctors1 fix1 fix2 | |
-- > ) ... | |
-- | |
-- 'i' is again a temporary type synonym, while 'ReplaceCtorImpl' is a | |
-- typeclass which recursively constructs an implementation by using two | |
-- non-overlapping instances to match on the 'Z' and 'S' constructors of 'i'. | |
-- Let's look at that 'ReplaceCtorImpl' typeclass next. It it ostensibly a | |
-- multi-parameter class, but in practice it is only the 'i' parameter which is | |
-- used to select the instance. | |
class ReplaceCtorImpl i needle replacement ctors1 fix1 fix2 where | |
replaceCtorImpl | |
:: ( i ~ Find needle ctors1 | |
, ctors2 ~ Replace i replacement ctors1 | |
) | |
=> (fix1 -> fix2) | |
-> (needle fix2 -> replacement fix2) | |
-> CoRec ctors1 fix1 | |
-> CoRec ctors2 fix2 | |
-- The method 'replaceCtorImpl' again defines temporary type synonyms, 'i' and | |
-- 'ctors2'. It takes a @fix1 -> fix2@ function for recursively transforming | |
-- the subtrees (it is 'replaceCtor' which ties the knot by defining 'fFix' in | |
-- terms of 'fFix'), and a function @needle fix2 -> replacement fix2@ to | |
-- replace the constructor after its recursive occurrences of 'fix1' have | |
-- already been replaced by 'fix2'. | |
-- There are two instances of 'ReplaceCtorImpl' one for 'Z' and one for 'S'. | |
instance ( ctors1 ~ (needle ': xs) | |
, Functor needle | |
) | |
=> ReplaceCtorImpl 'Z needle replacement ctors1 fix1 fix2 | |
where | |
replaceCtorImpl fFix fNeedle (Here needle) | |
= Here $ fNeedle $ fmap fFix $ needle | |
instance ( ctors1 ~ (x ': xs) | |
, Functor x | |
, i ~ Find needle xs | |
, ReplaceCtorImpl i needle replacement xs fix1 fix2 | |
) | |
=> ReplaceCtorImpl ('S i) needle replacement ctors1 fix1 fix2 | |
where | |
replaceCtorImpl fFix _fNeedle (Here x) | |
= Here $ fmap fFix $ x | |
replaceCtorImpl fFix fNeedle (There y) | |
= There $ replaceCtorImpl @i @needle @replacement fFix fNeedle y | |
-- The body of the 'replaceCtorImpl' implementations are fairly | |
-- straightforward: 'fix1' and 'needle' are converted to 'fix2' and | |
-- 'replacement' as needed, and we recur on a smaller list of constructors. | |
-- It's the long list of constraints which is less straightforward. | |
-- | |
-- Most of those constraints repeat the implementation of 'Find' and 'Replace', | |
-- and are guaranteed to hold because of the way in which we have computed 'i'. | |
-- It's just not obvious to ghc that they are guaranteed to hold, so we have to | |
-- write them down explicitly, but they will be discharged automatically once | |
-- ghc is given concrete constructor lists, because ghc will then be able to | |
-- run the type families and confirm that the equalities hold. See [1] for more | |
-- details about this limitation, and a very long list of workarounds. | |
-- | |
-- The only constraints which are not guaranteed to hold are the Functor | |
-- constraints. Thus, in practice, a 'ReplaceCtorImpl' constraint asks that all | |
-- the base functors in the list of constructors have a Functor instance, which | |
-- seems reasonable. | |
-- | |
-- [1] https://github.com/gelisam/typelevel-rewrite-rules#the-problem | |
-- Finally, let's look at 'editCtor'. This one is really easy: | |
-- | |
-- > editCtor @App | |
-- | |
-- is simply | |
-- | |
-- > replaceCtor @App @App | |
-- | |
-- But in order to hide 'replaceCtor''s pile of constraints from the API, I use | |
-- a variant of the "constraint synonym trick": | |
-- | |
-- > class (Eq a, Ord a) => EqAndOrd a | |
-- > instance (Eq a, Ord a) => EqAndOrd a | |
-- | |
-- Except that I only need the synonym to go from 'HasCtor' to its definition, | |
-- not the other way around, so I only need to put the constraints on the | |
-- instance. | |
class HasCtor fix needle where | |
editCtorImpl | |
:: (needle fix -> needle fix) | |
-> fix -> fix | |
instance ( fix ~ Fix ctors | |
, i ~ Find needle ctors | |
, Replace i needle ctors ~ ctors | |
, ReplaceCtorImpl i needle needle ctors fix fix | |
) | |
=> HasCtor fix needle | |
where | |
editCtorImpl | |
= replaceCtor | |
editCtor | |
:: forall needle fix | |
. HasCtor fix needle | |
=> (needle fix -> needle fix) | |
-> fix -> fix | |
editCtor | |
= editCtorImpl @fix @needle |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment