CayMon tutorial

In this tutorial, we discuss the theory behind the CayMon tool. Not to get too mathy (for that, read the FOSSACS'19 paper), we shamelessly intermingle the world of sets and functions with Haskell types and Haskell functions, so semantic purists are kindly asked to keep their cool.

Where do monads come from?

This is not a monad tutorial, so we assume that the reader is familiar with the concept, at least as it is used in programming. In Haskell, monads are defined as a type class. The one we show below is the one from Haskell's standard library stripped off some Haskelly noise (such as the Applicative constraint):

class Monad m where
  (>>=)   :: m a -> (a -> m b) -> m b
  return  :: a -> m a

Different flavours of monads are closely related to different flavours of algebraic theories. For example, a famous result by Fred Linton says that finitary equational theories are equivalent to finitary monads on the category Set. For example, the list monad

data [a] = a : [a] | []

is closely related to the theory of monoids:

Operations: +, 0
Equations:  (x+y)+z == x+(y+z),  x+0 == x,  0+x == x

The relationship is this: for a set A, the free monoid generated by A is given by [A], while the freeness gives the implementation of >>= and return (for a super-quick introduction, see here). The + operation is implemented as ++, while 0 as the empty list [].

The same story works for the Maybe monad:

data Maybe a = Just a | Nothing

It arises from the theory of pointed sets:

Operations: ! (nullary)
Equations:  (none)

The free pointed set generated by a set A is indeed Maybe A, while the ! operation is obviously implemented as Nothing.

There are many more examples. What do we gain by understanding the equational theory behind a monad? We get some operations that we can use in programming and we get to know their equational properties that we can use to reason about our programs (check out this paper for a detailed explanation and many more examples).

Monads from equational theories... in Set

If we consider sets and functions (as in mathematics), the trick with free algebras of (finitary or otherwise well-behaved) equational theories always produces a monad. So, if you want a particular effect, just write down a bunch of operations and equations and voilà, a monad is born:

Grandma's recipe for monads
  1. Start with an equational theory (a bunch of operations and equations).
  2. For a set A, take the set of terms over the signature with variables coming from A.
  3. Take the smallest congruence relation that includes the equations.
  4. Take the quotients of the set of terms by the congruene. This is your monad M A.

    Ok, it's fine if you happen to live in the perfect platonic world of sets and functions. But as Haskell programmers, we are not happy with the recipe. We can define terms over a signature (using free monads), but we cannot quotient! For example, free monoids can be expressed as lists, but throw in inverses (to get groups) and try to implement the free group monad. Not a chance!

    So, which equational theories induce implementable monads? Disappointingly, we will never know in general, because this problem is undecidable, I think. We can, however, try to better understand some particular cases or some broad class of particular cases.

    Monads from Cayley representations

    The class of particular cases that we want to look at comes from the following theorem:

    THEOREM If an equational theory has a well-behaved Cayley representation Rep, the monad induced by the theory can be implemented as the type

    data CayMon a = C (forall x. (a -> Rep x) -> Rep x

    The name CayMon stands for "Cayley monad". We do not expand on what we mean by "well-behaved", because we promised not to be too mathy in this tutorial.

    But what is this Cayley representation? First, let us discuss the particular case for monoids (Arthur Cayley was more into groups, but we, Haskell programmers, cannot afford inverses). The theory of monoids can be encoded as a type class with a conventional set of equations:

    class Monoid a where
      unit  :: a
      comp  :: a -> a -> a
    
    -- Instances should satisfy the following laws:
    -- comp unit x = x
    -- comp x unit = x
    -- comp (comp x y) z = comp x (comp y z)

    There is one particularly fancy monoid, or rather a family of monoids:

    data Rep v = Rep (v -> v)
    
    instance Monoid (Rep a) where
      unit                  = Rep (\x -> x)
      comp (Rep f) (Rep g)  = Rep (\x -> f (g x))

    The super-cool thing is that we can encode the behaviour of every monoid in this one:

    toCayley :: (Monoid a) => a -> Rep a
    toCayley x = Rep (\y -> comp x y)
    
    fromCayley :: Monoid a => Rep a -> a
    fromCayley (Rep f) = f unit
    It is the case that toCayley is a monoid homomorphism and that fromCayley . toCayley = id. These two facts (+ some additional conditions that make the representation "well-behaved") are enough for the theorem above to apply.

    Evidently, Rep is implementable in Haskell, so the theorem gives us an implementation of the monad induced by the theory of monoids. (BTW, if we inline the definition of Rep in the definition of CayMon, we get forall x. (a -> x -> x) -> x -> x, the Church representation of lists, as noticed by Ralf Hinze).

    The idea is to generalise this to different theories and settings. For example, type Rep x = S -> x turns out to represent state (where S is a type of possible states), while type Rep x = (S -> x) -> x represents backtracking with one global mutable cell.

    CayMon's recipe for monads
    1. Start with an equational theory (a bunch of operations and equations).
    2. Come up with a well-behaved Cayley representation of the theory given by a type constructor Rep.
    3. Your monad for a type A is given by the universally quantified type forall x. (A -> Rep x) -> Rep x.

      A careful reader might have noticed that the shape of the monad resembles the codensity monad. One important difference is that Rep is a type constructor, but not necessarily a functor - however, the implementation of the monadic structure (return and >>=) is actually identical in the continuation, codensity, and Cayley monads.

      Are we in a better place now?

      The task of coming up with an implementable representation for a given theory is not an easy one. For sure you cannot mechanise it, again, for complexity reasons. Our strategy is to reverse the question: given a type constructor Rep, produce a theory that is represented by Rep. We will not be able to see every such theory, just as we cannot recognise every theory that gives nondeterminism or state - people seem to be happy with the familiar ones: monoids and put and get with equations respectively.

      Thus, the overall goal is to explore the space of types, and see what kind of effects are implementable by exploring possible implementations. The space of types is more structured than the space of all equational theories, which means that we can recognise some more or less canonical classes. For now, we have something to say about polynomials with natural coefficients, that is, types of the form

      type Rep x = P x -> x

      where P x is any type formed from products, sums, the variable x, and natural numbers. A number n is understood as a type with n inhabitants, e.g., 0 is Void, 1 is () (unit), 2 is Bool, and so on.

      Enter CayMon

      WARNING: CayMon is a research-level exploratory software. Reading its source code might help you understand some details, but its overall internal chaos will give you a headache. Read source code responsibly!