As promised, I’m finally going to get to the theory behind monads. As a quick review, the basic idea of the monad in Haskell is a hidden transition function – a monad is, basically, a state transition function.
The theory of monads comes from category theory. I’m going to assume you know a little bit about category theory – if you have trouble with it, go take a look at my introductory posts here.
Fundamentally, in category theory a monad is a category with a particular kind of structure. It’s a category with one object. That category has a collection of arrows which (obviously) are from the single object to itself. That one-object category has a functor from the category to itself. (As a reminder, a functoris an arrow between categories in the category of (small) categories.)
The first trick to the monad, in terms of theory, is that it’s fundamentally about the functor: since the functor maps from a category to the same category, so you can almost ignore the category; it’s implicit in the definition of the functor. So we can almost treat the monad as if it were just the functor – that is, a kind of transition function.
The other big trick is closely related to that. For the programming language application of monads, we can think of the single object in the category as the set of all possible states. So we have a category object, which is essentially the collection of all possible states; and there are arrows between the states representing possible state transitions. So the monad’s functor is really just a mapping from arrows to different arrows – which basically represents the way that changing the state causes a change in the possible transitions to other states.
So what a monad gives us, in terms of category theory, in a conceptual framework that captures the concept of a state transition system, in terms of transition functions that invisibly carry a state. When that’s translated into programming languages, that becomes a value that implicitly takes an input state, possibly updates it, and returns an output state. Sound familiar?
Let’s take a moment and get formal. As usual for category theory, first there are some preliminary definitions.
- Given a category, C, 1C is the identity functor from C to C.
- Given a category C with a functor T : C → C, T2 = T º T.
- Given a functor T, 1T : T → T is the natural transformation from T to T.
Now, with that out of the way, we can give the complete formal definition of a monad. Given a category C, a monad on C is a triple: (T:C→C, η:1C→T, μ:T2 → T), where T is a functor, and η and μ are natural transformations. The members of the triple must make the following two diagrams commute.
Commutativity of composition with μ
Commutativity of composition with η
What these two diagrams mean is that successive applications of the state-transition functor over C behave associatively – that any sequence of composing monadic functors will result in a functor with full monadic structure; and that the monadic structure will always preserve. Together, these mean that any sequence of operations (that is, applications of the monad functor) are themselves monad functors – so the building a sequence of monadic state transformers is guaranteed to behave as a proper monadic state transition – so what happens inside of the monadic functor is fine – to the rest of the universe, the difference between a sequence and a single simple operation is indistinguishable: the state will be consistently passed from application to application with the correct chaining behavior, and to the outside world, the entire monadic chain looks like like a single atomic monadic operation.
Now, what does this mean in terms of programming? Each element of a monadic sequence in Haskell is an instantiation of the monadic functor – that is, it’s an arrow between states – a function, not a simple value – which is the basic trick to monads. They look like a sequence of statements; in fact, each statement in a monad is actually a function from state to state. And it looks like we’re writing sequence code – when what we’re actually doing is writing function compositions – so that when we’re done writing a monadic sequence, what we’ve actually done is written a function definition in terms of a sequence of function compositions.
Understanding that, we can now clearly understand why we need the return function to use a non-monad expression inside of a monadic sequence – because each step in the sequence needs to be an instance of the monadic functor; an expression that isn’t an instance of the monadic functor couldn’t be composed with the functions in the sequence. The return function is really nothing but a function that combines a non-monadic expression with the id functor.
In light of this, let’s go back and look at the definition of
Monad in the Haskell standard prelude.
class Functor f where fmap :: (a -> b) -> f a -> f b class Monad m where (>>=) :: m a -> (a -> m b) -> m b (>>) :: m a -> m b -> m b return :: a -> m a fail :: String -> m a -- Minimal complete definition: -- (>>=), return m >> k = m >>= _ -> k fail s = error s
The declaration of monad is connected with the definition of functor – if you look, you can see the connection. The fundamental operation of
Monad is “
>>=” – the chaining operation, which is type
m a -> (a -> m b) -> m b is deeply connected with the
fmap operation from
fmap operation – the
m a is generally going to be a type which can be a
So the value type wrapped in the monad is a functor – in fact, the functor from the category definition! And the “
>>=” operation is just the functor composition operation from the monad definition.
A proper implementation of a monad needs to follow some fundamental rules – the rules are, basically, just Haskell translations of the structure-preserving rules about functors and natural transformations in the category-theoretic monad. There are two groups of laws – laws about the
Functor class, which should hold for the transition function wrapped in the monad class; and laws about the monadic operations in the
Monad class. One important thing to realize about the functor and monad laws is that they are not enforced – in fact, cannot be enforced! – but monad-based code using monad implementations that do not follow them may not work correctly. (A compile-time method for correctly verifying the enforcement of these rules can be shown to be equivalent to the halting problem.)
There are two simple laws for
Functor, and it’s pretty obvious why they’re fundamentally just strucure-preservation requirements. The functor class only has one operation, called
fmap, and the two functor laws are about how it must behave.
fmap id = id
(Mapping ID over any structured sequence results in an unmodified sequence)
fmap (f . g) = (fmap f) . (fmap g)
(“.” is the function composition operation; this just says that fmap preserves the structure to ensure that that mapping is associative with composition.)
The monad laws are a bit harder, but not much. The mainly govern how monadic operations interact with non-monadic operations in terms of the “
return” and “
>>=” operations of the
return x >>= f = f x
(injecting a value into the monad is basically the same as passing it as a parameter down the chain – return is really just the identity functor passing its result on to the next step.)
f >>= return = f
(If you don’t specify a value for a return, it’s the same as just returning the result of the previous step in the sequence – again, return is just identity, so passing something into return shouldn’t affect it.)
seq >>= return . f = fmap f seq
(composing return with a function is equivalent to invoking that function on the result of the monad sequence to that point, and wrapping the result in the monad – in other words, it’s just composition with identity.)
seq >>= (x -> f x >>= g) = (seq >>= f) >>= g
(Your implementation of “
>>=” needs to be semantically equivalent to the usual translation; that is, it must behave like a functor composition.)