Yet Another Monad Tutorial (part 2: >>= and return): mvanier

mvanier

Yet Another Monad Tutorial (part 2: >>= and return)

Jul 25, 2010 19:08

In the previous article I gave the conceptual background necessary to understand what monads are. Now I'm going to get into more of the details.
The two fundamental monadic operations

Remember when I said above that monads generalize function composition and function application? We'll work through that here. Have patience: it'll take a while.

By this point, I hope you have at least a vague sense of what monads "are" and what they are used for. However, as I said before, one of the keys to functional programming is the ability to compose functions to create new functions. Functional programmers talk about "composability" all the time, with the implication that if some aspect of a programming language isn't composable, it's probably not worth much. So if our newly-minted monadic functions were not composable, they wouldn't be nearly as useful as they would be if they were. But as we'll see, they aren't composable using the standard Haskell function composition operator. Something more will be needed, and this will lead us to derive the two fundamental monadic operations (or at least their types).

Let's say that we have two monadic functions:

f :: a -> m b g :: b -> m c
for some monad m. If you want a more specific example, you can imagine that f and g are in the IO monad, so we'd have

f :: a -> IO b g :: b -> IO c
but the same argument will apply for all monads. Remember that (for the IO case) the function f takes a value of type a and outputs a value of type b, possibly doing some (file or terminal) I/O along the way. Similarly, g takes a value of type b and outputs a value of type c, possibly doing some I/O along the way. Therefore, if we wanted to compose them, we'd hopefully end up with a function like this:

h :: a -> IO c
i.e. a function that takes a value of type a, outputs a value of type c, and possibly does some I/O along the way (with the I/O somehow being the combination of the I/O activity for functions f and g). We can write this out as follows:

compose: (f :: a -> IO b) with: (g :: b -> IO c) to get: (h :: a -> IO c)
However, our normal Haskell function composition operators won't work for this purpose, because they don't want the IO in the types. Let's compare with similarly-typed pure functions p, q, and r that don't do I/O:

p :: a -> b q :: b -> c r :: a -> c
Then we could compose them using either the (.) or the (>.>) operator as described above:

(.) :: (b -> c) -> (a -> b) -> (a -> c) r = q . p (>.>) :: (a -> b) -> (b -> c) -> (a -> c) r = p >.> q
Neither (.) or (>.>) will work with our monadic functions:

f :: a -> IO b g :: b -> IO c h :: a -> IO c g . f --> type error! mismatch between IO b and b f >.> g --> type error! mismatch between IO b and b
The point is, you can't use a monadic value of type IO b when a type of b is needed. (This is a very common bug when writing monadic Haskell programs.) What we want is a special monadic composition function which I'll call mcompose (standing for "monadic compose") which has the following type signature:

mcompose :: (a -> m b) -> (b -> m c) -> (a -> m c)
This will work for any monad m, including the IO monad. Specialized to the IO monad, it will have the following type signature:

mcompose :: (a -> IO b) -> (b -> IO c) -> (a -> IO c)
Then we could use it like this:

f :: a -> IO b g :: b -> IO c h :: a -> IO c h = f `mcompose` g
and h would have the correct type signature. (We're using a spiffy syntactic feature of Haskell here, whereby any two-argument function can be turned into an infix operator by putting backquotes around it. Remember, operators in Haskell are just functions which happen to be placed between their operands.) Somehow, through (currently) mysterious means, the mcompose function (or operator, if you like) is able to

take the original input value of type a
apply f to it (this is just normal function application) to get a result of type IO b
take the value of type IO b output from f and extract the value of type b (this is what we couldn't do before)
take the value of type b and apply g to it (again, this is just normal function application) to get the value of type IO c, which is the result.

The only thing we can't already do is step (3), extracting a value of type b from a value of type IO b. Now, we could do this if we had a function called extract with this type:

extract :: IO b -> b
or more generally for arbitrary monads,

extract :: m b -> b
It turns out that such a function, if it existed, would destroy all the advantages of monads and pure functional programming! One of the reasons we wanted monads in the first place was to keep these special notions of computation (monadic functions) separate from normal (pure) functions, because otherwise there would be no way to guarantee that pure functions were in fact pure. This is an important point, so I'm going to spend a little bit of time on it, after which we'll return to monadic composition.

Side note: In fact, some monads do have the equivalent of an extract function, and for most of those monads it doesn't cause problems. All I'm saying is that a generic extract function that works for all monads is not allowed.

What we would like is to ensure that functions that have non-monadic type signatures are pure functions. Now, in a sense, even our monadic functions are pure functions, because they are implemented in Haskell as pure functions that return monadic values. But what we want to guarantee is that non-monadic (pure) functions don't even do that i.e. don't even return monadic values. If that's the case, they are certainly going to be pure functions. So a pure function hh of type

hh :: a -> c
should never do (file or terminal) input/output, for instance, because if it did it would be required by the type system to have the type

hh :: a -> IO c
instead. Guarantees like this, enforced by the type system, are one of the major strengths of Haskell. They allow us to glance at the type of a function and be 100% sure that that function doesn't do input/output, for instance.

However, if we had the extract function, we could compose the supposedly pure function hh it out of monadic functions that do I/O like this:

ff :: a -> IO b gg :: b -> c hh = ff >.> extract >.> gg -- or equivalently: hh = gg . extract . ff
So even though hh is never supposed to be doing I/O, if there was an extract function then you could build an hh function using normal function composition, it would have the type signature of a pure function, and yet it would do I/O. So much for separating I/O (and other monadic computations) from pure computations (recall that this was one of the main reasons for wanting monads in the first place). Note, by the way, that this is exactly the situation in most conventional programming languages, which is why the type systems of those languages can offer no guarantees that a function is pure. In Haskell we like pure functions and we use the type system to give us guarantees that pure functions are actually pure - and that means no extract function.

There's one slight problem with what I just said: technically, it's a lie. There is a function called unsafePerformIO that has the type IO a -> a i.e. it's an extract function for the IO monad only. The word "unsafe" is a clue that tells you that you should avoid using it unless you know exactly what you're doing and are prepared for weird failures. I myself have never needed to use unsafePerformIO, but there are legitimate uses for it (for instance, deep down in the implementation of Haskell compilers). Just forget I even brought this up, OK? It's embarrassing. Excuse me while I go wash my hands.

OK, I'm back. So far, we've established that (a) we want to be able to compose monadic functions, (b) we can't do that with normal function composition in Haskell because we can't convert monadic types into regular types, and (c) we can't define an extract function to do that conversion, because that would screw up the purity of the rest of the language. So what do we do?

Well, first of all, note that we can get by with something simpler than an mcompose function. Let's say we had an mapply (monadic apply) function that had this type signature:

mapply :: m b -> (b -> m c) -> m c
or, more specifically for the IO monad:

mapply :: IO b -> (b -> IO c) -> IO c
It's called mapply because it's very similar to the regular function application operators. For instance, recall the >$> operator we defined previously, which had this type signature (using b and c instead of a and b for type variables):

(>$>) :: b -> (b -> c) -> c
This is the same as mapply except that the ms are gone (the types are not monadic types). With mapply, we could trivially define mcompose as follows:

mcompose :: (a -> m b) -> (b -> m c) -> (a -> m c) mcompose f g x = (f x) `mapply` g -- or: mapply (f x) g
Note that since the -> associates to the right in type signatures, the type signature of mcompose can be written without the final set of parentheses as:

mcompose :: (a -> m b) -> (b -> m c) -> a -> m c mcompose f g x = (f x) `mapply` g
This may be easier to understand than the previous version, but they are equivalent. Note that x has type a and the result has type m c. So what we're doing here is applying f to x to get a value of type m b, and using mapply on the m b value and the g function to get a value of type m c. So the upshot is, we don't need mcompose to be defined for us if we have mapply, because we can use mapply to define mcompose ourselves. And, in fact, mapply is one of the two fundamental monadic operations. It's normally called "bind" and is written as an infix operator with the symbol >>= as follows:

(>>=) :: m a -> (a -> m b) -> m b
Note that I did a switch in the type signature, using a in place of b and b in place of c. It doesn't matter since a, b, and c are type variables - they work for any types.

I'd just like to point out here that >>= has an incredibly abstract type. Its first argument is a value of type m a, where a can be any type at all and m is any monadic type constructor whatsoever. The second argument is a function of type a -> m b, where a and b can be any types at all and m is again any monadic type constructor. The return value has type m b, where again b can be any type and m is any monadic type constructor. When you program in Haskell for long enough, this kind of type signature becomes second nature, but it can be intimidating to new Haskell programmers. If you specialize it to the IO monad, you get:

(>>=) :: IO a -> (a -> IO b) -> IO b
which, of course, is the type signature of an IO-specific monadic apply operator. We'll see below that Haskell's type class mechanism allows us to use the same operator name >>= for all the different specializations of this operator to different monads (how cool is that?).

Assuming we have the >>= operator, we can now compose f and g to get h as follows:

-- assume we have: f :: a -> m b g :: b -> m c -- definition of h: h :: a -> m c h x = f x >>= g
We can also write h directly as:

h = \x -> f x >>= g
where the \x -> ... is, as I mentioned above, Haskell's notation for an anonymous function (in this case with a single argument x); both versions of h mean the same thing. Using mcompose we can write this as:

h = f `mcompose` g = mcompose f g = \x -> (f x >>= g)
Our definition of mcompose is thus just:

mcompose f g = \x -> (f x >>= g)
and in fact, Haskell has a standard operator for monadic composition called >=>:

f >=> g = \x -> (f x >>= g) -- same as (f `mcompose` g) but more concise
So, assuming we have this monadic apply operator >>=, we can easily define the monadic composition operator >=>. So the monadic apply operator (the bind operator) is the important concept here. As we'll see, each individual monad has to define its own specific version of this operator, which will be different from every other monad's version. That's where Haskell's type classes will come in very handy. Incidentally, in the ghc Haskell compiler, the >=> operator is defined in the Control.Monad module.

Now remember that we could write the normal apply operator in two ways:

($) :: (a -> b) -> a -> b
and

(>$>) :: a -> (a -> b) -> b
depending on what order we wanted the arguments to be in. (Of course, it's also fine to define both operators and use whichever one is most convenient in any given situation.) Similarly, we can write the monadic apply operator in two ways. The first way is as the bind operator >>= with type

(>>=) :: m a -> (a -> m b) -> m b
which is analogous to the non-monadic >$> apply operator. We can also trivially define a monadic apply operator that takes its operands in the reverse order:

(=<<) :: (a -> m b) -> m a -> m b f =<< x = x >>= f
You can also use the flip function, which takes a function of two arguments and returns a new function which is the same as the old one except that it takes the arguments in the reverse order:

flip :: (a -> b -> c) -> (b -> a -> c) flip f = \x y -> f y x
Then we can define =<< as follows:

(=<<) = flip (>>=)
You get extra points for functional coolness if you write concise definitions like this.

Similarly again, we can define a monadic composition operator that takes its operands in the reverse order:

(>=>) :: (a -> m b) -> (b -> m c) -> (a -> m c) -- already defined (<=<) :: (b -> m c) -> (a -> m b) -> (a -> m c) (<=<) = flip (>=>)
So just as was the case for the regular (non-monadic) apply and compose operators, we can define monadic apply and compose operators which take their operands in whichever order we want. In practice, though, the monadic operator Haskell programmers seem to use the most is the >>= operator (or at least, it's the one I use the most).

If you've understood everything so far, congratulations! It's all downhill from here. Or so I hope.

There is one more fundamental monadic operation I need to talk about. To motivate it, consider this scenario. You want to compose a monadic function with a non-monadic function. In other words, you have these functions

f :: a -> m b -- monadic g :: b -> c -- non-monadic
The problem is this: you can't use regular function composition to compose f and g, because m b is not the same type as b. And you can't use monadic function composition either, because g doesn't have the type b -> m c, which is what monadic composition would require. So what can you do?

If we had the extract function I described above, you could compose the two functions the way I showed there:

h :: a -> c h = f >.> extract >.> g
but as I mentioned, you're not allowed to do this. In other words, you're not allowed to compose a monadic function with a non-monadic function to get a non-monadic function (because that would screw up the functional purity of the language). What you are allowed to do is to compose a monadic function with a non-monadic function to get a monadic function, which would work like this:

h :: a -> m c h = f [somehow composed with] g
Now, we know we can't use our monadic composition operator for this, because g doesn't have the right type (which would be b -> m c). But we could use the monadic composition operator if we had some way to convert a non-monadic function to a monadic one. In other words, if we had a function functionToMonadicFunction with this type signature:

functionToMonadicFunction :: (b -> c) -> (b -> m c)
then we could define h as:

h :: a -> m c h = f >=> (functionToMonadicFunction g)
It turns out that all you need in order to define functionToMonadicFunction is an even simpler thing, which is a monadic function with the (possibly confusing) name of return. It has the following type signature:

return :: a -> m a
for any type a and any monadic type constructor m. What return does is convert a regular value into the corresponding monadic value for a given monad m. We'll see the specifics of this below.

If you have return, then functionToMonadicFunction can trivially be defined as:

functionToMonadicFunction :: (a -> b) -> (a -> m b) functionToMonadicFunction f = \x -> return (f x)
or, if I wanted to be cool and use function composition, as:

functionToMonadicFunction :: (a -> b) -> (a -> m b) functionToMonadicFunction f = return . f
or even as:

functionToMonadicFunction :: (a -> b) -> (a -> m b) functionToMonadicFunction = (return .)
using a cool syntactic feature of Haskell called operator sections. All three functions are equivalent.

Note that I once again switched b for a and c for b in the type signature of functionToMonadicFunction; again, it doesn't matter. The point is, with this return function, we can now compose monadic functions with non-monadic ones to create new monadic functions. And return is the second fundamental monadic operation.

Side note: If you've done a lot of imperative programming, you will probably find the name return more than a little annoying at first. Just remember that it is not a keyword in Haskell and it has nothing to do with returning from a function. So try to keep those ideas out of your head when dealing with return.

The name "return" actually comes from the characterization of monadic values as "actions"; seen in that light, the function return takes a value and produces a monadic value which is an "action" which "does something" and then "returns" the original value unchanged. Note also that return is in fact a monadic function. Putting these two ideas together, you can see (or at least guess) that return is the monadic version of the identity function (the function that maps a value to itself, i.e. \x -> x). We'll see this again later when I talk about monad laws.

Let's put return to work composing our monadic function f with the non-monadic function g to get the monadic function h. Here's the definition:

h = f >=> (return . g)
because, as we saw above, return . g will convert g into a monadic function.

After all this, you might wonder how many more monadic operations we're going to have to plow through before we're done defining them all. As Professor Farnsworth would say: Good news, everybody! There are only two! There are also a couple of non-critical operations that we will eventually want to define for convenience, but >>= and return are the only ones that absolutely have to be there.

There is one rather peculiar aspect of return. We say that return has the type a -> m a, but when we say e.g. return 10, what is the type of the output? It could have the type IO Int, or Maybe Int or some other monadic type involving Int. How do we know which of the many possibilities is the correct one? Note, by the way, that the monadic value of type IO Int is a completely different value than the monadic value of type Maybe Int, so it's not just about getting the right type - it's not even obvious what kind of value return 10 is!

In Haskell, this is worked out by the context in which return 10 is found. The type checker has to make sure that all the functions get the right type of input arguments, and so if return 10 is the input to a function expecting a value of type IO Int, the type checker will decide that return 10 has the type IO Int (and similarly for other monads). Put differently, the value computed by return 10 depends on the type it has to have according to its context. If you want to, you can annotate return 10 with the type you want it to have by writing (return 10 :: IO Int), for instance, but this is rarely necessary.

To recap this section:

There are two fundamental monadic operations, called "bind" (the >>= operator) and return.
The bind (>>=) operator is a monadic apply operator. It can be used to define a monadic composition operator, which is written >=>.
The return operator transforms regular values into monadic values. It can be used to define a function to convert regular functions into monadic functions.

What do monadic application and composition mean?

By now, you should have a reasonable understanding of the mechanics of monadic composition and monadic application, but that's not the same thing as understanding what they mean intuitively. So let's expand on this a bit.

I said above that we can't define an extract function which takes a monadic value and returns a regular value. However, if we want to compose two monadic functions to get a third using the monadic composition operator, we need some way to extract values from monadic values.

(>=>) :: (a -> m b) -> (b -> m c) -> (a -> m c) f >=> g = {- whatever -}
Here we see that the function f takes in a value of type a and returns a monadic value of type m b, whereas g takes in a value of type b and returns a value of type m c. So it certainly looks like some "unpacking" of the monadic value into a regular value must be happening for this to be possible. As we showed above, we can define the monadic composition operator >=> in terms of the monadic apply operator >>=, so we can just look at >>=.

(>>=) :: m b -> (b -> m c) -> m c mv >>= g
where mv refers to some monadic value of type m b. But again, without an extract function, how do we get the value of type b out of the value of type m b so we can pass it to the function g?

The answer is different for every monad. Every monad has its own way of "unpacking" a value from a monadic value and passing it on to a monadic function. Put differently, every monad has to define its own monadic apply operator >>=, and the details of how a monadic value gets unpacked and passed to a monadic function are handled in the definition of that operator. Similarly, every monad has to define its own return function.

From now on, though, when dealing with arbitrary monads, I'll use the following terminology. The >>= operator takes as its input a monadic value (also known as an "action"), "unpacks" a regular (non-monadic) value from it (the details of which are specific to the particular monad), and passes that value as the input to the monadic function, which produces a monadic value ("action") as its final result.

Now that we've got that straight, let's talk about the Monad type class. Later, we will go over the definition of the >>= operator for several different monads, and you'll see for yourself how the unpacking is done in those cases.
The Monad type class

I've mentioned above that >>= is "the" monadic apply operator, and return is "the" function which converts a value into (any kind of) monadic value. But this is sloppy terminology, since, as I've also mentioned, every monad has to define its own version of these operators/functions, and these are completely different operators/functions for each monad. On the other hand, it would be a pain to have to use completely different names for every version of >>= and return; we'd need to have nasty stuff like:

IO>>= :: IO a -> (a -> IO b) -> IO b IOreturn :: a -> IO a Maybe>>= :: Maybe a -> (a -> Maybe b) -> Maybe b Maybereturn :: a -> Maybe a
and these function names aren't even legal Haskell syntax, because you can't mix symbolic and non-symbolic characters in identifiers and function names can't start with capital letters.

Haskell handles this kind of problem elegantly by using what is called a "type class". (Remember, I said that this tutorial would be easier to understand if you already knew about type classes. If you don't, be aware that this has absolutely nothing, zero, zilch to do with the notion of classes in object-oriented programming; a Haskell class is more like a compile-time interface.) A type class is a way of saying that a bunch of different types have different definitions for the same named functions (or operators) which have particular types. So, for instance, the Eq type class defines an operator called == which has the type a -> a -> Bool (where each a represents the same type). This says that for type a to be in type class Eq, it has to define what == (equality comparison) means for that type. The numeric types Int and Float are both in type class Eq (they are instances of type class Eq), and thus they have to define what == means for those types. We write this as follows in Haskell:

class Eq a where (==) :: a -> a -> Bool instance Eq Int where (==) = intEquals -- intEquals has type (Int -> Int -> Bool) instance Eq Float where (==) = floatEquals -- floatEquals has type (Float -> Float -> Bool)
assuming that intEquals is the equality function for Ints, and floatEquals is the equality function for Floats. (There is also an inequality operator I've left out, but the idea is the same.) What this does is allow us to use the same operator == when comparing Ints, or Floats, or any type which is an instance of the type class Eq. This is a great notational convenience. Note, though, that == requires that both its arguments have the same types, so it won't allow you to compare e.g. an Int and a Float. (Nor should you want to!)

Now, I mentioned above that all monads in Haskell are type constructors, and we've seen that all monads have to provide independent definitions for the >>= operator and the return function. Putting this all together, we find that in Haskell, there is a type class called (you guessed it) Monad which looks basically like this:

class Monad m where (>>=) :: m a -> (a -> m b) -> m b return :: a -> m a
This is actually a little more complicated than the Eq type class. There are two functions/operators, but that's no big deal, since we already knew that we'd have to define two functions for every monad. The types of the two functions/operators are familiar from the discussion above.

The weird part is that Monad isn't a type class the way Eq is. Monad is what's called a "constructor class" which means that its instances (called m here) are not types but type constructors - and we saw above that all monads have to be type constructors, so this has to be the case. To define an instance of a constructor class, we'd write this (using Maybe as our example instance):

instance Monad Maybe where (>>=) = {- the Maybe version of (>>=) -} return = {- the Maybe version of return -}
The fact that the same notation is used for regular classes and constructor classes can be a bit confusing, but it works out well enough in practice. Perhaps Haskell would be more precise if it used this notation:

constructorClass Monad m where (>>=) :: m a -> (a -> m b) -> m b return :: a -> m a constructorInstance Monad Maybe where (>>=) = {- the Maybe version of (>>=) -} return = {- the Maybe version of return -}
but that's pretty verbose and you can usually tell what's going on from context anyway.

Now let's look at a simple example and then continue talking about the Monad type class.
An example

One of the simplest examples of a function in the IO monad is a function which reads a line from the terminal and prints it right back out (with a newline at the end). The natural way to write this is to use the getLine and putStrLn functions. As we saw above, they have the types:

getLine :: IO String putStrLn :: String -> IO ()
We can't use the monadic composition operator >=> to combine these, because getLine is a monadic value, not a monadic function. But we can use the monadic apply operator >>= as follows:

readAndPrintLine = getLine >>= putStrLn
All we are doing here is (monadically) applying the putStrLn monadic function to the getLine monadic value. Using the "action" terminology we discussed above, we can think of this as follows: getLine is an "action" which reads a line of text input to the terminal and "returns" it as a monadic value, and the >>= operator "unpacks" that line of text from the monadic value, passing it to putStrLn, which prints it back to the terminal and returns nothing (actually it returns the meaningless () value as a monadic value).

We could have written this out more explicitly as follows:

readAndPrintLine = getLine >>= (\s -> putStrLn s)
Notice that (\s -> putStrLn s) is exactly the same function as putStrLn, in the same way that (\x -> cos x) is the same function as just cos. So we haven't changed anything important here. But writing it this way makes it clearer that something (a text string) is somehow being returned from getLine, and what the >>= operator is doing is taking that string (called s here) and printing it.

We'll expand on this simple example with some variations once we've covered some more features of the Monad type class, which we will do in the next article of this series.