# A Monad in OCaml

It's an old tradition that any programmer who thinks they know something useful about monads eventually succumbs to the temptation to go off and write a blog post about their revelations . . .

_Anyway_ . . .

Let's take a look at a `Monad` definition in OCaml and walk through the clues that suggest a monad's implementation.

In OCaml, abstract structures such as monads are typically best represented using [modules](https://www.pathsensitive.com/2023/03/modules-matter-most-for-masses.html). A module is essentially a record, containing types and terms, along with a manifest or interface that allows a programmer to selectively expose information about that module to the outside world and, dually, to selectively depend on particular characteristics of other modules. Modules provide programmers the machinery of composition and reuse and are the primary mechanism by which OCaml code is structured, neatly capturing the notion of a program abstraction boundary.

```ocaml
module type Monad = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : ('a -> 'b t) -> 'a t -> 'b t
end
```

Above is a module _signature_. Signatures relate to modules in much the same way that types relate to values (hence `module type` in the syntax): each one defines the set of all possible modules that comply with the structure it describes. In this case, we give the name "`Monad`" to the set of modules exposing _at least_ a type constructor `'a t`[^alpha], a function `return : 'a -> 'a t`, and a function `bind : ('a -> 'b t) -> 'a t -> 'b t`. Abstractly, these three items together (along with the monad laws they are expected to satisfy, which a signature cannot express) are what constitute a monad.

It's helpful to think about what each item means in general before examining them in more concrete terms.

`t` is a function from types to types, also known as a type constructor. `list` and `option` are both examples of type constructors. The presence of `t` in our `Monad` signature, and specifically the fact that it's parametric (i.e. `'a t` rather than just `t`), represents the idea that a monad is essentially a _context_ around underlying computations of an abstract type. For some particular `'a` and some particular module that fits the `Monad` signature above, `'a` is the type of the underlying computation; that is, `t` is the generic context itself, and `'a t` is an instance of that generic context which is specific to the inner type `'a`; `'a t` is the type of alphas in the `t` sort of context.

Hopefully, one of that multifarious bundle of phrasings made at least a little bit of sense. What exactly is meant by "context" is the key to this whole endeavor, but I'm going to avoid addressing it directly until we're a little further along.

For now, let's consider `return`. If `t` is the generic context, then `return` is the function that makes it specific, or "specializes" it, to the type `'a` of some particular value `x : 'a`. This function takes an object[^object] of the base type `'a` and puts it into the context of `t`. The specialized context of the resulting `'a t` value will be in some sense basic, empty, default, null; it is the starting-point context that exists just to have `x` in it, so that computations involving `x` can take place in that sort of context later on.

```ocaml
module ListMonad = struct
  type 'a t = 'a list
  let return : 'a -> 'a t = fun x -> [x]
  (* . . . *)
end
```

Since `t` here is `list`, `return` is the function that takes an argument and sticks it into a list, i.e. `fun x -> [x]`. As you might guess, `list` forms a monad when equipped with suitable definitions of `return` and `bind` (the latter of which is omitted for now). The meaning of `list` as a monad (that is, the context that `list` and its natural accompanying definitions of `bind` and `return` represent) is interesting, broadly useful, and sufficiently non-obvious as to demand some intuition, so I'll use it as a running example.
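To make the "starting-point context" a bit more tangible, here is a minimal sketch that puts a few values into the list context. The name `list_return` is made up for this illustration; it simply mirrors `ListMonad.return` so the snippet can run on its own.

```ocaml
(* Hypothetical standalone copy of ListMonad.return, for illustration only. *)
let list_return : 'a -> 'a list = fun x -> [x]

(* The same function specializes the list context to whatever type it is given. *)
let one_int : int list = list_return 42
let one_string : string list = list_return "hello"

(* The resulting context is as plain as it can be: exactly one element. *)
let () =
  assert (one_int = [42]);
  assert (one_string = ["hello"]);
  assert (List.length (list_return 3.14) = 1)
```

Whatever the type of `x`, the result is a one-element list: the value is in the list context, and nothing else is.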
In its most natural interpretation, `list` represents, or simulates[^physical], the property of [nondeterminism](https://en.wikipedia.org/wiki/Nondeterministic_Turing_machine), which is characteristic of a computational model in which all possible paths are taken _simultaneously_. A value of type `'a list` thus represents all possible results of a particular computation of type `'a`, with each result being a list element. Considered in this light, `[x]` is a value where only one path is taken, i.e. where no branches in execution are encountered.

Examining the code above, notice how the implementation of `return` inherently gives rise to the "no branches" notion of the empty context, which is embedded in it by definition. That notion, that the null context means there are no branches, is specific to nondeterminism, and `return` is what encodes it into the formal structure of the `ListMonad` module.

Finally, we move on to `bind`. `bind` is the driving force of monads; it performs the heavy lifting that makes them a useful tool for structuring algorithms. An implementation of `bind` is what captures the meaning of a particular sort of context and contextual data by encoding it into a `Monad` instance. Thus, it is `bind` _abstractly_, as it appears in the definition of the `Monad` signature, that captures what is meant by "context" in general.

A context should thus be thought of as some computation that is driven by, and given its basic structure by, the underlying computation in `'a`. In other words, every time a program manipulates an `'a t`, some additional, implicit action[^action-std] is carried out alongside (or possibly modifying) that direct interaction with the context or its underlying data. This implicit action is embedded in the implementation of `bind`, and thus it is the `bind` function for a type constructor that fundamentally determines what the context in question is, what that context _means_ informally, and how it behaves.
```ocaml
module ListMonad = struct
  type 'a t = 'a list

  (* The "no branches" context: a single possible result. *)
  let return x = [x]

  (* Apply f to every possible result and concatenate all the outcomes. *)
  let rec bind (f : 'a -> 'b list) (xs : 'a list) : 'b list =
    match xs with
    | [] -> []
    | x :: xs' -> f x @ bind f xs'
end
```

Here is the completed implementation of the module `ListMonad`, in which we have implemented the `list` monad. A good way to think about what any implementation of `bind` is doing at a high level is that it

1. extracts the value of the underlying type `'a` from `xs`,
1. transforms it _via_ `f`, producing `'b` along with new context, and
1. uses that new context, along with the original context of `xs`, to determine the final context of the returned `'b t`.

The "value of the underlying type" may be literally a single value of type `'a`, but it needn't be. In the body of `ListMonad.bind` above, we are actually extracting a whole list's worth of alphas, applying `f` to them as we recurse over the list structure; these constitute the "underlying value" of the `'a list` `xs`.

So, how do those ideas play out for lists? If `xs` is empty, `bind` returns the empty list; that's an obvious way to handle it. Otherwise, we have a recursive case. The list is _not_ empty here, so we can safely

1. take the first element `x` and the remaining elements `xs'`,
1. apply `f` to `x` to obtain a new `'b list`, and
1. append the result of `f x` to a recursive call `bind f xs'`.

We know that the `bind` function returns a `'b list`, so we're appending the `'b list` `f x` to the `'b list` `bind f xs'`, thus obtaining the `'b list` that we return to the caller.

Pay careful attention to the parallels here. You may think we didn't use the context of the original `xs`, but we did! We recursed over the context, in fact; it determined the call structure of `bind`.
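To see the list context doing its work, here is a small usage sketch. The module is repeated so the snippet runs on its own; the `flip`, `two_flips`, and `evens_only` names are invented for this example and are not part of the module.

```ocaml
(* Copy of the ListMonad module from above so this snippet is self-contained. *)
module ListMonad = struct
  type 'a t = 'a list
  let return x = [x]
  let rec bind f xs =
    match xs with
    | [] -> []
    | x :: xs' -> f x @ bind f xs'
end

(* Hypothetical example: a coin flip has two possible outcomes. *)
let flip : string list = ["H"; "T"]

(* Flip twice: for every first outcome, explore every second outcome. *)
let two_flips : string list =
  ListMonad.bind
    (fun a -> ListMonad.bind (fun b -> ListMonad.return (a ^ b)) flip)
    flip

(* Returning [] from f abandons that path entirely. *)
let evens_only : int list =
  ListMonad.bind (fun x -> if x mod 2 = 0 then [x] else []) [1; 2; 3; 4]

let () =
  assert (two_flips = ["HH"; "HT"; "TH"; "TT"]);
  assert (evens_only = [2; 4])
```

Each call to `bind` fans out over every element of its list argument, which is the "all paths at once" reading of nondeterminism. A path whose `f x` is `[]` contributes nothing to the result.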
***

[^physical]: Of course, we say that `list` _simulates_ nondeterminism for the same reason that we say physical computers simulate Turing machines: both are constrained by the resource limitations of physical reality and thus less capable than the theoretical devices they seem to emulate.

[^alpha]: Pronounced "alpha tee".

[^object]: "Object" in the general sense; nothing to do with object-orientation or kin.

TODO: be explicit about how monads exist independently and we are _capturing_ them in the particular language of OCaml. `list` forms a monad whether we actually implement that monad or not.

[^action-std]: This gives rise to a standard term. See [Monads as Computations](https://wiki.haskell.org/Monads_as_computation).

So the list monad allows us to write nondeterministic code in much the same style as we would write fundamentally simpler deterministic code (aside: this stratification is not theoretically needed, re 1ML).
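One way to illustrate that last remark is with OCaml's binding operators (available since OCaml 4.08). This is only a sketch: the `( let* )` definition, the `die` value, and the dice puzzle are assumptions added for illustration, and `bind` and `return` are repeated from `ListMonad` so the snippet stands alone.

```ocaml
(* Self-contained copies of the list bind and return shown earlier. *)
let rec bind f xs =
  match xs with
  | [] -> []
  | x :: xs' -> f x @ bind f xs'

let return x = [x]

(* Binding operator; note that it flips bind's argument order. *)
let ( let* ) m f = bind f m

(* Hypothetical puzzle: every ordered pair of die rolls that sums to 4. *)
let die = [1; 2; 3; 4; 5; 6]

let pairs_summing_to_4 =
  let* a = die in
  let* b = die in
  if a + b = 4 then return (a, b) else []

let () = assert (pairs_summing_to_4 = [(1, 3); (2, 2); (3, 1)])
```

The body reads like ordinary code that picks one `a` and one `b`, yet it is run for every combination, because each `let*` is a call to `bind` over the list on its right-hand side.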