To get started, what is category theory?

Back in grad school, I spent some time working with a thoroughly insane guy named John Case who was the new department chair. When he came to the university, he brought a couple of people with him, to take temporary positions. One of them was a category theorist whose name I have unfortunately forgotten. That was the first I’d ever heard of cat theory. So I asked John what the heck this category theory stuff was. His response was “abstract nonsense”. I was astonished; a guy as wacky and out of touch with reality as John called something abstract nonsense? It turned out to be a classic quotation, attributed to one of the founders of category theory, Norman Steenrod. It’s not an entirely bad description.

Category theory is one of those fields of mathematics that fascinates me: you take some fundamental concept and abstract it down to its bare essentials, in order to understand just what it really is and what it really means. Just as group theory takes the idea of an algebraic operation, strips it down to the bare minimum, and discovers the meaning of symmetry, category theory takes the concept of a function as a mapping from one thing to another, strips it down to the bare minimum, and sees what can be discovered.

The fundamental thing in category theory is an *arrow*, also called a *morphism*. A morphism is an abstraction of the concept of *homomorphism*, which I talked about a bit when I was writing about group theory. We take that concept of a function mapping from one set of values to another, and strip it down: the basic concept is something that provides a map from one thing to some other thing – and that’s all we want.

In classic style, we’ll define a *category* C as a tuple `(O, M, º)`, where:

- O, or Obj(C), is a set of *objects*. Objects are *anything*; we don’t care. Anything that we can define maps over.
- M, or Mor(C), is a set of arrows or morphisms. Every arrow has a unique *source* and a unique *target*, both of which are objects in O. Given two objects a and b in O, we also write Mor(a,b) for the set of morphisms from a to b. To talk about a specific morphism from a to b, we write it as “name : a → b”; as in “f : int → int”.
- º is a binary operation that is the abstraction of function composition; º is a mapping from (Mor(a,b), Mor(b,c)) to Mor(a,c). It’s got the basic properties of function composition:
  - Associativity: (∀ f : a → b, g : b → c, h : c → d) h º (g º f) = (h º g) º f
  - Identity: (∀ a, b ∈ Obj(C)) (∃ 1_{a}, 1_{b} ∈ Mor(C)) (∀ f : a → b) 1_{b} º f = f = f º 1_{a}.
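To make the definition a bit more concrete, here is a minimal sketch of how it might be written down in OCaml. The signature name `CATEGORY`, the module `Fn`, and the sample arrows `f`, `g`, and `h` are all made up for this illustration; objects are played by OCaml types, and arrows by values of type `('a, 'b) arrow`. The two laws can't be stated in the signature itself, so the sketch only spot-checks them at one input, which is not a proof.

```ocaml
(* A hypothetical signature for a category whose objects are OCaml types
   and whose arrows are values of type ('a, 'b) arrow. *)
module type CATEGORY = sig
  type ('a, 'b) arrow
  (* One identity arrow per object (here: per type). *)
  val id : ('a, 'a) arrow
  (* compose g f is "g º f": first f, then g. *)
  val compose : ('b, 'c) arrow -> ('a, 'b) arrow -> ('a, 'c) arrow
end

(* Ordinary OCaml functions form one instance of the signature. *)
module Fn : CATEGORY with type ('a, 'b) arrow = 'a -> 'b = struct
  type ('a, 'b) arrow = 'a -> 'b
  let id x = x
  let compose g f x = g (f x)
end

(* Three sample arrows: int -> int -> string -> int. *)
let f : (int, int) Fn.arrow = fun n -> n + 1
let g : (int, string) Fn.arrow = string_of_int
let h : (string, int) Fn.arrow = String.length

(* Associativity, checked at a single input. *)
let assoc_ok =
  Fn.compose h (Fn.compose g f) 41 = Fn.compose (Fn.compose h g) f 41

(* Identity on both sides, checked at a single input. *)
let ident_ok =
  Fn.compose Fn.id f 41 = f 41 && Fn.compose f Fn.id 41 = f 41
```

Any other module that matches `CATEGORY` and satisfies the two laws would do just as well; plain functions are simply the most familiar instance.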

One neat little trick to simplify things is that we can actually throw away Obj(C), and replace it with the set of identity morphisms: since there is exactly one identity morphism per object, there’s no real need to distinguish between the identity morphism and the object. It’s a nice trick because it means that we have nothing but morphisms all the way down; but it’s really just a trick. We can talk about Obj(C); or Id(C); but we still need to be able to talk about the objects in some way, whether they’re just a subset of the morphisms, or something distinct.

Now, we get to something about category theory that I really don’t like. Category theory is front-loaded with rather a lot of definitions about different kinds of morphisms and different kinds of categories. The problem with that is that these definitions are very important, but we don’t have enough of the theory under our belts to be able to get much of a sense for why we should care about them, or what their intuitive meaning is. But that’s the way it goes sometimes; you have to start somewhere. It *will* make sense eventually, and you’ll see why these definitions matter.

There are a lot of special types of morphisms, defined by properties. Here’s the basic list:

- A *monic* (or monomorphism) is an arrow f : a → b such that (∀ g_{1}, g_{2} : x → a) f º g_{1} = f º g_{2} ⇒ g_{1} = g_{2}. (That is, two arrows composed with f, as in f º g, give the same composite only if they are the same arrow.)
- An *epic* (or epimorphism) is an arrow f : a → b such that (∀ g_{1}, g_{2} : b → x) g_{1} º f = g_{2} º f ⇒ g_{1} = g_{2}. (This is almost the same as a monic, but it’s from the other side of the composition; instead of (f º g) in the definition, it’s (g º f).)
- An *isomorphism* is an arrow f : a → b such that (∃ g : b → a) f º g = 1_{b} ∧ g º f = 1_{a}.
- An *endomorphism* is an arrow f : a → b such that a = b.
- An *automorphism* is an arrow that is both an endomorphism and an isomorphism.
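A few of these definitions can be tried out on plain functions, where a monic turns out to be the same thing as an injective function. The following is only a sketch: the helper `equal_on` and the sample functions `double`, `collapse`, `g1`, and `g2` are invented for the illustration, and equality of arrows is approximated by comparing them pointwise over a small, explicitly listed domain.

```ocaml
(* Arrows are plain functions; "g º f" is compose g f. *)
let compose g f x = g (f x)

(* Pointwise equality of two functions over an explicit finite domain.
   This is an assumption of the sketch, not a general equality test. *)
let equal_on dom f g = List.for_all (fun x -> f x = g x) dom

(* double is injective, so it behaves as a monic among functions. *)
let double n = 2 * n

(* collapse sends every input to 0, so it is not injective. *)
let collapse (_ : int) = 0

(* Two different arrows bool -> int; they disagree at false. *)
let g1 (b : bool) = if b then 1 else 2
let g2 (b : bool) = if b then 1 else 3
let bools = [true; false]

(* double º g1 and double º g2 still differ, as the monic law requires. *)
let double_separates =
  not (equal_on bools (compose double g1) (compose double g2))

(* collapse º g1 = collapse º g2 although g1 <> g2: collapse is not monic. *)
let collapse_identifies =
  equal_on bools (compose collapse g1) (compose collapse g2)

(* succ and pred undo each other, so succ is an isomorphism; since it
   goes from int to int, it is also an automorphism. Sample check only. *)
let sample = [-2; -1; 0; 1; 2]
let succ_is_iso =
  equal_on sample (compose pred succ) (fun x -> x)
  && equal_on sample (compose succ pred) (fun x -> x)
```

The epic case is the mirror image: swap the order of composition and the role of injectivity is taken over by surjectivity.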

One last definition, just because it gives me a chance to point out something useful about category theory. A *functor* is a morphism in the category of all categories. What that means is that it’s a structure-preserving mapping between categories. It’s neat in a theoretical way, because it demonstrates that we’re already at a point where we’re seeing how category theory can make it easier to talk about something complicated: we’re using it to describe itself! But the concept of functor also has a lot of applications; in particular, the module system of my favorite programming language makes extensive use of functors.

In OCaml, a module is something called a structure, which is a set of definitions with constrained types. One of the things you often want to be able to do is to write a piece of code in a way that allows you to make it parametric on some other structure. The way you do that is to write a functor: a “function” from a structure to a structure. For example, to implement a generic binary tree, you need a type of values that you’ll put in the tree, and an operation to compare values. The way you do that is to write a functor which takes a structure defining a type and a comparison operator, and maps it to a structure which is an implementation of a binary tree for that type and comparison.
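Here is roughly what the binary tree example could look like. This is a small sketch rather than a production tree: the names `ORDERED`, `MakeTree`, `IntOrd`, `IntTree`, and `StringTree` are chosen for the illustration, and the tree is an unbalanced binary search tree that ignores duplicate insertions.

```ocaml
(* Hypothetical input signature: a type of values plus a comparison. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

(* The functor: given any ORDERED structure, build a tree structure. *)
module MakeTree (Ord : ORDERED) = struct
  type elt = Ord.t
  type t = Leaf | Node of t * elt * t

  let empty = Leaf

  (* Insert x, leaving the tree unchanged if x is already present. *)
  let rec insert x = function
    | Leaf -> Node (Leaf, x, Leaf)
    | Node (l, v, r) as node ->
      let c = Ord.compare x v in
      if c < 0 then Node (insert x l, v, r)
      else if c > 0 then Node (l, v, insert x r)
      else node

  let rec mem x = function
    | Leaf -> false
    | Node (l, v, r) ->
      let c = Ord.compare x v in
      if c = 0 then true else if c < 0 then mem x l else mem x r

  (* In-order traversal lists the elements in ascending order. *)
  let rec to_list = function
    | Leaf -> []
    | Node (l, v, r) -> to_list l @ (v :: to_list r)
end

(* A hand-written argument structure, and one from the standard library. *)
module IntOrd = struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end

module IntTree = MakeTree (IntOrd)
module StringTree = MakeTree (String)

let t =
  List.fold_left (fun acc x -> IntTree.insert x acc) IntTree.empty [3; 1; 2; 3]
let sorted = IntTree.to_list t
```

Applying `MakeTree` twice gives two independent tree implementations from the same code, which is the "function from structure to structure" idea in action.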

The OCaml functor is a category-theoretic functor: category theory provides an easy way to talk about the concept of the compile-time “function” from structure to structure.