# Category Theory Lesson 2: Basics of Categorical Abstraction

In my last post about category theory, I introduced the basic idea of typeclasses, and showed how to implement an algebraic monoid as a typeclass. Then we used that algebraic monoid as an example of how to think with arrows, and built it up into a sketch of category theory’s definition of a monoid. We ended with an ominous looking diagram that illustrated what a categorical monoid looks like.

In this post, we’re going to take a look at some of the formal definitions of the basic ideas of category theory. By the end of this lesson, you should be able to look at the categorical monoid, and understand what it means. But the focus of this post will be on understanding initial and terminal objects, and the role they play in defining abstractions in category theory. And in the next post, we’ll see how abstractions compose, which is where the value of category theory to programmers will really become apparent.

Before I really get started: there’s a lot of terminology in category theory, and this post has a lot of definitions. Don’t worry: you don’t need to remember it all. You do need to understand the concepts behind it, but specifically remembering the difference between, say, an endomorphism, an epimorphism, and a monomorphism isn’t important: you can always look it up. (And I’ll put together a reference glossary to help make that easy.)

## Defining Categories

A category is basically a directed graph – a bunch of dots connected by arrows, where the arrows have to satisfy a bunch of rules.

Formally, we can say that a category consists of three parts: $(O, M, \circ)$, where:

• $O$ is a collection of objects. We don’t actually care what the objects are – the only thing we can do in category theory is look at how the objects are related through the arrows that connect them. For a category $C$, we’ll often call this collection Obj(C).
• $M$ is a collection of arrows, often called morphisms. Each element of $M$ starts at one object called its domain (often abbreviated dom), and ends at another object called its codomain (abbreviated cod). For an arrow $f$ that goes from $a$ to $b$, we’ll often write it as $f:a \rightarrow b$. For a category $C$, we’ll often call this collection Mor(C) (for morphisms of C).
• $\circ$ is a composition operator. For every pair of arrows $f:a \rightarrow b$ and $g:b \rightarrow c$, there must be an arrow $g\circ f: a \rightarrow c$ called the composition of $f$ and $g$.
• To be a category, these must satisfy the following rules:
1. Identity: For every object $o \in Obj(C)$, there must be an arrow from $o$ to $o$, called the identity of $o$. We’ll often write it as $id_o:o \rightarrow o$. For any arrow $f: x \rightarrow o$, $id_o \circ f=f$; and for any arrow $g:o \rightarrow y$, $g \circ id_o=g$. That’s just a formal way of saying that composing an identity arrow with any other arrow results in the other arrow.
2. Associativity: For any three arrows $f:w \rightarrow x$, $g:x \rightarrow y$, and $h:y \rightarrow z$: $h \circ (g\circ f) = (h\circ g)\circ f$.
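As a rough illustration of these two rules, here is a minimal Scala sketch that treats ordinary functions as arrows and types as objects. The names (`CategoryLaws`, `f`, `g`, `h`, `samples`) are invented for this example, and checking a handful of sample inputs only illustrates the laws; it doesn’t prove them.

```scala
object CategoryLaws {
  // Three composable arrows: f: Int -> String, g: String -> Int, h: Int -> Boolean.
  val f: Int => String = n => "x" * n
  val g: String => Int = s => s.length + 1
  val h: Int => Boolean = n => n % 2 == 0

  // The identity arrow on any object (type) A.
  def id[A]: A => A = a => a

  // A few non-negative sample inputs to test the laws on.
  val samples: List[Int] = List(0, 1, 2, 5, 10)

  // Identity rule: composing f with an identity on either side gives back f.
  val identityHolds: Boolean = samples.forall { x =>
    id[String].compose(f)(x) == f(x) && f.compose(id[Int])(x) == f(x)
  }

  // Associativity rule: h . (g . f) agrees with (h . g) . f.
  val associativityHolds: Boolean = samples.forall { x =>
    h.compose(g.compose(f))(x) == h.compose(g).compose(f)(x)
  }

  def main(args: Array[String]): Unit =
    println(s"identity: $identityHolds, associativity: $associativityHolds")
}
```

Note that `compose` follows the same right-to-left reading as $\circ$: `g.compose(f)` applies `f` first and then `g`.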

When talking about category theory, people often say that an arrow is a structure preserving mapping between objects. We’ll see what that means in slightly more detail with some examples.

A thing that I keep getting confused by involves ordering. Let’s look at a quick little diagram for a moment: an arrow $f$ from X to Y, followed by an arrow $g$ from Y to Z. The path from X to Z is $g \circ f$ – because $g$ comes after $f$, which (at least to me) looks backwards. When you write it in terms of function application, it’s $g(f(x))$. You can read $g \circ f$ as g after f, because the arrow $g$ comes after the arrow $f$ in the diagram; and if you think of arrows as functions, then it’s the order of function application.

### Example: The category Set

The most familiar example of a category (and one which is pretty canonical in category theory texts) is the category Set, where the objects are sets, the arrows between them are total functions, and the composition operator is function composition.

That might seem pretty simple, but there’s an interesting wrinkle to Set.

Suppose, for example, that we look at the function $f(x)=x^2$. That’s obviously a function from Int to Int. Since Int is a set, it’s also an object in the category Set, and so $f(x)=x^2$ is obviously an arrow from $Int \rightarrow Int$. But there’s also the set Int+, which represents the set of non-negative integers. $f(x)=x^2$ is also a function from Int+ to Int+. So which arrow represents the function?

The answer is both – and many more. (It’s also a function from the reals to the complex numbers, because every real number is also a complex number.) And so on. A function isn’t quite an arrow: an arrow is a categorical concept of some kind of mapping between two objects. In many ways, you can think of an arrow as something almost like a function with an associated type declaration: you can write many type declarations for a given function; any valid function with a type declaration is an arrow in Set.
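To make the “function plus type declaration” idea concrete, here is a small Scala sketch, assuming we let Scala types stand in for sets. The names `SquareArrows`, `squareInt`, and `squareDouble` are made up for illustration: the rule is the same in both cases, but each type declaration gives a distinct arrow.

```scala
object SquareArrows {
  // The same rule, x => x * x, with two different type declarations.
  // Each declaration picks out a different arrow.
  val squareInt: Int => Int = x => x * x
  val squareDouble: Double => Double = x => x * x

  def main(args: Array[String]): Unit = {
    println(squareInt(3))      // uses the Int => Int arrow
    println(squareDouble(1.5)) // uses the Double => Double arrow
  }
}
```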

We’ll be looking at Set a lot. It’s a category where we have a lot of intuition, so using it as an example to demonstrate category concepts will be useful.

### Example: The category Poset

Poset is the category of all partially ordered sets. The arrows between objects in Poset are order-preserving functions between partially ordered sets. This category is an example of what we mean by structure-preserving mappings: the arrows, and therefore their compositions, must preserve the ordering property.

For that to make sense, we need to remember what a partially ordered set is, and what it means to be an order-preserving function.

• A set $S$ is partially ordered if it has a partial less-than-or-equal relation, $\le$. This relation doesn’t need to be total – some values are less than or equal to other values; and some values can’t be compared.
• A function between two partially ordered sets $f:A \rightarrow B$ is order-preserving if and only if for all values $x, y \in A$, if $x \le y$ in $A$, then $f(x)\le f(y)$ in $B$.

The key feature of an object in Poset is that it possesses a partial ordering. So arrows in the category must preserve that ordering: if $x$ is less than or equal to $y$, then $f(x)$ must be less than or equal to $f(y)$.
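Here is a minimal sketch of that check in Scala, for finite posets only. Everything named here (`OrderPreserving`, `preservesOrder`, the divisibility example) is an assumption chosen for illustration. The source poset is the divisors of 12 ordered by divisibility, which is a genuinely partial order since 4 and 6 can’t be compared; the target is the integers under the usual ordering.

```scala
object OrderPreserving {
  // A small poset: the divisors of 12, where x <= y means "x divides y".
  val divisors: List[Int] = List(1, 2, 3, 4, 6, 12)
  val divides: (Int, Int) => Boolean = (x, y) => y % x == 0

  // The target poset: integers under the usual <=.
  val intLe: (Int, Int) => Boolean = (a, b) => a <= b

  // f is order-preserving if x <= y in the source implies f(x) <= f(y) in the target.
  def preservesOrder[A, B](elems: List[A], leA: (A, A) => Boolean, leB: (B, B) => Boolean)(f: A => B): Boolean =
    elems.forall(x => elems.forall(y => !leA(x, y) || leB(f(x), f(y))))

  // Sending each divisor to itself preserves order: if x divides y, then x <= y.
  val inclusionIsArrow: Boolean = preservesOrder(divisors, divides, intLe)((x: Int) => x)

  // Negation flips the order, so it is not an arrow between these two posets.
  val negationIsArrow: Boolean = preservesOrder(divisors, divides, intLe)((x: Int) => -x)

  def main(args: Array[String]): Unit =
    println(s"inclusion: $inclusionIsArrow, negation: $negationIsArrow")
}
```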

That’s a typical example of what we mean by arrows as structure preserving: the objects of a category have some underlying structural property – and to be an arrow in the category, that structure must be preserved across arrows and arrow composition.

## Commuting Diagrams

One of the main terms that you’ll hear about category diagrams is about whether or not the diagram commutes. This, in turn, is based on arrow chasing.

An arrow chase is a path through the diagram formed by chaining arrows together by composing them – an arrow chase is basically discovering an arrow from one object to another by looking at the composition of other arrows in the category.

We say that a diagram $\textbf{D}$ commutes if, for any two objects $x$ and $y$ in the diagram, every pair of paths between $x$ and $y$ compose to the same arrow. Another way of saying that is that if $P(x, y)$ is the set of all paths in the diagram between $x$ and $y$, then $\forall p_i, p_j \in P(x, y): \circ(p_i) = \circ(p_j)$, where $\circ(p)$ is the composition of the arrows along the path $p$.

For example: In this diagram, we can see two paths: $h\circ f$ and $h\circ g$. If this diagram commutes, it means that following $f$ from $A$ to $B$ and $h$ from $B$ to $C$ must be the same thing as following $g$ from $A$ to $B$ and $h$ from $B$ to $C$. It doesn’t say that $f$ and $g$ are the same thing – an arrow chase doesn’t tell us anything about single arrows; it just tells us about how they compose. So what we know if this diagram commutes is that $h\circ f=h \circ g$.
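A small Scala sketch can show why a commuting diagram doesn’t force $f$ and $g$ to be equal. The concrete functions below are assumptions picked for the example: following `f` and then `h` gives the same result as following `g` and then `h`, even though `f` and `g` differ.

```scala
object CommutingTriangle {
  // Two different arrows from A to B (both Int here)...
  val f: Int => Int = x => x
  val g: Int => Int = x => -x
  // ...and one arrow from B to C that cannot tell them apart.
  val h: Int => Int = x => x * x

  val samples: List[Int] = List(-3, -1, 0, 2, 7)

  // Both paths from A to C agree on every sample.
  val pathsAgree: Boolean = samples.forall(x => h(f(x)) == h(g(x)))

  // Yet f and g themselves are different arrows.
  val arrowsDiffer: Boolean = samples.exists(x => f(x) != g(x))

  def main(args: Array[String]): Unit =
    println(s"paths agree: $pathsAgree, f and g differ: $arrowsDiffer")
}
```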

### Diagrams and Meta-level reasoning: an example

Let’s look at a pretty tricky example. We’ll take our time, because this is subtle, but it’s also pretty typical of how we do things in category theory. One of the key concepts of category theory is building a category, and then using the arrows in that category, create a new category that allows us to do meta-level reasoning.

We’ve seen that there’s a category of sets, called Set.

We can construct a category based on the arrows of Set, called $\textbf{Set}^{\rightarrow}$. In this category, each of the arrows in Set is an object. So, more formally, if $f: A \rightarrow B \in \text{Mor}(\textbf{Set})$ then $f:A \rightarrow B \in \text{Obj}(\textbf{Set}^{\rightarrow})$.

The arrows of this new category are where it gets tricky. Suppose we have two arrows in Set, $f: A \rightarrow B$ and $f': A' \rightarrow B'$. These arrows are objects in $\textbf{Set}^{\rightarrow}$. There is an arrow from $f$ to $f'$ in $\text{Mor}(\textbf{Set}^{\rightarrow})$ if there is a pair of arrows $a$ and $b$ in $\text{Mor}(\textbf{Set})$ such that the following diagram commutes:

The diagram is relatively easy to read and understand; explaining it in words is more complicated:

• An arrow in our category of Set-arrows is a mapping from one Set-arrow $f$ to another Set-arrow $f'$.
• That mapping exists when there are two arrows $a$ and $b$ in $\text{Mor}(\textbf{Set})$ where:
• $a$ is an arrow from the domain of $f$ to the domain of $f'$;
• $b$ is an arrow from the codomain of $f$ to the codomain of $f'$; and
• $b \circ f = f' \circ a$.

Another way of saying that is that an arrow from $f$ to $f'$ means that there’s a structure-preserving way of transforming any arrow from $A\rightarrow B$ into an arrow from $A' \rightarrow B'$.

Why should we care about that? Well, for now, it’s just a way of demonstrating that a diagram can be a lot easier to read than a wall of text. But this kind of categorical mapping will become important later.

## Categorizing Things

As I said earlier, category theory tends to have a lot of jargon. Everything we do in category theory involves reasoning about arrows, so there are many terms that describe arrows with particular properties. We’ll look at the most basic kinds of arrows now, and we’ll encounter more in later lessons.

### Monics, Epics, and Isos

The easiest way to think about all of these kinds of arrows is by analogy with functions in traditional set-based mathematics. Functions and their properties are really important, so we define special kinds of functions with interesting properties. We have injections (functions from A to B where distinct elements of A are mapped onto distinct elements of B), surjections (functions from A to B where each element of B is mapped onto by an element of A), and isomorphisms.

In categories, we define similar kinds of arrows: monomorphisms (monics), epimorphisms (epics), and isomorphisms (isos).

• An arrow $f:Y \rightarrow Z$ in category C is monic if for any pair of arrows $g:X \rightarrow Y$ and $h:X \rightarrow Y$ in C, $f\circ g = f\circ h$ implies that $g = h$. (So a monic arrow discriminates arrows to its domain – two different arrows into its domain from a given source stay different when they’re left-composed with the monic.)
• An epic is almost the same, except that it discriminates with right-composition: An arrow $f:X \rightarrow Y$ in category C is epic if for any pair of arrows $g:Y \rightarrow Z$ and $h:Y \rightarrow Z$ in C, $g\circ f = h\circ f$ implies that $g = h$. (So in the same way that a monic arrow discriminates arrows to its domain, an epic arrow discriminates arrows from its codomain.)

These definitions sound really confusing. But if you think back to sets, you can twist them into making sense. A monic arrow $f:Y \rightarrow Z$ describes an injection in set theory: that is, a function that maps every element of $Y$ onto a unique element of $Z$. So if you have some functions $g$ and $h$ that map from some set $X$ into $Y$, then the only way that $f\circ g$ can map into $Z$ in the same way as $f\circ h$ is if $g$ and $h$ map into $Y$ in exactly the same way.

The same basic argument (reversed a bit) can show that an epic arrow is a surjective function in Set.
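The cancellation property for monics can be seen on a small finite example. This is only a sketch with invented names (`MonicCheck`, `halve`, `succ`, `equalOn`): a non-injective function collapses two different arrows into the same composite, while an injective one keeps them apart.

```scala
object MonicCheck {
  // A finite stand-in for the source set X.
  val domain: List[Int] = List(0, 1, 2, 3)

  // Two different arrows g, h: X -> Y.
  val g: Int => Int = x => 2 * x
  val h: Int => Int = x => 2 * x + 1

  // halve is not injective on the non-negative integers: 2x and 2x + 1 both go to x.
  val halve: Int => Int = y => y / 2
  // succ is injective: different inputs always give different outputs.
  val succ: Int => Int = y => y + 1

  // Two arrows count as equal here if they agree on every element of the domain.
  def equalOn(xs: List[Int])(p: Int => Int, q: Int => Int): Boolean =
    xs.forall(x => p(x) == q(x))

  // halve . g == halve . h even though g != h, so halve is not monic.
  val halveCollapses: Boolean = equalOn(domain)(halve.compose(g), halve.compose(h))

  // succ . g != succ . h, as the monic property requires when g != h.
  val succCollapses: Boolean = equalOn(domain)(succ.compose(g), succ.compose(h))

  def main(args: Array[String]): Unit =
    println(s"halve collapses g and h: $halveCollapses; succ collapses them: $succCollapses")
}
```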

• An isomorphism is a pair of arrows $f:Y \rightarrow Z$ and $f^{-1}: Z \rightarrow Y$ where $f$ is monic and $f^{-1}$ is epic, and where $f\circ f^{-1}= id_Z$, and $f^{-1}\circ f = id_Y$.

We say that the objects $Y$ and $Z$ are isomorphic if there’s an isomorphism between them.

### Initial and Terminal Objects

Another kind of categorization that we look at is talking about special objects in the category. Categorical thinking is all about arrows – so even when we’re looking at special objects, what makes them special are the arrows that they’re related to.

An initial object $0$ in a category $C$ is an object where for every object $c \in \text{Obj}(\textbf{C})$, there’s exactly one arrow $0_c: 0 \rightarrow c$ in $\text{Mor}(\textbf{C})$. Similarly, a terminal object $1$ in a category ${\textbf C}$ is an object where for every object $c \in \text{Obj}(\textbf{C})$, there is exactly one arrow $1_c: c \rightarrow 1$ in $\text{Mor}(\textbf{C})$.

For example, in the category Set, the empty set is an initial object, and singleton sets are terminal objects.
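For programmers, a loose analogy (an assumption of this sketch, not a precise claim about Scala) is that `Unit` behaves like a one-element set and `Nothing` like the empty set, if we ignore exceptions, non-termination, and side effects. The names `InitialTerminal`, `toUnit`, and `fromNothing` are invented here.

```scala
object InitialTerminal {
  // Unit has exactly one value, (), so a total, side-effect-free function
  // from any type A to Unit has no choice about what to return.
  def toUnit[A](a: A): Unit = ()

  // Nothing has no values at all, so a function out of Nothing never has to
  // produce a result; it can be written for any target type A.
  def fromNothing[A](n: Nothing): A = n

  def main(args: Array[String]): Unit = {
    println(toUnit(42))
    println(toUnit("any value at all"))
  }
}
```

`fromNothing` can never actually be called with a value, which is exactly why there is only one such function for each target type.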

## A brief interlude: What’s the point?

In this lesson, we’ve spent a lot of time on formalisms and definitions of abstract concepts: isos, monos, epics, terminals. And after this pause, we’re going to spend a bunch of time on building some complicated constructions using arrows. What’s the point of all of this? What does any of this mean?

Underlying all of these abstractions, category theory is really about thinking in arrows. It’s about building structures with arrows. Those arrows can represent important properties of the objects that they connect, but they do it in a way that allows us to understand them solely in terms of the ways that they connect, without knowing what the objects connected by the arrows actually do.

In practice, the objects that we connect by arrows are usually some kind of aggregate: sets, types, spaces, topologies; and the arrows represent some kind of mapping – a function, or a transformation of some kind. We’re reasoning about these aggregates by reasoning about how mappings between the aggregates behave.

But if the objects represent some abstract concept of collections or aggregates, and we’re trying to reason about them, sometimes we need to be able to reason about what’s inside of them. Thinking in arrows, the only way to really be able to reason about a concept like membership, the only way we can look inside the structure of an object, is by finding special arrows.

The point of the definitions we just looked at is to give us an arrow-based way of peering inside of the objects in a category. These tools give us the ability to create constructions that let us take the concept of something like membership in a set, and abstract it into an arrow-based structure.

Reasoning in arrows, a terminal object is an object in a category that captures a concept of a single object. It’s easiest to see this by thinking about sets as an example. What does it mean if an object, T, is terminal in the category of sets?

It means that for every set $S$, there’s exactly one function from $S$ to $T$. How can that be? If $T$ is a set containing exactly one value $t$, then from any other set $S$, the only function from $S \rightarrow T$ is the constant function $f(x) = t$. If $T$ had more than one value in it, then it would be possible to have more than one arrow from $S$ to $T$ – because it would be possible to define different functions from $S$ to $T$.

By showing that there’s only one arrow from any object in the category of sets to T, we’re showing that T can’t possibly have more than one object inside of it.

Knowing that, we can use the concept of a terminal object to create a category-theoretic generalization of the concept of set membership. If $s$ is an element of a set $S$, then that set membership can be represented by the fact that there is an arrow from the terminal object $\{s\}$ to $S$. In general, for any object $S$ in a category, if there is an arrow from a terminal object $\{t\}$ to $S$, then in some sense, $t \in S$.
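Here is a small Scala sketch of that idea, assuming `Unit` plays the role of the one-element terminal set. The names `ElementsAsArrows`, `points`, and `recovered` are made up for this example: each element of a set corresponds to one function out of `Unit`, and applying those functions to `()` gives the elements back.

```scala
object ElementsAsArrows {
  val s: Set[String] = Set("a", "b", "c")

  // One arrow Unit => String for each element of s: the constant function picking that element.
  val points: Set[Unit => String] = s.map(x => (_: Unit) => x)

  // Applying every such arrow to the single value of Unit recovers the original elements.
  val recovered: Set[String] = points.map(p => p(()))

  def main(args: Array[String]): Unit =
    println(recovered == s)
}
```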

## Constructions

We’re finally getting close to the real point of category theory. Category theory is built on a highly abstracted notion of functions – arrows – and then using those arrows for reasoning. But reasoning about individual arrows only gets you so far: things start becoming interesting when you start constructing things using arrows. In lesson one, we saw a glimpse of how you could construct a very generalized notion of monoid in categories – this is the first big step towards understanding that.

### Products

Constructions are ways of building things in categories. In general, the way that we work with constructions is by defining some idea using a categorical structure – and then abstracting that into something called a universal construction. A universal construction defines a new category whose objects are instances of the categorical structure; and we can understand the universal construction best by looking at the terminal objects in the universal construction – which we can understand as being the atomic objects in its category.

When we’re working with sets, we know that there’s a set-product called the cartesian product. Given two sets, A and B, the product is $A \times B=\{(a, b) : a \in A, b \in B\}$.

The basic concept of a product is really useful. We’ll eventually build up to something called a cartesian closed category that uses the categorical product, and which allows us to define the basis of lambda calculus in category theory.

As usual, we want to take the basic concept of a cartesian product, and capture it in terms of arrows. So let’s look back at what a cartesian product is, and see how we can turn that into arrow-based thinking.

The simple version is what we wrote above: given two sets A and B, the cartesian product maps them into a new set which consists of pairs of values from the old sets. What does that mean in terms of arrows? We can start by just slightly restating the definition we gave above: For each unique value $a \in A$, and each unique value $b \in B$, there’s a unique value $(a, b) \in A \times B$.

But what do we actually mean by $(a, b)$? Mathematicians have come up with a lot of different ways of constructing ordered pairs. But we want to create a general model of an ordered pair, so we don’t want to limit ourselves to any specific construction: we want to capture the key property of what the ordered pair means.

It doesn’t matter which one we use: what matters is that there’s a key underlying property of the product: there are two functions, $\lambda$ and $\rho$, called projection functions, which map elements of the product back to the elements of A and B. If $p=(a,b) \in A\times B$, then $\lambda(p) = a$ (where $\lambda$ is the name of the left projection), and $\rho(p) = b$ (where $\rho$ is the name of the right projection).

That’s going to be the key to the categorical product: it’s going to be defined primarily by the projection functions. We know that the only way we can talk about things in category theory is to use arrows. The thing that matters about a product is that it’s an object with projections to its two parts. We can describe that, in category theory, as something that we’ll call a wedge:

A wedge is basically an object, like the one in the diagram to the right, which we’ll call $A \land B$. This object has two special arrows, $l$ and $r$, that represent projections from $A \land B$ to its components in $A$ and $B$.

Now we get to the tricky part. The concept of a wedge captures the structure of what we mean by a product. But given two objects A and B, there isn’t just one wedge! In a category like Set, there are many different ways of creating objects with projections. Which object is the correct one to use for the product?

For example, I can have the set of triples $(A, B, C)$. I can easily define a left projection from $(A, B, C)$ to $A$, and a right projection from $(A, B, C)$ to $B$. But clearly $(A, B, C)$ is not what we mean by the product $A \times B$. It’s close, but it’s got extra noise attached, in the form of that third element $C$.

If, for two objects $A$ and $B$, there are many wedges with left and right projections, which one is the real product?

Just a little while ago, we talked about initial and terminal objects. A terminal object can be understood as being a rough analog to a membership relation. We’re going to use that.
We can create a category of wedges $A \land B$, where there is an arrow $m$ from $X$ to $Y$ when the diagram below commutes in our original category:

In the category of wedges, what that means is that Y is at least as strict a wedge as X; X has some amount of noise in it (noise in the sense of the C element of the triple from the example above), and Y cannot have any more noise than that. The true categorical product will be the wedge with no excess noise: a wedge which has an arrow from every other wedge in the category of wedges.
What’s an object with exactly one arrow from every other object? It’s the terminal object. The categorical product is the terminal wedge: the unique (up to isomorphism) object which is stricter than any other wedge.
Another way of saying that, using categorical terminology, is that there is a universal property of products: products have left and right projections. The categorical product is the exemplar of that property: it is the unique object which has exactly the property that we’re looking at, without any extraneous noise. Any property that this universal object has will be shared by every other product-like object.

This diagram should look familiar: it’s the same thing as the diagram for defining arrows in the category of wedges. It’s the universal diagram: you can substitute any wedge in for C, along with its projection arrows (f, g).

We can pull that definition back to our original category, and define the product without the category of wedges. So given two objects, A and B, in a category, the categorical product is defined as an object which we’ll call $A \times B$ along with two arrows $l: A \times B \rightarrow A$ and $r: A \times B \rightarrow B$, which have the property that for any object $C$ which has arrows $f: C \rightarrow A$ and $g:C \rightarrow B$, there is a unique arrow $(f,g):C \rightarrow (A\times B)$ for which the diagram to the right commutes; that is, $l \circ (f,g) = f$ and $r \circ (f,g) = g$.
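In Scala terms, a rough sketch of this definition uses the built-in pair type for $A \times B$. The names `ProductUniversal`, `left`, `right`, and `pair` are assumptions of this example, and the check on a few sample inputs illustrates the commuting condition rather than proving uniqueness.

```scala
object ProductUniversal {
  // The two projections out of the pair type (A, B).
  def left[A, B](p: (A, B)): A = p._1
  def right[A, B](p: (A, B)): B = p._2

  // The mediating arrow (f, g): C -> A x B, built from f: C -> A and g: C -> B.
  def pair[C, A, B](f: C => A, g: C => B): C => (A, B) = c => (f(c), g(c))

  // An example object C = String, with arrows to A = Int and B = Boolean.
  val f: String => Int = _.length
  val g: String => Boolean = _.isEmpty
  val fg: String => (Int, Boolean) = pair(f, g)

  val samples: List[String] = List("", "a", "hello")

  // The diagram commutes: projecting after pairing gives back f and g.
  val commutes: Boolean = samples.forall { c =>
    left(fg(c)) == f(c) && right(fg(c)) == g(c)
  }

  def main(args: Array[String]): Unit =
    println(s"diagram commutes on samples: $commutes")
}
```

The definition of `pair` is forced: any function whose projections agree with `f` and `g` has to send `c` to `(f(c), g(c))`, which is the uniqueness half of the universal property.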

On its own, if we’re looking specifically at sets, this is just a complicated way of defining the cartesian product of two sets. It doesn’t really tell us much of anything new. What makes this interesting is that it isn’t limited to the cartesian product of two sets: it’s a general concept that takes what we understand about simple sets, and expands it to define a product of any two things in categories. The set-theoretic version only works for sets: this one works for numbers, posets, topologies, or anything else.

In terms of programming, products are pretty familiar to people who write functional programs: a product is a tuple. And the definition of a tuple in a functional language is pretty much exactly what we just described as the categorical product, tweaked to make it slightly easier to use.

For example, let’s look at the product type in Scala.

```scala
trait Product extends Any with Equals {
  def productElement(n: Int): Any
  def productArity: Int
  // ...
}
```

The product object intrinsically wraps projections into a single function which takes a parameter and returns the result of applying the projection. It could have been implemented more categorically as:

```scala
trait CatProduct extends Any with Equals {
  def projection(n: Int): () => Any
  // ...
}
```

Implemented the latter way, to extract an element from a product, you’d have to write `prod.projection(i)()`, which is more cumbersome, but does the same thing.

More, if you look at this, and think of how you’d use the product trait, you can see how it relates to the idea of terminal objects. There are many different concrete types that you could use to implement this trait. All of them define more information about the type. But every implementation that includes the concept of product can implement the Product trait. This is exactly the relationship we discussed when we used terminal objects to derive the ideal product: there are many abstractions that include the concept of the product; the categorical product is the one that abstracts over all of them.

The categorical product, as an abstraction, may not seem terribly profound. But as we’ll see in the next post, in category theory, we can compose abstractions – and by composing them, we’ll be able to define an abstraction of exponentiation, which generalizes the programming language concept of currying.

# The Math of Vaccinations, Infection Rates, and Herd Immunity

Here in the US, we are, horribly, in the middle of a measles outbreak. And, as usual, anti-vaccine people are arguing that:

• Measles isn’t really that serious;
• Unvaccinated children have nothing to do with the outbreak; and
• More vaccinated people are being infected than unvaccinated, which shows that vaccines don’t help.

A few years back, I wrote a post about the math of vaccines; it seems like this is a good time to update it.

When it comes to vaccines, there are two things that a lot of people don’t understand. One is herd immunity; the other is probability of infection.

Herd immunity is the fundamental concept behind vaccines.

In an ideal world, a person who’s been vaccinated against a disease would have no chance of catching it. But the real world isn’t ideal, and vaccines aren’t perfect. What a vaccine does is prime the recipient’s immune system in a way that reduces the probability that they’ll be infected.

But even if a vaccine for an illness were perfect, and everyone was vaccinated, that wouldn’t mean that it was impossible for anyone to catch the illness. There are many people whose immune systems are compromised – people with diseases like AIDS, or people with cancer receiving chemotherapy. (Or people who’ve had the measles within the previous two years!) And that’s not considering the fact that there are people who, for legitimate medical reasons, cannot be vaccinated!

So individual immunity, provided by vaccines, isn’t enough to completely eliminate the spread of a contagious illness. To prevent outbreaks, we rely on an emergent property of a vaccinated population. If enough people are immune to the disease, then even if one person gets infected with it, the disease won’t be able to spread enough to produce a significant outbreak.

We can demonstrate this with some relatively simple math.

Let’s imagine a case of an infectious disease. For illustration purposes, we’ll simplify things in a way that makes the outbreak more likely to spread than in reality. (So this makes herd immunity harder to attain than in reality.)

• There’s a vaccine that’s 95% effective: out of every 100 people vaccinated against the disease, 95 are perfectly immune; the remaining 5 have no immunity at all.
• The disease is highly contagious: out of every 100 non-immune people who are exposed to the disease, 95 will be infected.

If everyone is immunized, but one person becomes ill with the disease, how many people do they need to expose to the disease for the disease to spread?

Keeping things simple: an outbreak, by definition, is a situation where the number of exposed people is steadily increasing. That can only happen if every sick person, on average, infects more than 1 other person with the illness. If that happens, then the rate of infection can grow exponentially, turning into an outbreak.

In our scheme here, only one out of 20 people is infectable – so, on average, if our infected person has enough contact with 20 people to pass an infection, then there’s a 95% chance that they’d pass the infection on to one other person. (19 of 20 are immune; the one remaining person has a 95% chance of getting infected). To get to an outbreak level – that is, a level where they’re probably going to infect more than one other person – they’d need to expose something around 25 people (which would mean that each infected person, on average, could infect roughly 1.2 people). If they expose 20 other people on average, then on average, each infected person will infect roughly 0.95 other people – so the number of infected will decrease without turning into a significant outbreak.

But what will happen if just 5% of the population doesn’t get vaccinated? Then we’ve got 95% of the population getting vaccinated, with a 95% immunity rate – so roughly 90% of the population has vaccine immunity. Our pool of non-immune people has doubled. In our example scenario, if each sick person exposes 20 other people during their illness, then they will, on average, cause roughly 1.85 people to get sick. And so we have a major outbreak on our hands!
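The arithmetic behind these scenarios can be sketched as a tiny Scala function. This is just the simplified model from this post (everyone either fully immune or fully infectable, a fixed number of contacts), and the names `HerdImmunity` and `expectedInfections` are invented for the illustration.

```scala
object HerdImmunity {
  // Average number of people one sick person infects, under the simplified model:
  // contacts * (fraction of people who are not immune) * (chance an exposed non-immune person gets sick).
  def expectedInfections(contacts: Int, vaccinated: Double, effectiveness: Double, infectivity: Double): Double = {
    val immune = vaccinated * effectiveness
    contacts * (1.0 - immune) * infectivity
  }

  // Everyone vaccinated, 20 contacts: about 0.95, so the infection dies out.
  val allVaccinated20: Double = expectedInfections(20, 1.0, 0.95, 0.95)
  // Everyone vaccinated, 25 contacts: about 1.19, enough to grow.
  val allVaccinated25: Double = expectedInfections(25, 1.0, 0.95, 0.95)
  // 5% unvaccinated, 20 contacts: about 1.85, a major outbreak.
  val fivePercentUnvaccinated: Double = expectedInfections(20, 0.95, 0.95, 0.95)

  def main(args: Array[String]): Unit = {
    println(allVaccinated20)
    println(allVaccinated25)
    println(fivePercentUnvaccinated)
  }
}
```

An outbreak grows when this number is above 1 and fades when it is below 1, which is the threshold the text describes.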

This illustrates the basic idea behind herd immunity. If you can successfully make a large enough portion of the population non-infectable by a disease, then the disease can’t spread through the population, even though the population contains a large number of infectable people. When the population’s immunity rate (either through vaccine, or through prior infection) gets to be high enough that an infection can no longer spread, the population is said to have herd immunity: even individuals who can’t be immunized no longer need to worry about catching it, because the population doesn’t have the capacity to spread it around in a major outbreak.

(In reality, the effectiveness of the measles vaccine really is in the 95 percent range – actually slightly higher than that; various sources estimate it somewhere between 95 and 97 percent effective! And the success rate of the vaccine isn’t binary: 95% of people will be fully immune; the remaining 5% will have a varying degree of immunity. And the infectivity of most diseases is lower than the example above. Measles (which is a highly, highly contagious disease, far more contagious than most!) is estimated to infect between 80 and 90 percent of exposed non-immune people. So if enough people are immunized, herd immunity will take hold even if more than 20 people are exposed by every sick person.)

Moving past herd immunity to my second point: there’s a paradox that some antivaccine people (including, recently, Sharyl Attkisson) use in their arguments. If you look at an outbreak of an illness that we vaccinate for, you’ll frequently find that more vaccinated people become ill than unvaccinated. And that, the antivaccine people say, shows that the vaccines don’t work, and the outbreak can’t be the fault of the unvaccinated folks.

Let’s look at the math to see the problem with that.

Let’s use the same numbers as above: 95% vaccine effectiveness, 95% contagion. In addition, let’s say that 2% of people choose to go unvaccinated.

That means that 98% of the population has been immunized, and 95% of them are immune. So now about 93% of the population has immunity.

If each infected person has contact with 20 other people, then we can expect about 7% of those 20 to be infectable – or 1.4; and of those, 95% will become ill – or about 1.3. So on average, each sick person will infect about 1 1/3 other people. That’s enough to cause a significant outbreak. Without the non-immunized people, the infection rate is less than 1 – not enough to cause an outbreak.

The non-immunized population reduced the herd immunity enough to cause an outbreak.
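Worked without rounding the intermediate steps, the same calculation looks like this. It uses only the numbers of the simplified model above (98% vaccinated, 95% effectiveness, 95% infectivity, 20 contacts); $R$ is shorthand introduced here for the average number of people each sick person infects.

$$
\begin{aligned}
\text{immune fraction} &= 0.98 \times 0.95 = 0.931 \\
\text{infectable contacts} &= 20 \times (1 - 0.931) = 1.38 \\
R &= 1.38 \times 0.95 \approx 1.31 > 1
\end{aligned}
$$

With everyone vaccinated, the same steps give $R = 20 \times (1 - 0.95) \times 0.95 = 0.95 < 1$.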

Within the population, how many immunized versus non-immunized people will get sick?

Out of every 100 people, there are about 5 who got vaccinated, but aren’t immune. Out of that same 100 people, there are 2 (2% of 100) that didn’t get vaccinated. If every non-immune person is equally likely to become ill, then we’d expect that in 100 cases of the disease, about 70 of them would be vaccinated, and 30 unvaccinated.

The vaccinated population is much, much larger – nearly 50 times larger! – than the unvaccinated.
Since that population is so much larger, we’d expect more vaccinated people to become ill, even though it’s the smaller unvaccinated group that broke the herd immunity!

The easiest way to see that is to take those numbers, and normalize them into probabilities – that is, figure out, within the pool of all vaccinated people, what their likelihood of getting ill after exposure is, and compare that to the likelihood of a non-vaccinated person becoming ill after exposure.

So, let’s start with the vaccinated people. Let’s say that we’re looking at a population of 10,000 people total. 98% were vaccinated; 2% were not.

• The total pool of vaccinated people is 9800, and the total pool of unvaccinated is 200.
• Of the 9800 who were vaccinated, 95% of them are immune, leaving 5% who are not – so 490 infectable people.
• Of the 200 people who weren’t vaccinated, all of them are infectable.
• If everyone is exposed to the illness, then we would expect about 466 of the vaccinated, and 190 of the unvaccinated to become ill.

So more than twice the number of vaccinated people became ill. But:

• The odds of a vaccinated person becoming ill are 466/9800, or about 1 out of every 21 people.
• The odds of an unvaccinated person becoming ill are 190/200 or 19 out of every 20 people! (Note: there was originally a typo in this line, which was corrected after it was pointed out in the comments.)
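The same breakdown can be written as a short Scala sketch, using only the numbers from the bullets above. The names (`OutbreakBreakdown`, `riskVaccinated`, and so on) are invented for this example, and the values are expected averages under the simplified model, not predictions.

```scala
object OutbreakBreakdown {
  val vaccinated: Int = 9800   // 98% of 10,000 people
  val unvaccinated: Int = 200  // the remaining 2%

  val effectiveness: Double = 0.95 // share of vaccinated people who are immune
  val infectivity: Double = 0.95   // chance that an exposed non-immune person gets sick

  // Vaccinated but not immune: about 490 people.
  val infectableVaccinated: Double = vaccinated * (1.0 - effectiveness)

  // Expected cases in each group if everyone is exposed: about 466 and 190.
  val illVaccinated: Double = infectableVaccinated * infectivity
  val illUnvaccinated: Double = unvaccinated * infectivity

  // Per-person risk in each group: about 1 in 21 versus 19 in 20.
  val riskVaccinated: Double = illVaccinated / vaccinated
  val riskUnvaccinated: Double = illUnvaccinated / unvaccinated

  def main(args: Array[String]): Unit = {
    println(s"expected cases: vaccinated $illVaccinated, unvaccinated $illUnvaccinated")
    println(s"per-person risk: vaccinated $riskVaccinated, unvaccinated $riskUnvaccinated")
  }
}
```

Comparing the two risks makes the point directly: under these assumptions, an unvaccinated person is about 20 times as likely to become ill as a vaccinated one, even though the vaccinated group produces more cases in total.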

The numbers can, if you look at them without considering the context, be deceiving. The population of vaccinated people is so much larger than the population of unvaccinated that the total number of infected can give the wrong impression. But the facts are very clear: vaccination drastically reduces an individual’s chance of getting ill; and vaccinating the entire population dramatically reduces the chances of an outbreak.

The reality of vaccines is pretty simple.

• Vaccines are highly effective.
• The diseases that vaccines prevent are not benign.
• Vaccines are really, really safe. None of the horror stories told by anti-vaccine people have any basis in fact. Vaccines don’t damage your immune system, they don’t cause autism, and they don’t cause cancer.
• Not vaccinating your children (or yourself!) doesn’t just put you at risk for illness; it dramatically increases the chances of other people becoming ill. Even when more vaccinated people than unvaccinated become ill, that’s largely caused by the unvaccinated population.

In short: everyone who is healthy enough to be vaccinated should get vaccinated. If you don’t, you’re a despicable free-riding asshole who’s deliberately choosing to put not just yourself but other people at risk.