So, what’s set theory really about?
We’ll start off, for intuition’s sake, by talking a little bit about what’s now called naive set theory, before moving into the formality of axiomatic set theory. Most of this post might be a bit boring for a lot of you, but it’s worth
being a bit on the pedantic side to make sure that we’re starting from a clear basis.
A set is a collection of things. What it means to be a member of a set S is
that there’s some predicate P_{S} – that is, some way of describing things via logic – which is true only for members of S. To be a tad more formal, that means that for any possible object x, P_{S}(x) is true if and only if
x is a member of S. In general, I’ll just write S for both the set and the predicate that defines the set, unless there’s some reason that that would be confusing and/or ambiguous. (Another way of saying that is that a set S is a collection of things that all share
some property, which is the defining property of the set. When you work through
the formality of what a property means, that’s just another way of saying that there’s a
predicate.)
Set theory – as you can see even from the first definition above – is incredibly closely intertwined with first order predicate logic (FOPL). In general, the two can form a nicely closed formal system: sets provide a semantic basis for the logic; and the logic provides a set of tools for talking about sets. That’s a big part of why set theory makes such a good basis for mathematics: it’s one of the simplest things that we can use to create a semantically meaningful complete logic.
So I’m going to run through a quick reminder of the basic notations and concepts of FOPL; for more details, you can look at some posts about FOPL itself here.
In first order predicate logic, we talk about two kinds of things: predicates, and objects. Objects are the things that we can reason about using the logic; predicates are the things that we use to reason about objects.
A predicate is a statement that says something about some object or objects. We’ll write predicates as either uppercase letters (“A”, “B”), or words starting with an uppercase letter (“Married”), and we’ll write objects in quotes. Every predicate is followed by a list of commaseparated objects (or variables representing objects). So, for a few examples:

Dog("spot")
is the FOPL form of the statement “spot is a dog”. 
WorksFor("mark", "google")
is the FOPL form of the statement
“Mark works for Google”. 
Remainder("27","5","2")
is the FOPL form of the statement
“2 is the remainder of dividing 27 by 5”. In general, when we’re using numbers
as objects, we’ll omit the quotes; so we could also write thisRemainder(27,5,2)
.
One very important restriction is that predicates are not objects. That’s why this is called first order predicate logic: you can’t use a predicate to make
a statement about another predicate. So you can’t say something like “Transitive(GreaterThan)
“: that’s a secondorder statement, which isn’t allowed in first order logic.
We can join logical statements together to form more complicated statements: if α
and β are both valid FOPL statements, then α∧β (α and
β) α∨β (α or β), α⇒β (α
implies β, or if α then β), and α⇔β (α if
and only if β). We can also negate statements: ¬α (not
α).
Finally, we have two quantifiers which we can use to create general statements
using variables. We’ll generally write variables as lowercase, nonquoted letters or words.
The two quantifiers are ∀ (for all), and ∃ (there exists).
We can use the quantifiers to make statements like
∀x,y,z:GreaterThan(x,y)∧GreaterThan(y,z)⇒GreaterThan(x,z)
(for
all x, y, and z, if x is greater than y, and y is greater than z, then x is greater than
z.)
The basics of set theory give us a small number of simple things that we can say about sets and their members. These also provide a basic set of primitive statements for
our FOPL:

x∈S
: the object x is a member of S; also the predicateS(x)
is true. 
S⊆T
: S is a subset of T, meaning ∀x, x∈S &implies; x∈T.

S⊂T
: S is a proper subset of T, meaning that
S⊆T ∧ (¬T⊆S)
. 
S = T
: S⊆T ∧ T⊆S
One interesting thing to note about those definitions is that only "∈" is really a primitive: the others are all defined in FOPL in terms of "∈".
There are also a bunch of primitive operations on sets  that is, mappings
from pairs of sets to other sets:

A∪B
: The union of A and B. x∈A∪B⇔x∈A∨x∈B

A∩B
: The intersection of A and B. x∈A∩B⇔x∈A∧x∈B
(Thanks to Nat Whilk for catching a typo here.)

AB
: set difference: x∈AB if and only if x∈A ∧ x∉B.
AΔB
: symmetric difference: x∈AΔB if and only if x∈A∧x∉B ∨ x∉A∧x∈B (Errors corrected after being pointed out in comments.)
Finally, within the most basic set operations, there's a fundamental one called
cartesian product. The cartesian product of two sets S and T, written S×T
consists of a set of objects, containing exactly one unique object for every possible pair
or one element from A and one element from B. If a∈A and b∈B, then we'll call the
object in A×B that pairs a and b (a,b)
. Given any two objects a∈A
and b∈B, we can find the object (a,b)∈A×B, and given any object
z∈A×B, we can find the unique objects a∈A and b∈B such that
(a,b)=z.
Within this, we can talk about relations: given two sets A and B, a relation R is a subset of A×B. We'll often write that R(a)=b if (a,b)∈R. A function is a relation where for every object a∈A, there is at most
one object b∈B such that R(a)=b.
There are a number of special kinds of functions, some of which are really important. Given a function F from set A to set B:
 If for every a∈B, there is exactly one b∈B such that F(a)=b, then
F is a total function.
 If for every b∈B, there is exactly one a∈A : F(a)=b, then
F is called a surjection or onto function.
 If F is both total and onto, then F is called an isomorphism.
Two sets A and B have the same size or cardinality if and only if there exists an isomorphism between A and B. In some sense, any two sets with an isomorphism can be considered to be equivalent sets.
You mean “bijection”, not “isomorphism”.
And a “total” function is also called an “injection” or an “into” function.
The word “interaction” should be “intersection”.
The symbol you’re using for symmetric difference is nonstandard and different from the symbol you used in your similar post “Basics: Sets” from earlier this year.
Anonymous: In the category of sets bijections are isomorphisms.
Mark, I think your definitions of “total” and “surjective” are broken, but in different ways. Anon#1, I don’t think what Mark meant to give as the definition of “total” is equivalent to “injective”, but I can see how you got that impression.
In my world …
0. A relation on AxB that never contains both (a,b1) and (a,b2) is a “partial function from A to B”.
(Mark used the term “function”. Any partial function is a function, but I wouldn’t say it’s “from A” unless it’s total on A.)
1. A (partial) function from A to B is “total” if for all a in A there’s at least one b in B with F(a)=b.
(Mark said “for all a in B”, obviously a typo; he said “exactly one b”, which is in fact equivalent to “at least one b” in view of the definition of a function.)
2. A function from A to B is “onto” or “surjective” if for all b in B there’s at least one a in A with F(a)=b.
(Mark said “exactly one”. With the presupposition of totality that I’m making, that would be the definition of a bijection.)
3. A function from A to B is “onetoone” or “injective” if for all b in B there’s at most one a in A with F(a)=b.
(Mark didn’t mention this one, having sort of folded it into his definition of “onto”/”surjective”. Anon#1 calls these “into”, which I don’t much care for.)
4. A total function that’s both injective and surjective — i.e., for each a there’s exactly one b, and for each b there’s exactly one b, with F(a)=b — is “bijective”; such a function is called a “bijection”, and (pace Anon#1) it’s perfectly reasonable, albeit slightly eccentric, to call it an “isomorphism”. (An isomorphism in the category of sets, as aficionados of abstract nonsense would put it. They’d also say “epimorphism” instead of “surjection” and “monomorphism” instead of “injection”. But then, they also think “What’s inside a coconut? A cocokernel” is hilariously funny, so screw ’em.)
Has Mark changed his notation for symmetric difference since Nat posted? What he’s using now looks standard to me.
Operations like union and intersection aren’t really primitive, precisely because they can be given definitions (expressible in FOPL) like the ones Mark has given. (That isn’t true about the cartesian product, though it’s customary to implement that in terms of *another* primitive operation.)
Oops: I meant “and for each b there’s exactly one a” under #4. And, er, I hope John Armstrong doesn’t take my remarks about aficionados of abstract nonsense too seriously. I wouldn’t want him to come after me with a pointed stick. (A pointed stick is a stick equipped with an arrow from its terminal object. Scary.)
g:
Yes, I fixed the syntax glitch that Nat pointed out – he was right.
WRT your other comments, I’ll get to them sometime tomorrow. This has been a horrible day, and if I try to correct anything now, I’ll likely screw it up even more.
In your list of set operators you have:
4. AΔB: symmetric difference: x∈AΔB if and only if x∈A∧x∉B ∨ x∉A∧x∈A
As I read that, that’s saying a symmetric difference exists if and only if x is in A and x is not in B OR x is not in A and x is in A – a meaningless statement? It would make more sense as:
4. AΔB: symmetric difference: x∈AΔB if and only if x∈A∧x∉B ∨ x∉A∧x∈B
(the last A changed to a B.)
Mark, I wanted to thank you for an excellent post.
I’ve never looked at set theory before, but your clear and concise introduction has already helped me see the mathematics I have already learned in a new light. For instance, in “basic” math, an isomorphism would appear to be a continuous function with an inverse (even more basic: passes the horizontal and vertical line tests.)
I’ll never look at functions the same way again.
Mark, I’m sorry to hear you’ve had a horrible day. I hope tomorrow goes much better.
Shalev,
You’re right about the inverse, but not about “continuous”. There can be gaps and jumps. For example the function that maps every real number that’s the same distance to the next highest integer, but in the other direction, so 0 > 1, 0.25 > 0.75, 0.5 > 0.5, 0.75 > 0.25, 1 > 2, 1.25 > 1.75, 1.5 > 1.5, and so on. It’s a bijection (invertible), but not continuous.
There are some other even *more* pathological examples out there – functions that are bijections but aren’t continuous over *any* interval.
Wouldn’t 0 map to 2 in that definition?
Mark, let’s hope tomorrow is a better day.
Thanks for another needed basics post, polishing up some things that should be obvious, but apparently isn’t. For myself, I thought quantifiers where outside FOPL, since they speak of properties of predicates (“for at least one/all object(s) of this predicate holds”), albeit indirectly. So the “predicates aren’t objects” (no recursivity, do’h!) was clarifying.
Apparently everyone but me gets the definition of “proper subset”. My difficulty is your inclusion of objects x ∈ T and x ∉ S. That is btw the Wikipedia definition: “If A is a proper subset of B, then there exists at least one element x of B which is not an element of A.” ( http://en.wikipedia.org/wiki/Subset ))
That would be done by ¬T⊆S. Since (¬T)⊆S doesn’t seem to be fulfilled by any objects x ∉ T while x ∈ T, I assume that it is really ¬(T⊆S). So we have not that for ∀x: x ∈ T ⇒ x ∈ S. So presumably we have ∃x: (x ∈ T) ∧ (x ∉ S), the very objects we wish for.
Is that it? Should the last expression in definition of proper subset be ¬(T⊆S)? Or did I mess up somewhere?
Torbjön:
“¬” is a logical operation applied to a predicate, not an operation on a set. So “¬S” is not a meaningful statement – “¬” doesn’t apply to sets. On the other hand, “S⊆T” is a predicate statement. So “¬S⊆T” means “¬(S⊆T)”.
I never could get the difference between first order predicate logic and second order. And that’s after reading the Wikipaedia articles on the topic. But your single example (Transitive(GreaterThan)) turned on the lights for me.Thanks!
g: Thanks for the pointedstick definition. I’ll have to use that one in the future.
g: Thanks for the pointedstick definition. I’ll have to use that one in the future.
g: Thanks for the pointedstick definition. I’ll have to use that one in the future.
whoops.. mental note: do not try to scroll through the page after hitting “post” until the page reloads…
OK. I assumed that it was your shorthand for the set complement, from the negation of the predicate.
Good post. I was under the mistaken impression that technically a function was any relation (but that the common usage was the actual definition and a relation was something else.)
One error. The link to the FOPL post actually just links back to this page.
g wrote:
Actually George Boole who invented naive set theory did exactly the opposite. He took his algebra “naive set theory” with union and intersection as primitive inorder to deduce his attempt at a symbolic logic, basically FOPL without quantification, from it.
an excellent post. through an entire math major i had never really understood functions as a set of ordered pairs. it took a graduate degree in linguistics to make me see functions that way.
but there’s another typo. in your definition of a total function, i think you want a to be an element of set A, not B.
I’m having trouble seeing your symbols (for conjunction, disjunction, etc.); they are showing up as undifferentiated squares. My guess is that my IE settings are lacking. Any suggestions about how to fix this?
mgarelick:
The only advice I can give is try firefox or opera. The math symbols that I’m using are part of the HTML extended character set, which *should* be supported by any modern browser.
The computers that I have access to for testing are either MacOS or Linux; and I’ve tested the math characters on both using firefox, safari, opera, camino, and konqueror.
Everything’s fine in firefox. I’m excited about your set theory posts, because I just struggled through the first part of Principles of Mathematics, with just enough understanding to know there’s something there I’m not understanding.
Poor Mark, just when you thought you posted some easy basics, the typing gremlin strikes hard. But my question is about your essentialism:
As I recall, the elements of S need not have anything in common except being elements of S. Is epsilon (∈) a first order predicate?