Cantor showed in effect that for every set *X*, the power set
℘*X* consisting of all subsets of
*X* is a strictly bigger set. It is easy to exhibit an
injection (one-to-one function) from *X* to ℘X: just map each *x* in *X*
to the singleton {x}. What Cantor showed is that there cannot exist an
injection *f*: ℘X →
*X* in the other direction: every *f* of this form must
send two subsets of *X* to the same element of *X*.

One version of Cantor's proof goes as follows. Suppose to the
contrary that *f*: ℘X
→ *X* is an injection. Let *Z*
= {*f*(*Y*)| *f*(*Y*) ∉
*Y*}. Then *f*(*Z*) ∉ *Z* if and
only if *f*(*Z*) ∈ *Z*, a contradiction,
whence no such injection can exist. (Injectivity of *f*
is needed for the "if" direction: without injectivity we could
have *f*(*Z*) ∈ *Z* on account of some
*Y* for which *f*(*Y*) = *f*(*Z*)
but *f*(*Y*) ∉ *Y*.)

A set is called *countable* when it can be put in bijection with
a set of natural numbers. In particular the set ℕ of all natural
numbers is itself countable, being in bijection with itself via the
identity function. Cantor's theorem shows that the set ℘ℕ of all sets of natural numbers cannot be
put in bijection with any set of natural numbers or we would have an
injection from ℘ℕ to ℕ. So
℘ℕ is uncountable.

Cantor's proof can be carried out in Zermelo's formalization of set theory. This formalization is expressed using variables, quantifiers, logical connectives (AND, OR, NOT), and parentheses, all of which are assumed to have their standard meanings, along with a binary relation symbol ∈ for set membership. This last symbol is the only one whose interpretation is constrained by Zermelo's axioms

A model of a theory such as Zermelo's consists of a set of elements and
an interpretation of each nonlogical symbol in the language of the
theory. Since ∈ is the only nonlogical symbol in Zermelo's
axiomatization, it follows that a *model* of Zermelo's axioms is
a set *U* and a binary relation *E* on *U*, that
is, a subset of *U*×*U*. *U* is the
universe of sets in the model. *E* constitutes the
interpretation of the symbol ∈ in this model. We can picture
*E* as a square (*U*×*U*) matrix of 0's and
1's. For any two elements *X*,*Y* of *U*, there
is a 1 in row *X* and column *Y* of the matrix *E*
just when *X* is a member of *Y*.

System **Z**, Zermelo's axioms, imposes the following constraints
on *E*.

1. *Extensionality*. If two sets have the same elements then they
are the same set. In terms of the matrix *E* this says that
all columns are distinct.

2. *Pairing*. For all *X*,*Y* there exists
{*X*,*Y*}. This says that *E* must contain every
possible column having one or two 1's (the case of a single 1 arises
when *X* = *Y*).

3. *Comprehension*. This says that for every set *X*
and every first order predicate expressible in the language of
Zermelo's axioms there exists a set consisting of those elements of
*X* satisfying that predicate. That is, given a column (not
necessarily in *E*) definable by some predicate on *U*,
its intersection with any column of *E* must be a column of
*E*. In particular the false predicate defines the column of
all zeros, the intersection of which with any column is zero. This
corresponds to the empty set, which therefore belongs to *U*.

4. *Union*. For any set *X* there exists a set
∪*X* consisting of the members of the members of *X*.
That is, *E* must include all the columns of the relation
*E*², the composition of *E* with itself. It then
follows from pairing that the columns of *E* are closed under
binary disjunction: X∪Y = ∪{X,Y}. It further follows that
every possible column with a finite number of 1's must be present in
*E*, with comprehension supplying the empty set.

5. *Power set*. For all *X* there exists ℘X such that for all *Y* in
*U*, *Y* ⊆ *X* if and only if *Y*
∈ ℘X. In *E*,
℘X is that column with a 1 in
row *Y*, for any *Y* in *U*, just when column
*X* has a 1 in every row that column *Y* does.

6. *Foundation*. Any nonempty set *X* must contain an element
*Y* containing no elements of *X*. It follows that no
set can be a member of itself. For if *Y* is such a set then
the set *X* = {*Y*} is a nonempty set every element
of which contains an element of {*Y*}, since {*Y*}
has just the one element *Y* which contains *Y* as
a member. In terms of *E*, the diagonal of *E*, those
entries at row *X* and column *X* for all *X*
in *U*, must be zero. It follows that there cannot exist a
column of all ones, that is, *U* cannot be a set since that
would make it a member of itself.

7. *Infinity*. There is a set ℕ containing the empty set
0 and closed under successor *s*(*X*) = *X*
∪ {*X*}. By foundation the sets 0,s(0),s(s(0)),…
must all be distinct. Hence ℕ must include all these sets,
and therefore must be infinite. In terms of the matrix *E*,
column ℕ has infinitely many ones.

These seven axioms constitute Zermelo's axiomatization of set theory,
which is therefore denoted **Z**. There is an eighth axiom called
Replacement due to Fraenkel, defining **ZF** for Zermelo-Fraenkel.
Adding a further axiom called Choice then defines **ZFC**,
Zermelo-Fraenkel with Choice. However neither of these axioms change
the nature of Skolem's paradox and we can therefore ignore them and
stick to axiom system **Z**.

Now these axioms are all expressed in first order logic. While it is
an open question whether **ZF** has a model, it is straightforward
to produce a model of **Z**. What can such a model look like?

The Löwenheim-Skolem theorem concerns models of first-order or elementary theories. In modern language it says that any structure with a countable language has a countable elementary substructure. In this context "elementary" means that the substructure satisfies every first-order sentence (in the specified language) that holds of the structure.

Leopold Löwenheim first proved the theorem in 1915. In 1920
Thoralf Skolem greatly simplified Löwenheim's proof by using
what we now call *Skolem functions*. In a sentence such as
∀*x*∃*y*&phi(*x*,*y*),
noting that the choice of *y* depends on *x*,
we can make the choice concrete by rephrasing the sentence as
∀*x*&phi(*x*,*f*(*x*)) where
*f* is the Skolem function that chooses *y* appropriately
given *x*. Given any consistent first-order theory expressed
in a countable language, Skolem showed how to construct a countable
model of that theory from the countably many terms expressible in
the language augmented with these Skolem functions.

Since **Z** has a model, and its language is not only countable but
finite, having only the one nonlogical symbol ∈, it follows from
the Löwenheim-Skolem theorem that **Z** has a countable model.

Skolem's paradox asks, how can a theory such as **Z** that proves
the existence of an uncountable set have a countable model?

Although the axioms of set theory pin down exactly much of the
behavior of sets, there is one intuitively obvious property that
is missing, namely that of closure under taking subsets. In any
model (*U*,*E*) of **Z** it is natural to expect
that any column obtainable from a column of *E* by changing
some of its ones to zeros would itself be a column of *E*.
However there is no way of saying this. Furthermore if there were it
would contradict the Lowenheim-Skolem theorem, since if *E*
contained a column with infinitely many ones it would then have to
have uncountably many columns, making the model uncountable.

It is these missing columns that allow Skolem's paradox. In the
external view of a countable model of **Z**, both ℕ
and ℘ℕ must be countable.
Hence there *does* exist an injection *f*: ℘ℕ → ℕ, visible to us
external observers. We can then use that injection as in Cantor's
proof to construct the set *Z* = {*f*(*Y*)|
*f*(*Y*) ∉ *Y*} as a subset of ℕ.
The proper conclusion at this point is not the contradiction Cantor's
argument depends on, but instead that *Z* is one of the
missing columns.

Does this undermine Cantor's argument? Well, this is a very nice question for philosophers, who might want to argue against first order logic on this ground. Why should we trust theorems when for all we know they are referring to a defective model, or at least a different model from the one we had in mind?

Here are two competing resolutions of Skolem's paradox, both of which preserve the soundness of Cantor's argument. The first is a commonly suggested syntactic solution, the second a somewhat less commonly suggested semantic one, the essential feature of which is that the number of sets in the model can vary but the sets themselves cannot vary with respect to either their members or their subsets.

*The Logical Scenario* One resolution of Skolem's
paradox starts from the observation that Skolem's observation only
becomes paradoxical when we mix concepts from different models.
Our claim above that ℘ℕ and ℕ
are countable in a countable model of **Z** is made in another model
of set theory, one in which the universe *U* of the countable
model is assumed to be a set in order to talk about it as a model.
Since *U* is provably not a set in the countable model,
already we have an inconsistency between these two models of set
theory, so it is not surprising to find other inconsistencies such
as that implied by Skolem's paradox.

In particular the injection we found from ℘ℕ to ℕ was an object of the external model. The proper interpretation of Cantor's theorem is that, in any model of set theory, there can be no such injection. And indeed when we examine all the sets of the countable model, none of them represent an injection from ℘ℕ to ℕ. Invoking a function from the external view of the model is like using "pain" to mean "hurt" in France.

In the logical or syntactic scenario, the only sets that come up
for discussion are those namable in **Z**. These have certain
properties that are provable in **Z**. In particular each of
those sets arising somehow as an infinite power set is uncountable.
Those properties do not change from model to model. What *can*
change between models is the number of elements of a particular named
set from the point of view of an observer of the model, who may find
uncountably many such elements in one interpretation of that name
and countably many in another. No significance is attached to these
external interpretations however.

*The Platonic Scenario* A competing resolution is
to regard these countable models of **Z** as clearly nonstandard
and not what we had mind by the axioms of set theory. Rather we were
thinking of a model that is closed under taking subsets. That is,
any column obtainable from a column of *E* by replacing
some of its ones by zeros must itself be a column of *E*.
Call such a model *standard*.

Unfortunately there is nothing we can say in the language of **Z**
that forces this degree of standardness. If we say "for all *Y*
⊆ *X*" the *Y*'s that we quantify over are merely
those already in the model: pointing at them cannot make the missing
ones appear. If we try to manufacture subsets of *X* using the
axiom of comprehension to delete elements from *X*, there are
only countably many subsets we can produce in that way, since **Z**
has only countably many predicates.

What we can do however is step gingerly outside the bounds of first
order logic and insist on interpreting **Z** only with models that
are standard in this sense of being closed under taking subsets. The
internal and external notions of ℘X
then coincide: every subset of *X* visible to an external
observer arises as an element of *U*. Hence if column
℘X has infinitely many ones it has
uncountably many ones. The fact that there exist nonstandard models
of **Z** in which all sets are (externally) countable is then just
a curiosity of our first order axiomatization of set theory.

This requirement of standardness makes the internal sets behave more
like the external sets without however making them exactly the same.
In particular the universe *U* of any model of **Z**,
standard or not, is a set in the external view of the model but not
an internal set of the model (the axiom of foundation prevents any
column from containing all ones, by forcing the diagonal of *E*
to be zero). We can account for this by regarding our external view of
any model of **Z** as being defined in a larger model of **Z**,
one that has additional larger sets including the set of all elements
of the model. We can assume that both universes are standard.

This notion of standard model provides a nice middle ground which bars the most egregiously small models of set theory while on the other hand not requiring them to be so large as to run any risk of inconsistency.

But wait. What if the external view of set theory by which we judge
the standardness of models of **Z** is itself not standard?
Could there be a yet more external view that reveals previously
unrecognized subsets of some set *X*? Somehow we'd like our
notion of standardness to rule this out: we want every subset of
*X* that could possibly exist.

So our notion of standardness should come with some guarantee of maximality. No external view of a standard model should reveal subsets missing from the supposedly standard model.

And so on. One can hold forth at much greater length about Skolem's paradox. Timothy Bays's Ph.D. thesis Reflections on Skolem's Paradox in particular brings more than a hundred pages of high-powered model theory to bear on the topic, at a much higher level than we have engaged here. This is tightened up to 32 pages in its published version The Mathematics of Skolem's Paradox.

Also on the web is an extract from A.W. Moore's article Set Theory, Skolem's Paradox and the Tractatus casting the issue in terms of relative vs. absolute uncountability and the ill-definedness of the scope of the concept of set. It seems to me that Moore, like Bays, turns Skolem's paradox into too complicated a story. Both my scenarios, the logical and the Platonic, show how one can have one's cake and eat it too, by taking countability to be an absolute concept without however committing oneself to a fixed scope for the concept of set.

To vary the scope in the logical scenario, allow any model of **Z**
but always interpret uncountability internally. Since the properties
of namable sets are established purely logically, independently of
the choice of model, varying the model cannot change those properties.

To vary the scope in the Platonic scenario, by all means bring in new sets, but in doing so don't add any new elements or subsets to already established sets. In this way varying the scope of set theory cannot vary the countability or otherwise of any one set.

The point of the present essay was not so much to suggest one option
over another as to expose the underlying logic of Skolem's paradox
by showing in as simple a way as possible how **Z** can entail
the existence of infinite sets while still having countable models,
and to point out two simple resolutions of the paradox, the logical
scenario and the Platonic scenario. As to what else can or should
be done about the paradox, let a hundred theses bloom.

Vaughan Pratt