Some time ago I became interested in group theory for two different reasons: first, because of the role it plays in defining the unique properties of some common sets of numbers in mathematics, and second, because of the role it plays in physics. The following is an attempt to describe the train of thought involved in the first topic (the unique properties of some important number sets), but I end up with some comments on the second as well. My presentation here is oversimplified and non-rigorous; any readers who already know about this stuff will probably think I'm stating the obvious and doing it pretty clumsily. If you're interested in a more in-depth discussion of what I'm only scratching the surface of here, try this site; the "fundamentals" section talks about the main sets I discuss in this article but with more mathematical background and rigor.

The concept of a "group" will seem pretty abstract when we look
at its precise mathematical definition below, but the idea behind
it is actually quite simple. Mathematics has to deal with a wide
variety of objects and sets of objects, and it's useful to
make this as easy as possible by coming up with ways in which
different objects and sets of objects are similar, and then
abstracting out the similarity and making it into some type of
logical system. For example, the set of natural numbers, N,
or {0, 1, 2, 3, ...} is an abstract structure that was
invented to capture the similarities between all the different
sets of things that can be counted. Once we have this set, and
understand the logic underlying its abstract structure, we can
deduce propositions that will apply to *anything* that
can be described by that abstract structure; we don't have
to know what kind of thing it is, or even whether it actually
exists. The same goes for all of the abstract structures studied
in mathematics. (This is why Bertrand Russell once
described mathematics as "the subject in which we do not know what
we are talking about, nor whether what we are saying is true.")

Once you've been doing mathematics for a while, you find that there are lots of these different abstract sets lying around, and sooner or later you realize that it's time to take this process of abstraction to the next level. That is, you start to realize that there are abstract similarities between different abstract structures, and you start studying the logic underlying the abstract structure that is common to all these different abstract structures. (Whew!) Group theory is an example of this "second-order" study of abstract structures; it's a way of axiomatizing properties that many important abstract structures (such as, for example, the set of integers Z, the set of rational numbers Q, the set of real numbers R, and the set of complex numbers C) have in common, so that they can be studied independently of the specific sets themselves.

Historically, group theory arose in the 19th century in the course of studying the roots of polynomial equations of various degrees. The particular question at the time was whether there could be a general procedure for finding roots of polynomials of the fifth degree. You probably remember from algebra class the general procedure for finding the roots of quadratic (second degree) polynomials; it is the quadratic formula:

*x* = ( -*b* ± √(*b*^{2} - 4*ac*) ) / 2*a*

where the coefficients come from the general quadratic
equation, *ax*^{2} + *bx* + *c* = 0.
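As a concrete illustration (my own, not from the original article), here is a minimal Python sketch of the quadratic formula; using a complex square root means it also handles the case *b*^{2} < 4*ac*:

```python
import cmath

def quadratic_roots(a, b, c):
    """Return the two roots of a*x^2 + b*x + c = 0 (complex if needed)."""
    disc = cmath.sqrt(b * b - 4 * a * c)  # imaginary when b^2 < 4ac
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

# x^2 - 5x + 6 = (x - 2)(x - 3): roots 3 and 2
r1, r2 = quadratic_roots(1, -5, 6)
print(r1, r2)  # (3+0j) (2+0j)
```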

It turns out that this kind of thing also works with cubic and quartic (third and fourth degree) polynomials, but it doesn't work for polynomials of 5th degree and higher. The reason turns out to be connected to the symmetry properties that the set of solutions has to have: in other words, the permutations of the solutions form a group (now called the *Galois group* of the equation), and it can be shown that for the general fifth degree equation this group can't possibly be the same as the group of symmetries of quantities that can be constructed using the operations that are used to construct the formulas for solving equations of degrees lower than the fifth. (If you're interested, there's a good discussion that gives more detail--and by someone who's an expert in the field, unlike me--on John Baez's web site.)

For a century or so after it was invented, group theory seemed to be one of those pieces of higher mathematics that is beautiful, elegant, and totally unrelated to the real world. (How often does the average engineer need to find solutions to fifth degree or higher polynomials?) However, in the twentieth century, with the development of relativity and quantum mechanics, it turned out that there were various symmetry groups that could be found in nature itself, and understanding them was of crucial importance for the advance of physics. This is often cited as an example of what the physicist Eugene Wigner called "the unreasonable effectiveness of mathematics" in science--some abstract structure that mathematicians discover in the course of investigating some esoteric mathematical problem turns out to be intimately involved in physics.

For now, we are going to stick to the basics, and look at the fundamental axioms and properties of groups. Then we'll take a look at the sets I mentioned above, and see how their properties exemplify various more specific types of groups. (At the end, for "extra credit", I'm going to return briefly to the topic of symmetry groups that are observed in nature.)

**Definition**. A *group* is a set G and a binary operation *: G x G -> G acting
on members of G (usually called *elements* of the group), which together satisfy
the following (these are often called the *group axioms*):

(1) **Closure**. G is *closed* under *, that is: if *a* * *b* = *c*
and *a*, *b* are elements of G (we shall henceforth write "in G" for this),
then *c* is in G.

(2) **Associativity**. The * operation is *associative* over G, that is: if *a*,
*b*, and *c* are in G, (*a* * *b*) * *c* = *a* *
(*b* * *c*).

(3) **Identity**. There is an element *e* of G, called the *identity*,
such that, for all *a* in G, *a* * *e* = *a*.

(4) **Inverse**. Every element *a* of G has an *inverse*, denoted
*a*^{-1}, which is also in G and satisfies
*a* * *a*^{-1} = *e*.

**Notation**. While the operation * may be anything which applies to the elements of
the set G, the two most common operations are addition (customarily written
*a* + *b* for *a*, *b* in G) and multiplication (customarily written
*ab*). Note that the group operation, whether it is addition, multiplication, or
something else, is not necessarily commutative, so the order in which *a*, *b*
are written matters. Note also in this connection that I have stated the axioms above
with only one ordering. This really only makes a difference with the last two; technically,
the axioms only specify that *e* is a "right identity" (because it's on the right in the
axiom) and that *a*^{-1} is a "right inverse" (for the same reason). It can be
shown from the axioms as I have given them that *e* is also a left identity and that
*a*^{-1} is also a left inverse, even in a group where the group operation
is not always commutative, but I won't do it here. Groups whose operation is always
commutative (i.e.,
*a* * *b* = *b* * *a* for all *a*, *b* in G) are called
*abelian* groups.

**Finite Groups**. A group is *finite* if its underlying set G has a finite
number of elements. The number of elements is called the *order* of G, denoted |G|.
We won't be looking much at finite groups in this article since the sets we'll be focusing on
(the integers, the reals, etc.) are all infinite sets, but I wanted to mention them because
I'll give a couple of examples of them below (just to show that the ideas of group theory
don't just apply to infinite sets).
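To make the axioms concrete, here is a Python sketch (my own illustration, not from the article) that brute-force checks the four group axioms on a finite set. It confirms that Z_5 under addition mod 5 is a group, and that the nonzero elements of Z_5 form a group under multiplication:

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the four group axioms on a finite set."""
    elems = list(elements)
    # (1) Closure
    if any(op(a, b) not in elems for a, b in product(elems, repeat=2)):
        return False
    # (2) Associativity
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elems, repeat=3)):
        return False
    # (3) Right identity
    ids = [e for e in elems if all(op(a, e) == a for a in elems)]
    if not ids:
        return False
    e = ids[0]
    # (4) Right inverse for every element
    return all(any(op(a, x) == e for x in elems) for a in elems)

print(is_group(range(5), lambda a, b: (a + b) % 5))       # True
print(is_group([1, 2, 3, 4], lambda a, b: (a * b) % 5))   # True
print(is_group(range(5), lambda a, b: (a * b) % 5))       # False: 0 has no inverse
```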

**Scalar Multiplication**. For any additive abelian group G (i.e., the group
operation is addition, and it is commutative, *a* + *b* = *b* + *a*
for all *a*, *b* in G), another operation can be constructed as follows:
*a* + *a* + ... (*n* times, where *n* is a positive integer) is called
*scalar multiplication* of *a* by *n*, and is written *na*. Note
that this is *not* the same as "group multiplication", the multiplication of one
element of a group by another, which we will discuss next.

**Rings**. A *ring* is an additive abelian group which also has a
multiplication operation defined that satisfies (1), (2), and (3) above, but not
necessarily (4), and which is distributive over addition. That is, the following hold
for a ring R:

(1) R is closed under addition (if *a* + *b* = *c* and *a*,
*b* are in R, *c* is in R);

(2) Addition is associative over R;

(3) R has an *additive identity*, 0, such that *a* + 0 = *a* for
all *a* in R;

(4) Every *a* in R has an *additive inverse* in R, *-a*, such
that *a* + *-a* = 0;

(5) **Commutativity of Addition**. For all *a*, *b* in R,
*a* + *b* = *b* + *a*;

(6) R is closed under multiplication (if *ab* = *c* and *a*,
*b* are in R, *c* is in R);

(7) Multiplication is associative over R;

(8) R has a *multiplicative identity*, 1, such that *a*1 = *a* for
all *a* in R;

(9) **Distributivity of Multiplication over Addition**. For all *a*,
*b*, *c* in R, (*a* + *b*)*c* = *ac* + *bc*, and
also *a*(*b* + *c*) = *ab* + *ac*.

(Food for thought: if every element *a* has to have an additive inverse *-a*,
what is the additive inverse of 0?)

Technically, the above axioms say that R is an abelian group with respect to addition,
but it is only a *monoid* with respect to multiplication. (Note: Some definitions
I've seen only have a ring obeying the first two group axioms for multiplication, not the
first three--under this definition rings have no multiplicative identity. A ring that also
has a multiplicative identity is then called a "unit ring". This would make a non-unit ring
only a "semi-group" with respect to multiplication; lack of an identity element is what
differentiates a semi-group from a monoid. Since all of the
rings we'll be looking at here do have a multiplicative identity, I'm going to ignore
this technicality for the rest of this article.)

The best-known example of a ring is, of course, the set of integers, usually denoted
by Z. In fact, since multiplication is also commutative over Z, Z is an *abelian*
or *commutative* ring. There are also, of course, non-commutative rings; for example,
the set of all *n* x *n* matrices with integer entries for a given *n* ≥ 2,
with the normal matrix addition and multiplication operations.

(Food for thought: what about the set N of natural numbers? Is it a ring? Is it a group?)

There are also many finite subsets of the integers which are rings: in fact, the set of
integers modulo *n*, where *n* is any integer greater than 1, form a ring.
(The technical name
for this type of set is Z_{n}.) For example, consider Z_{3} = {0, 1, 2}; it's
closed under addition and multiplication (because 1 + 2 = 0, since 3 is 0 modulo 3, and 2 + 2
is 1--multiplication should be obvious because 2 x 2 is the same as 2 + 2), and you can see
that all the other ring axioms are
also satisfied. (We'll look at Z_{3} again in a moment because in fact it also
satisfies a couple of other important axioms, which we'll see next.)
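The Z_3 arithmetic just described is easy to tabulate. The following Python sketch (illustrative only) builds the addition and multiplication tables mod 3 and checks distributivity exhaustively:

```python
n = 3
Z3 = range(n)
add = {(a, b): (a + b) % n for a in Z3 for b in Z3}
mul = {(a, b): (a * b) % n for a in Z3 for b in Z3}

print(add[1, 2], add[2, 2])  # 0 1  (1 + 2 = 0 and 2 + 2 = 1 mod 3)

# Distributivity (a + b) * c == a*c + b*c, checked over all triples:
ok = all(mul[add[a, b], c] == add[mul[a, c], mul[b, c]]
         for a in Z3 for b in Z3 for c in Z3)
print(ok)  # True
```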

**Division Rings**. A special type of ring is the *division ring*, which is a
ring in which every nonzero element also has a multiplicative inverse. That is, a division ring R, in addition to
satisfying (1) through (9) above, also satisfies the following:

(10) Every *a* in R\0 (the set of all elements of R except 0) has a
*multiplicative inverse* in R\0, *a*^{-1}, such that
*aa*^{-1} = 1.

(Food for thought: what is the multiplicative inverse of 1? Should be easy if you answered the analogous question for addition above.)

Note that the integers Z are *not* a division ring (why not?). To get a division
ring, we must *extend* the underlying set so that each element has a multiplicative
inverse satisfying (10). Equivalently, we must form the *closure* of Z under the
operation of division, the inverse of multiplication. (The integers Z
themselves are the closure of the set N of natural numbers under subtraction, the inverse
of addition.) The result when we so extend Z is the set Q of *rational numbers*,
which are all numbers that can be expressed as a ratio of integers, *p* / *q*.
We note that multiplication is commutative over Q, so that it is actually a more specific
type of division ring, called a *field*.

**Fields**. A *field* is a division ring for which multiplication is
commutative; that is, in addition to (1) through (10), a field F satisfies the following:

(11) **Commutativity of Multiplication**. For all *a*, *b* in F,
*ab* = *ba*.

Several well-known sets in mathematics are fields: the set Q of rational numbers,
mentioned already, the set R of real numbers, and the set C of complex numbers. All of
these fields are *unique* in the sense that *any* field which has the particular
set of properties that makes them unique must be isomorphic to them. (I won't define the
term *isomorphic* explicitly here, but intuitively, two sets are isomorphic if there
is a complete one-to-one correspondence between them that "preserves structure"--leaving
that phrase undefined for now, but that gives you an idea of what's going on.) Most of the
rest of this article will be devoted to examining the special properties of each of these
fields that make them unique.

All of these sets are infinite, but just as with groups and rings above, there are also
finite fields. In fact, the set Z_{3} that we looked at above is an example: it should
be obvious that multiplication is commutative, and if you think about it a little, you will
see that each nonzero element also has a multiplicative inverse (which is the key property
that makes it a division ring instead of just an ordinary ring). In fact, any set Z_{n}
where *n* is a prime number is a field. (What about Z_{n} when *n*
isn't prime? These aren't fields because some elements--in fact, any which share a
common factor with *n*--will not have multiplicative inverses, because they are "zero divisors": they
give zero when multiplied by some member of the set, since *n* is zero modulo
*n*. You should be able to convince yourself
easily that a zero divisor can't have a multiplicative inverse--though note that the converse
is not true: there can be rings which have no zero divisors but which also do not have
multiplicative inverses for all elements. A commutative ring with no zero divisors is
called an *integral domain*, because the integers Z are the best-known example of
one--and they also illustrate that there can be integral domains which are not division
rings, as I stated just now.)
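A short Python sketch (illustrative, not from the article) makes the contrast concrete: every nonzero element of Z_5 has a multiplicative inverse, while the zero divisors of Z_6 have none:

```python
def mult_inverses(n):
    """Map each nonzero element of Z_n to its multiplicative inverse, if any."""
    return {a: next((b for b in range(1, n) if (a * b) % n == 1), None)
            for a in range(1, n)}

print(mult_inverses(5))  # {1: 1, 2: 3, 3: 2, 4: 4} -- every element invertible
print(mult_inverses(6))  # {1: 1, 2: None, 3: None, 4: None, 5: 5}
# 2, 3, 4 are zero divisors mod 6 (e.g. 2 * 3 = 0 mod 6), so none has an inverse.
```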

**Ordering**. A *partial ordering* on a set S is a relation ≼ satisfying the
following properties:

(i) **Reflexivity**: *a* ≼ *a* for all *a* in S;

(ii) **Antisymmetry**: If *a* ≼ *b* and *b* ≼ *a*, then
*a* = *b* for *a*, *b* in S;

(iii) **Transitivity**: If *a* ≼ *b* and *b* ≼ *c*, then
*a* ≼ *c* for *a*, *b*, *c* in S.

Note that this set of properties is very similar to the three properties satisfied
by an *equivalence relation*; the only difference is property (ii), which for
equivalence relations becomes the property of **symmetry** (if *a* ~ *b*,
then *b* ~ *a*). The antisymmetry property is what makes the relation an
*ordering* relation rather than an equivalence relation.

Note also that there is no guarantee that for *every* pair of elements *a*
and *b* of S, at least one of *a* ≼ *b* or *b* ≼ *a* is true.
There may be some pairs of elements for which neither is satisfied (such pairs of elements
are called *incomparable*). This is why the relation ≼ is called a *partial*
ordering. If it *is* true that every pair of elements can be ordered, then the
relation is called a *total ordering* of S--this is sometimes expressed as a fourth
property which is satisfied by totally ordered sets (but not partially ordered sets):

(iv) **Trichotomy**: For all *a*, *b* in S, at least one of
*a* ≼ *b* or *b* ≼ *a* is true.

This fourth property is called "trichotomy" (even though as I've stated it here it
involves only two "options") because it is possible that both *a* ≼ *b* and
*b* ≼ *a* are true (which is then the third "option"), in which case by
antisymmetry *a* = *b*.
Sets which have a total ordering are called (reasonably enough) *totally ordered* sets.
(The term "ordered sets" is often used to mean
totally ordered sets, but it is sometimes used to include partially ordered sets as
well.)
Sets which have a total ordering are called (reasonably enough) *totally ordered* sets.
(The term "ordered sets" is often used to mean
totally ordered sets, but it is sometimes used to include partially ordered sets as
well.)
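A classic example of a partial ordering is divisibility on a set of positive integers. This Python sketch (my own illustration) verifies the three axioms on a small set and lists its incomparable pairs:

```python
from itertools import product

S = [1, 2, 3, 4, 6, 12]
rel = lambda a, b: b % a == 0  # "a divides b"

reflexive = all(rel(a, a) for a in S)
antisym = all(a == b for a, b in product(S, repeat=2)
              if rel(a, b) and rel(b, a))
transitive = all(rel(a, c) for a, b, c in product(S, repeat=3)
                 if rel(a, b) and rel(b, c))
incomparable = [(a, b) for a, b in product(S, repeat=2)
                if a < b and not rel(a, b) and not rel(b, a)]

print(reflexive, antisym, transitive)  # True True True
print(incomparable)  # [(2, 3), (3, 4), (4, 6)] -- so the ordering is only partial
```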

The field Q of rational numbers is a totally ordered set; in fact, there are two basic
orderings that can be defined, *less than or equal to* (*a* ≤ *b*) and
*greater than or equal to* (*a* ≥ *b*). (Note that *a* < *b*
and *a* > *b*, less than and greater than, are strictly speaking not total
orderings, since neither holds if *a* = *b*. We will ignore this
technicality in what follows, and write *b* < *a* and *b* > *a*
freely, not worrying about possible cases where *b* = *a*.)
However, ordering alone is not enough to uniquely define Q, since the field R of real
numbers is also ordered. So what other properties might serve to distinguish Q?

**Density**. An ordered set S is *dense* if, given elements *a* and
*b* such that *a* ≠ *b*, there exists some element *c* which lies
*between* *a* and *b*. That is, there exists an element *c* such that:

(i) *c* ≠ *a* and *c* ≠ *b*;

(ii) If *a* < *b*, then *a* < *c* and *c* < *b*;

(iii) If *a* > *b*, then *a* > *c* and *c* > *b*.

It is easy to see that Q is dense: for any elements *a* and *b* such that
*a* < *b*, we simply take *c* = (*a* + *b*) / 2. However, this
trick will also work for R, so again the density property is not enough to uniquely define
Q. (We note, however, that the rationals Q are also *dense* in the reals R, and so
are the irrationals I: that is, between any two elements of Q there is an element of R
which is not in Q--and hence is in I--and between any two elements of I there is an
element of Q. Q and I are thus disjoint, *relatively dense* sets which together make
up R.)
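The midpoint trick for density is easy to demonstrate with exact rational arithmetic; the sketch below (an illustration, using only Python's standard-library `fractions.Fraction`) shows that the process can be repeated indefinitely:

```python
from fractions import Fraction

def between(a, b):
    """Midpoint of two rationals -- itself a rational strictly between them."""
    return (a + b) / 2

a, b = Fraction(1, 3), Fraction(1, 2)
c = between(a, b)
print(c)              # 5/12
print(a < c < b)      # True
# The process never terminates: between any two distinct rationals
# there is always another, so Q is dense.
print(between(a, c))  # 3/8
```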

We won't discuss the density property further in this article, but it is worth an aside
to note that the density property, applied to the reals R, is part of Cantor's definition
of the *continuum*, which was a watershed in the history of mathematics. For a couple
of centuries mathematicians had been using the calculus, which had been invented by Newton
and Leibniz, without having really rigorous foundations for some of its basic concepts;
Cantor's definition of continuity provided the key to constructing such a rigorous
foundation. (It also provided, for the first time, a way of avoiding the paradoxical
conclusions about motion which were enunciated by Zeno in the fifth century B.C., but
that's a topic for the philosophy section.)

**Well-Ordering**. A set S is *well-ordered* if there is an ordering
on S such that every non-empty subset of S has a
*smallest element*: an element *a* of the subset such that there is no element
*b* of the subset for which *b* < *a*.

The set Q of rational numbers can be well-ordered. We know this because Q is
*countable*, i.e., it is capable of being placed into a one-to-one
correspondence with the set N of natural numbers. Every countable set can be well-ordered;
the well-ordering is given by the one-to-one correspondence with N, and the smallest
element of any non-empty subset is simply the element corresponding to the smallest natural
number. A well-ordering of the entire set Q will have as its smallest element the number
put into correspondence with the number 0. However, this example also shows that a
well-ordering of a set cannot always use the "standard" ordering of the set, since any
method of putting Q into a one-to-one correspondence with N will have to use an ordering
that isn't the same as the "standard" ordering on Q, the one given by ≤ and ≥
that we saw above. (Food for thought: why is this true? Answering this will show why
we can't construct a well-ordering of Q using the "standard" ordering. Think about
the density property above. This also means that a well-ordering of a set may not
be compatible with the operations that make it a field, a ring, or even a group.)

The question of whether R is well-ordered is more complicated. It turns out that this
question depends on the question of whether the *Axiom of Choice* (AC) holds. The
Axiom of Choice says that, given any collection of non-empty sets, there is a set which
contains exactly one element from each member of the collection. AC seems obvious and
innocuous enough when stated, but it turns out that, when combined with the standard
Zermelo-Fraenkel axioms of set theory (ZF--the combined system is usually referred to as ZFC),
it implies a number of counterintuitive results, one of which is that *all* sets,
including R, can be well-ordered. However, AC gives us no idea *how* to well-order
a given set, such as R, and it is very difficult to see how one could possibly well-order
a set like R which is uncountable (see below). This is one of the objections to accepting
AC without qualifications, at least in its usual form. (Mathematicians can, and do, still
investigate all the implications of the combined system ZFC, but because of the issues that
arise with AC, they also look at alternative axioms that yield different sets of theorems,
some of them contradictory to theorems of ZFC. Since, as we've seen, some of the theorems
of ZFC are highly counterintuitive, the alternative systems can look attractive for some
types of applications. What all this boils down to is that, contrary to what most people
initially expect, there is no one unique axiom system that adequately captures *all*
of our intuitions about sets--or indeed about any mathematical objects. Many people don't
like this, and when it was first discovered in the twentieth century some mathematicians
darkly predicted the doom of the entire field, but one learns to cope with these things.)

So we can't use well-ordering either as a property to distinguish Q from R; but in
our discussion we've identified another property which we *can* use to do so:
*countability*. Cantor first showed in the late 1800's that R is not countable,
using a very slick argument called the *diagonal argument*, which has turned
out to be a very useful general trick for getting important results in mathematics.
I won't describe the argument here, but
Googling on "cantor diagonal argument" will turn up a number of good discussions of
it. The upshot of the argument is that no matter how you try to construct a one-to-one
correspondence between N and R, you can't do it--you will always end up leaving out an
infinite number of reals, and therefore the reals are not countable. So countability is
the crucial feature that distinguishes Q from R (and all
other ordered fields), and hence enables us, along with the other properties we have listed,
to uniquely define it:

*There is a unique ordered field which is countable, and it is the set Q of rational
numbers*.
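The one-to-one correspondence between N and the (positive) rationals can even be written down explicitly. The sketch below is my own illustration (the article itself doesn't specify an enumeration); it uses the Calkin-Wilf sequence, which visits every positive rational exactly once:

```python
from fractions import Fraction

def calkin_wilf():
    """Enumerate every positive rational exactly once (Calkin-Wilf sequence)."""
    q = Fraction(1, 1)
    while True:
        yield q
        # Next term: 1 / (2*floor(q) - q + 1)
        q = 1 / (2 * (q.numerator // q.denominator) - q + 1)

gen = calkin_wilf()
first = [next(gen) for _ in range(8)]
print([str(f) for f in first])  # ['1', '1/2', '2', '1/3', '3/2', '2/3', '3', '1/4']
```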

Of course this immediately raises the question: if Q is the unique ordered field which
is countable, is R just the corresponding unique ordered field which is *not*
countable? Unfortunately, no; at the very least, there are other uncountable proper
subsets of R which are also ordered fields, so we have to look further to find a property
that uniquely picks out R itself.

**Sequences and Bounds**. A *sequence* is a countable subset of an ordered set
S with no two elements equal. A *bounded sequence* B is a sequence which satisfies
at least one of the following two conditions:

(i) If there is an element *u* of S such that, for every *a* in B,
*a* ≤ *u*, then B is said to be *bounded above*,
and *u* is an *upper bound* for B;

(ii) If there is an element *l* of S such that, for every *a* in B,
*a* ≥ *l*, then B is said to be *bounded below*, and *l* is a
*lower bound* for B.

A given sequence may have many upper or lower bounds. For example, consider the
sequence in Q defined by {*a _{n}*: *a _{n}* = 1/*n*, *n* = 1, 2, 3, ...}, i.e.,
{1, 1/2, 1/3, 1/4, ...}. Any rational number greater than or equal to 1 is an upper
bound for this sequence, and any rational number less than or equal to 0 is a lower bound.

There is, of course, something special about 0 and 1 for the above sequence, and it is captured by the following pair of definitions:

(i) An element *u* of S is said to be a *least upper bound* for a sequence B
in S if there is no upper bound *v* for B for which *v* < *u*;

(ii) An element *l* of S is said to be a *greatest lower bound* for a sequence
B in S if there is no lower bound *m* for B for which *m* > *l*.

Obviously, for the sequence given above, 1 is the least upper bound, and it only takes a
little more thought to see that 0 is the greatest lower bound. Finding least and greatest
bounds tells us that we have "pinned down" a sequence to the narrowest possible
range in its parent set S. (A bit of terminology: a least upper bound is often called a
*supremum*--abbreviated "sup"--and a greatest lower bound is often called an
*infimum*--abbreviated "inf".)

A question arises, however: is it always the case that any sequence in a set S which has an
upper bound in S will have a least upper bound in S? The answer, perhaps surprisingly,
turns out to be *no*. In fact, the set Q of rational numbers provides a counterexample:
there are sequences in Q which have upper bounds in Q, but which do *not* have a
least upper bound in Q! The shorthand way of saying this is that Q does not have the
*least upper bound property*. (We could also ask the dual question about lower
bounds and greatest lower bounds, of course; however, it turns out that one property
implies the other--any ordered set with the least upper bound property will also have
the dual *greatest lower bound property*. So we only need to talk about one of
the two in our discussion here.)

Let us look at an example. Consider the following sequence in Q: {1, 1.4, 1.41, 1.414,
1.4142, 1.41421, ...}. You will probably recognize this sequence as successive truncations
of the decimal expansion of √2. The squares of these numbers will get closer and closer
to 2 from below, without ever quite reaching it. Now let's pick out some upper bounds for
the sequence. We know that 1.5 is an upper bound, because the square of 1.5 is 2.25, which
is greater than 2. But we can find a smaller upper bound: 1.42. Its square is 2.0164,
still greater than 2. We can find a still smaller upper bound: 1.415. Its square is
2.002225. We can find yet a smaller...but you get the idea. No matter what upper bound we
choose in Q, we will always be able to find a smaller number in Q which is also an upper
bound; and this is because the least upper bound for the sequence, which is simply
√2 itself, is *not* in Q. The ancient Greeks proved this as a theorem, and it
threw them into consternation for a long time, until a few mathematicians (notably Eudoxos)
came along and produced an axiomatic system for dealing with what we now know as
*real numbers*.
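The behavior of that sequence and its shrinking upper bounds can be checked with exact rational arithmetic (a Python illustration, not part of the original argument):

```python
from fractions import Fraction

# Successive decimal truncations of sqrt(2): an increasing sequence in Q
seq = [Fraction(s) for s in
       ["1", "1.4", "1.41", "1.414", "1.4142", "1.41421"]]
print(all(q * q < 2 for q in seq))     # True: every term's square stays below 2

# Ever-smaller rational upper bounds -- none of them least
bounds = [Fraction(s) for s in ["1.5", "1.42", "1.415", "1.41422"]]
print(all(b * b > 2 for b in bounds))  # True: each bound's square exceeds 2
# In Q the shrinking never stops, because the would-be least upper
# bound, sqrt(2), is not rational.
```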

The least upper bound of the sequence we just examined, √2, is of course in R, so
if we view it as a sequence over R instead of Q, then R has the least upper bound
property with respect to that sequence. It takes quite a bit more work to prove that
R has this property with respect to *every* sequence over itself. It turns out that
this is the distinguishing feature of R: that it has the least upper bound property.
That is,

*There is a unique ordered field with the least upper bound property, and it is
the set R of real numbers*.

You might wonder why the least upper bound property is so important. There are a number of reasons, but the primary one that motivated mathematicians to precisely define the reals was the desire to put the theory of limits, and thereby calculus, on a sound axiomatic footing. Without the least upper bound property, it would not be possible to ensure that bounded monotonic sequences converge to well-defined limits, which means that it wouldn't be possible to ensure that derivatives and integrals, the fundamental tools of calculus, were always well-defined.

We now have a unique combination of properties that picks out the reals. But is this the end of the line? That is, are the reals the "ultimate" mathematical set, which has all of the properties we might want? The answer, perhaps surprisingly, turns out to be "no"--there are properties that are "nice" for a set to have that the set of reals doesn't have. Let's look at the main one.

**Polynomials and Roots**. A *polynomial* over a set S is an expression of the
form P(*x*) = Σ (*i* = 0 to *n*)
*a _{i}x^{i}*, where the *coefficients* *a _{i}* are elements of S and the
integer *n* (the highest power with a nonzero coefficient) is the *degree* of the
polynomial. A *root* of P is a value *r* for which P(*r*) = 0.
The basic question in the study of polynomials is whether a polynomial over a set S
must have at least one root in S. A well-known counterexample shows that this is not the
case for Q or R: the polynomial *x*^{2} + 1 has all its coefficients in Q
(and hence R), but has no roots in R (and hence none in Q); that is, the equation
*x*^{2} + 1 = 0 has no real solutions. It was equations like this one
which helped to spur the extension of the number system to include *imaginary*
numbers. You are probably familiar with imaginaries, but let us take it slowly for a
bit and introduce them axiomatically.
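As a computational aside (anticipating the machinery introduced below), Python's built-in complex arithmetic already realizes this extension, so we can check that *x*^{2} + 1 = 0 gains two roots once imaginaries are admitted:

```python
import cmath

i = complex(0, 1)      # Python writes the imaginary unit as 1j
print(i ** 2)          # (-1+0j): i squared is -1
print(cmath.sqrt(-1))  # 1j

# The two roots of x^2 + 1 = 0:
roots = [i, -i]
print(all(z ** 2 + 1 == 0 for z in roots))  # True
```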

Let *i* be a number such that *i*^{2} = -1. Obviously *i*
cannot be a real number; it must be some new type of number which is not included in
the set of reals. However, we can as a working hypothesis assume that *i* can be
combined with real numbers by the usual arithmetic operations, and see where that leads
us. Thus, we obtain the following:

(i) Any number of the form *bi*, with *b* in R, is called a *pure
imaginary* number. The squares of pure imaginary numbers are negative real numbers:
(*bi*)^{2} = -*b*^{2}. The pure imaginaries have a number of
other special properties with certain functions, in particular the exponential
*e ^{x}*, which we will explore below. Note that if *b* = 0, the product *bi* is
just the real number 0; so 0 is the only number that is both real and pure imaginary.

(ii) Numbers of the form *z* = *a* + *bi*, with *a*, *b*
in R, are called *complex numbers*. Complex numbers obey the same rules of algebra
as real numbers, with the added proviso that *i*^{2} = -1. It is easy to
show that the set C of complex numbers, so defined, obeys all of the axioms (1) through
(11) above that characterize a field. The real number *a* is called the *real
part* of *z*, and the real number *b* is called the *imaginary part*
of *z*.

(iii) The number *z** = *a* - *bi* is called the *complex conjugate*
of *z*.

(iv) The *modulus* |*z*| of a complex number *z* is the non-negative
real number *r* given by
*r*^{2} = *zz** = *a*^{2} + *b*^{2},
where *a* and *b* are the real and imaginary parts of *z*.

(v) Any complex number *z* can be written as the product of its modulus *r*
and the exponential *e ^{iq}*, where the real number *q*, called the *argument*
(or *phase*) of *z*, satisfies *a* = *r* cos *q* and *b* = *r* sin *q*.

(vi) If *z* = *re ^{iq}*, then the
complex conjugate is *z** = *re ^{-iq}*.

(vii) Raising *z* to a power *a* is now simple: obviously
*z ^{a}* = *r ^{a}e ^{iqa}*. Since this works for fractional as well as integer
values of *a*, C is closed under raising to powers and extracting roots.
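Properties (iv) through (vii) are all available computationally; this Python sketch (illustrative only, using the standard-library `cmath` module) checks the modulus, conjugate, and polar-form power rule for a sample value:

```python
import cmath

z = 3 + 4j
r, q = cmath.polar(z)            # modulus and argument (phase)
print(r)                         # 5.0
print(abs(z), z.conjugate())     # 5.0 (3-4j)
print(z * z.conjugate())         # (25+0j): z z* = a^2 + b^2

# z = r e^{iq}, so powers are easy: z^a = r^a e^{iqa}
a = 2
lhs = z ** a
rhs = cmath.rect(r ** a, q * a)  # rebuild from modulus and argument
print(cmath.isclose(lhs, rhs))   # True
```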

**Fundamental Theorem of Algebra**. The above propositions are enough to show that
the set C is "big enough" to allow us to do what we want with polynomials: we can find a
root in C for any polynomial with coefficients in C. This is called the *fundamental
theorem of algebra*, and is one of the most important results in higher mathematics.
The proof is rather involved, but the crucial step depends on property (vii) above, that
C is closed under the operations of raising to powers and extracting roots.

This property of closure under the process of taking roots of polynomials is one of the unique features which distinguishes the complex numbers. However, it turns out that there are other fields which are also "algebraically closed" (that's the technical name for the property we've just discussed). The key additional distinction here comes from the fact that the other "small" algebraically closed fields are generated from the rational numbers, Q: the simplest is the *algebraic closure* of Q, the countable field consisting of all roots of polynomials with rational coefficients, and infinitely many more countable algebraically closed fields can be built by adjoining transcendental elements to Q before closing. The complex numbers, however, are the algebraic closure of the reals, R, and it turns out that (up to isomorphism) there is only one way of doing that. So we could say that C is the unique field which is the algebraic closure of the reals, but in the spirit of the above definitions, we'd like to express this uniqueness in a way that doesn't bring in any of our other sets. We can do this by observing that the algebraic closure of any countable field must also be countable--which means that all of those algebraically closed fields built from Q are countable. C, by contrast, is obviously not countable (since it has R as a subset), and so we can express C's uniqueness better using its cardinality (in particular, its uncountability) as the key property. That is,

*Up to isomorphism, there is a unique algebraically closed field of characteristic zero
whose cardinality is that of the continuum, and it is the set C of complex numbers*.

Note that C is also distinguished from R by the fact that C, unlike R, is *not*
an ordered field: no ordering of C is fully compatible with its arithmetic, though
it can be partially ordered. One can impose a variety of partial orderings on C:
the most common one uses the modulus (defined earlier) to generate the ordering. The
partial ordering induced by the modulus is simply the following:
*z* < *w* iff |*z*| < |*w*|.
However, since there are infinitely many complex
numbers with a given modulus, this can only be a partial ordering (complex numbers
with the same modulus are incomparable under this partial ordering). (Food for thought:
why could I not use ≤ instead of < just now? The answer is easy if you think about
why numbers with the same modulus are incomparable.) The fact that C cannot be fully
ordered, by the way, also helps explain why all those fields which are algebraic closures of
Q aren't counterexamples to the unique characterization of Q (as the only countable ordered
field) that we gave above: all of those algebraic closures of Q are (countable) subsets
of C but not of R (they have to include imaginary elements in order to give
negative rational numbers square roots), so they cannot be ordered the way Q is.
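The modulus ordering just described can be sketched in a few lines of Python (the helper name `precedes` here is mine, purely for illustration):

```python
# The partial ordering on C induced by the modulus: z precedes w when
# |z| < |w|.  Distinct numbers of equal modulus are incomparable.
def precedes(z, w):
    return abs(z) < abs(w)

z, w = 1 + 1j, 3 + 4j     # |z| = sqrt(2), |w| = 5
print(precedes(z, w))      # True: |z| < |w|

u, v = 1j, -1j             # both have modulus 1...
print(precedes(u, v), precedes(v, u))  # False False: incomparable
```

The last line shows the "partial" in partial ordering: neither of `u` and `v` precedes the other, yet they are not equal.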

**Quaternions and Octonions**. I can't resist a digression here to talk about
the two "hypercomplex" sets, quaternions and octonions, which share with the complex
numbers the property that all polynomials over the set have roots in the set. (This does
not contradict the uniqueness of C as defined above, because neither the quaternions nor
the octonions are a field--the quaternions are non-commutative with respect to
multiplication, so they are only a division ring, and the octonions are non-commutative and
also non-associative, so their nonzero elements do not even, technically speaking, form a
group under multiplication--the octonions are a "division algebra", a set with a
multiplication operation in which every nonzero element has an inverse; not every
division algebra gives a group, because not every division algebra is associative. It's harder, for
me at least, to intuitively grasp how a set can be non-associative under multiplication
than to grasp how it can be non-commutative, but it turns out that the non-associativity
of the octonions has some interesting consequences relative to other groups that are
important in physics. But I'm digressing from the digression.)

The quaternions were first discovered by the Irish mathematician William Rowan Hamilton in the nineteenth century, in the course of other investigations. As with group theory in general, nobody had the slightest idea that quaternions would ever have any use in the real world, but in the 1920's when quantum mechanics was being developed, Pauli (who was unaware of Hamilton's work) basically re-invented the quaternions as what are now called the "Pauli spin matrices" because they come into play when constructing wave functions for the "spin" degrees of freedom of elementary particles. (The matrices that Pauli used are actually the "Hermitian counterparts" of the quaternions that Hamilton used, but I won't digress from the digression again to explain what that means, except to say that they have fundamentally the same structure, they're just written in a somewhat different way because of the particular requirements of quantum mechanics.)

The idea behind the quaternions is simple to state, though it sounds wacky: instead of
there just being one "imaginary unit" *i*, as in the complex numbers, suppose that
there are *three*, which we will label *i*, *j*, and *k*. When we
say that all three of these things are "imaginary units", we mean that all three of them
square to -1. Now of course it was tough enough to accept that -1 had one "square root",
but how can it possibly have three? It looks as though all of them are
"just the same, only different". Well, however counterintuitive it sounds, it turns out
to be possible, in the sense that a fully self-consistent division algebra can be set
up using the four numbers, 1 and the three imaginary units, as "basis vectors". (It
would be another digression from the digression to explain that term, since I haven't
talked at all about vector spaces here--another article, maybe?--but for our purposes
now suffice it to say that 1 and *i* are basis vectors of the complex numbers C
in the same sense.) All we need is to define the "multiplication table" for the three
imaginary units, and we can then add and multiply any two quaternions, and prove that
those two operations satisfy all the axioms of a division algebra. (Food for thought:
why do we not need an "addition table" for the imaginary units? Answering this will
give you a better idea of what "basis vectors" means.)

Defining the multiplication
table is easy: we simply say that, by definition, *ij* = *k*, and then generate
the rest of the table by permuting this equality, with the sign of the right-hand side
remaining positive for even permutations but becoming negative for odd permutations (so,
for example, *jk* = *i*, but *ji* = -*k*). Since the division algebra
set up in this way turns out to be
non-commutative (food for thought: can you see that this is "obvious" from the multiplication
table I just gave?), it is genuinely different from the "ordinary" complex numbers,
and qualifies as a distinct mathematical structure in its own right.
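Here is a minimal sketch of quaternion arithmetic in Python, using the multiplication table above (the component formula below is just that table expanded out term by term):

```python
# A minimal quaternion sketch: a 4-tuple (w, x, y, z) stands for
# w + x*i + y*j + z*k, with the table ij = k, jk = i, ki = j, and a
# sign flip for each reversed (odd) order, e.g. ji = -k.
def qmul(p, q):
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

print(qmul(i, j))   # (0, 0, 0, 1)  = k
print(qmul(j, i))   # (0, 0, 0, -1) = -k  -- non-commutative!
print(qmul(i, i))   # (-1, 0, 0, 0) = -1, as an imaginary unit should
```

The second and third prints confirm the two key facts: multiplication does not commute, and each imaginary unit squares to -1.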

Once you've bought into this, then defining the octonions is easy: instead of three
imaginary units, the octonions have seven (usually denoted e_{1} through
e_{7}), which with the identity 1 makes a total of eight basis vectors for
this new mathematical space. (It should now be clear where the names "quaternion"
and "octonion" come from--the prefixes just refer to the number of basis vectors in
the space.) Since the octonions, defined in this way, are not only non-commutative but
non-associative, they again are clearly a distinct mathematical space in their own right.

You may ask, how long can this go on? Can we have a set of "sedenions" with sixteen
basis vectors (the identity 1 plus fifteen--count them--imaginary units)? And so on,
*ad infinitum*? Or, to take another tack, why must the number of basis vectors be
a power of two? Why do we not have such sets with three or five or some other number of
basis vectors? (Food for thought: we can, if we like, pick out a division algebra
sharing a number of
properties with these sets--though not all of them or even some of the most
important ones--that has only one basis vector. What is it?) Can't we
construct similar mathematical spaces with any number of basis vectors we like?

The answer (just to keep everyone guessing) turns out to be no. Of course we can construct general vector spaces with any number of "dimensions" we like, but (like the "one-dimensional" vector space R) they won't have a number of important properties (such as having roots for all polynomials). It turns out (though I'm certainly not going to try to explain why here, since I don't entirely understand it myself) that there are only three possible "hypercomplex division algebras" that do have all of those important properties, and they are the three sets we have just discussed: C, H (the quaternions), and O (the octonions). And again, this abstract mathematical fact turns out to have effects in physics, in particular in constraining which "theories of everything" with certain structures are possible, and how many dimensions those theories can have. But I'm getting out of breath, so let's end this article with something that's (comparatively) simple.
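For the curious, the "doubling" that takes R to C to H to O can be sketched with the Cayley-Dickson construction (one common sign convention is assumed below; others exist). Each doubling pairs two numbers from the previous level, and you can watch commutativity disappear at dimension four and associativity at dimension eight:

```python
# Cayley-Dickson doubling: a number is a flat tuple of reals of length
# 2^n.  With x = (a, b) and y = (c, d) (halves of the tuples), define
#   x*y = (a*c - conj(d)*b,  d*a + b*conj(c)),  conj((a, b)) = (conj(a), -b).
def conj(x):
    if len(x) == 1:
        return x
    h = len(x) // 2
    return conj(x[:h]) + tuple(-t for t in x[h:])

def mul(x, y):
    if len(x) == 1:
        return (x[0] * y[0],)
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    first = tuple(p - q for p, q in zip(mul(a, c), mul(conj(d), b)))
    second = tuple(p + q for p, q in zip(mul(d, a), mul(b, conj(c))))
    return first + second

def e(n, idx):   # basis vector number idx in dimension n
    return tuple(1.0 if t == idx else 0.0 for t in range(n))

# Complex numbers (dim 2): multiplication still commutes.
z, w = (1.0, 2.0), (3.0, 4.0)
print(mul(z, w) == mul(w, z))                        # True

# Quaternions (dim 4): i*j = -(j*i) -- commutativity is lost.
i, j = e(4, 1), e(4, 2)
print(mul(i, j), mul(j, i))

# Octonions (dim 8): even associativity is lost.
e1, e2, e4 = e(8, 1), e(8, 2), e(8, 4)
print(mul(mul(e1, e2), e4) == mul(e1, mul(e2, e4)))  # False
```

The same recursion applied once more gives the sixteen-dimensional sedenions, where (as the text says) the important division-algebra properties finally break down.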

**The Unit Circle**. We've finished the discussion of basic group theory and the
properties of the important sets Q, R, and C; but for "extra credit" let's examine the subset
of C with *r* = 1; that is, the set U, the "unit circle", defined as
{*z*: *z* = *e ^{iθ}*, 0 ≤ θ < 2π}. Under multiplication, U forms a group, known as U(1).
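As a quick sanity check that the unit circle really is closed under multiplication (angles simply add, and the modulus stays 1):

```python
# Closure of the unit circle under multiplication:
# e^{ia} * e^{ib} = e^{i(a+b)}, so the product of two unit-modulus
# complex numbers again has modulus 1.
import cmath

a, b = 0.7, 2.1
z = cmath.exp(1j * a)
w = cmath.exp(1j * b)

print(abs(z * w))                                     # 1.0, up to rounding
print(cmath.isclose(z * w, cmath.exp(1j * (a + b))))  # True: angles add
```

The identity element is 1 (= *e ^{i0}*) and the inverse of *e ^{iθ}* is *e ^{-iθ}*, so all the group axioms check out.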

The group U(1) is an example of a *Lie group*, which is a special type of
group that turns out to be very useful in physics. This is
because the continuous *symmetry groups* we find in nature
are Lie groups.
are symmetry transformations--that is, transformations that
leave some property of the transformed objects invariant.
Symmetry groups in nature are groups of transformations that
can be applied to the various quantities that appear in the
laws of physics, but which
leave the laws of physics themselves unchanged. Now in the
laws of physics there are also what are called "conservation
laws", which are laws that say that a particular physical
quantity (like momentum, energy, etc.) is unchanged in
various physical processes. It turns out that each continuous group
of symmetry transformations "corresponds" to a particular
conservation law--a result known as Noether's theorem (I won't go into the details
of why it's true, since it's not necessary for the present
discussion, but it's a fascinating topic).

For example,
consider a very "easy" group of symmetry transformations,
called "translations"--or more precisely, "spatial
translations". If you think of a set of spatial Cartesian
coordinates (mutually perpendicular axes), then spatial
translations simply move the origin of the set of
coordinates, without changing the orientation of the axes.
It turns out that all of the laws of physics are left
unchanged by this transformation of coordinates,
and this fact turns out to correspond to the law of conservation
of momentum--more precisely, the fact that the laws of physics
are invariant (unchanged) under spatial translations explains
why there *is* a law of
conservation of momentum. Similarly, the fact that the laws
of physics are invariant under "time translations"
(changing the "zero point" of time, like resetting your watch)
corresponds to the law of conservation of energy, and the
fact that they are invariant under
spatial rotations corresponds to the law of conservation of
angular momentum.
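A toy numerical illustration (the positions below are made-up numbers for two particles on a line) of what translation invariance means: quantities like separations, on which forces depend, are untouched when the origin moves.

```python
# Translation invariance in miniature: the laws of physics depend on
# relative positions (forces depend on separations, not on absolute
# coordinates), so shifting the origin changes nothing physical.
import math

x1, x2 = 3.0, 7.5   # hypothetical particle positions
shift = 100.0        # move the origin by an arbitrary amount

separation_before = x2 - x1
separation_after = (x2 + shift) - (x1 + shift)

print(math.isclose(separation_before, separation_after))  # True
```

The shifted coordinates describe exactly the same physical situation, which is the sense in which translations form a symmetry group.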

So how does the group U(1) fit into this picture?
What symmetry transformations are elements of U(1)?
It certainly doesn't look like they are anything simple
like time or space translations, or rotations. And
they're not. It turns out that the
classical laws of electromagnetism, Maxwell's Equations,
exhibit a symmetry called (for historical
reasons) a "gauge symmetry". The name really doesn't tell
you much about how this symmetry works, but the idea is
that you can take the fields that appear in Maxwell's
Equations and multiply them by elements of U(1) (i.e.,
complex numbers of the form *e ^{iθ}*),
and the equations themselves are left unchanged by
this transformation--in other words, the laws of
electromagnetism are invariant under these "gauge
transformations" of the fields. Now this fact also corresponds
to a conservation law: as you might have guessed
from my mention of Maxwell's Equations, it's the law
of conservation of electric charge.
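Here is a toy sketch of that idea (the field value and the angle below are made up): multiplying a complex amplitude by a U(1) element changes its phase but leaves its squared modulus, and hence anything built from it, unchanged.

```python
# A toy U(1) ("phase") transformation: multiply a complex field value by
# e^{i*theta}.  The modulus, and so |field|^2, is invariant.
import cmath

field = 0.6 + 0.8j   # a hypothetical complex field amplitude
theta = 1.234         # an arbitrary gauge angle
transformed = cmath.exp(1j * theta) * field

print(abs(field) ** 2)         # |field|^2 before the transformation
print(abs(transformed) ** 2)   # the same value after: invariant
```

In quantum mechanics it is |ψ|² that carries the physics, which is why a pure phase change is a genuine symmetry.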

This in itself doesn't seem earth-shaking, and it isn't--but in the middle of the twentieth century, when the number of subatomic particles kept growing and growing as higher energy experiments were done, and physicists were searching frantically for some way of bringing order to the chaos, they got a lot of mileage out of this idea of looking for symmetry groups. By that time, of course, quantum theory had taken over, and so the U(1) symmetry was not thought of as a symmetry of the electromagnetic field in the classical Maxwell sense, but as a symmetry of the photon, the particle of light. So the question became: can we find more complicated symmetry groups that will provide a structure to account for other known particles?

This line of thought turned out to be extremely fruitful, yet another example of an abstract mathematical structure turning out to have a deep physical significance. In the 1960's and 1970's it was shown that the weak force and the electromagnetic force were two aspects of the same force, the "electroweak" force, which had an SU(2)xU(1) symmetry group structure (the "x" here basically indicates that the SU(2) and U(1) groups are combined in the "obvious" way to form a larger group). Similarly, the strong nuclear force (the force that holds atomic nuclei together) was shown to have an SU(3) symmetry group structure. (The groups SU(2) and SU(3) are higher-dimensional "relatives" of U(1), and are also Lie groups--in fact, in some ways they relate to the quaternions and octonions, which I discussed above, in the same way as U(1) relates to the complex numbers. But that would be a much bigger digression.)

The current "standard model" of particle physics, which includes all three of the above forces (i.e., all forces except gravity), is thus based on equations having an SU(3)xSU(2)xU(1) symmetry group structure. One line of attack used by physicists who are currently searching for a "Theory of Everything" is to look at still higher-dimensional groups that have SU(3)xSU(2)xU(1) as a subgroup, and therefore might be candidates for the overall symmetry group of everything.

And now I really am out of breath.

If you want more information about Lie Groups
and their relationship to physics (specifically quantum mechanics--but
really that *is* their relationship to physics, since all
of the correspondences I noted above between symmetry groups and
conservation laws and so forth ultimately arise from quantum
mechanics), there are two good articles on John Baez' web
site (which is also one of the sites that mirrors the
Usenet Physics FAQ):
one on Lie Groups
and Quantum Mechanics, and one on
Elementary
Particles and the symmetry groups underlying them.

Also useful is the MathWorld Entry on Lie Groups. It has links to a lot of other MathWorld articles on related concepts, which are useful in getting your bearings if you're not used to the subject.

Also, a note about the quaternions and octonions and vector spaces. I said that while it is possible to construct general vector spaces in any number of dimensions, it is *not* possible to construct hypercomplex division algebras in any number of dimensions. I didn't want to go into details here (since I don't fully understand them), but the basic reason can be seen from the following example. Suppose we try to construct a three-dimensional hypercomplex division algebra; in other words, instead of just one imaginary unit (square root of -1) we try to have two (i and j). The two of them will have to anticommute (that is, ij = -ji) for things to work out right, and given that, if you work out the square of ij (call this k), you will find that it is also -1! That is, you can't have a hypercomplex division algebra with just two imaginary units--as soon as you have two, you automatically get a third.

(Why do i and j have to anticommute? Because if they commute then they are really the same; work it out and you'll see that j has to be equal to either i or -i if i and j commute, so j isn't really a different basis vector unless it anticommutes with i.)
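That little argument can be checked concretely. Below is a sketch using 2x2 complex matrices (built from the Pauli matrices) as stand-ins for the two anticommuting imaginary units: both square to -1, and their product automatically squares to -1 as well, so a third unit appears for free.

```python
# Two anticommuting "imaginary units" force a third into existence.
# I and J below each square to minus the identity and anticommute;
# their product K = I @ J then also squares to minus the identity.
import numpy as np

one = np.eye(2)
I = np.array([[0, 1j], [1j, 0]])     # i * (Pauli sigma_x)
J = np.array([[0, 1], [-1, 0]])      # i * (Pauli sigma_y)
K = I @ J

print(np.allclose(I @ I, -one))      # True: I^2 = -1
print(np.allclose(J @ J, -one))      # True: J^2 = -1
print(np.allclose(I @ J, -(J @ I)))  # True: they anticommute
print(np.allclose(K @ K, -one))      # True: (IJ)^2 = -1 -- a third unit
```

This is exactly the algebra (ij)² = ijij = -iijj = -(-1)(-1) = -1 from the paragraph above, carried out with matrices.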