# Maps to Enlightenment

Reading “Conceptual Mathematics” by F. William Lawvere & Stephen H. Schanuel.

Last time we meditated on *sets*, which may be the simplest entities in mathematics. We observed that a set can be regarded as a very special kind of category, namely a *discrete* category, where the only relationship is the trivial one of *identity*. While it’s enlightening to see how some mathematical objects can be regarded as special cases of the general notion of category (and later we will see other examples: posets, monoids, groupoids), this is not how categories are commonly used in practice. Instead, when we want to learn more about sets, we do that by constructing *the category of sets*.

Our first step is to take *all* the sets (or maybe just the *finite* sets for now), and reduce them to points, so that each set becomes a featureless *object*, like the elements of an abstract set, or like the points in geometry, which may lie on lines but have no properties in and of themselves. Now, this may at first seem like a counterproductive way to study something, reducing all its examples to inscrutable black boxes; but the Big Idea here is that *it’s not really about the sets themselves, but how they relate to one another.* So we are not really studying *sets* so much as we are studying *a system of transformations of sets*, i.e. *a set theory*, because in the end, what’s really important about a set (or a symmetry group, or a topological space, or a differentiable manifold, or pretty much any mathematical object you can think of) is not what it *is*—whatever that might mean—but *what we can do with it*. If I were a philosopher, I might even try to argue that, as finite creatures, we can never really know what a thing *is* in its absolute being, but only experience it in its relationship to, and interaction with, other things; but since I am just a humble program writer, I will be content to observe that, within the limited domain of informatics, there appears to be a broad consensus that **it’s not what a software object is that matters, but what we can do with it**, as evidenced by software engineering’s preoccupation with abstract data types, application programming interfaces (APIs), Haskell’s type classes, and Stepanov’s C++ “concepts”. This emphasis on behavior over implementation—*form* over *substance*—is really at the heart of the categorical method. As the Nobel Prize-winning chemist Frederick Soddy put it:

> All saints and sages throughout the ages
> From one doxy never have swerved
> To hold fast unto what in change changes not
> And ferret out what is conserved.

As a workaday computer programmer trying to understand something as abstruse as category theory, I felt tremendous relief that the first example of a category that Lawvere and Schanuel give in Article I is the category of sets and functions. Most books about category theory use examples like “modules” and “topological spaces”, which are not exactly familiar to most programmers. (To a programmer, a “module” means something rather different from the mathematician’s idea.) But “functions”? Every programmer knows what a function is! There is even an increasingly popular paradigm called “functional programming”, in which the programmer strives to model *everything* in terms of functions. Designed by a mathematician, Lisp (1959) was the first language intended to make functional programming convenient, and I feel very fortunate to have stumbled upon it early in my programming career, as functional programming is a natural technique to use in Lisp—indeed, Lisp’s design led me to discover functional programming before I even knew it had a name!—while one cannot really say the same about several other popular programming languages, which evolved more as pragmatic abstractions of assembly language than as any kind of coherent mathematical theory. (Though I appreciate the attempts by Stepanov et al. to add some much-needed mathematical rigor to C++, it still feels very clunky to me, spoiled as I am by the expressive elegance of Lisp and Haskell!)

Functional programmers compose functions all day long—that’s literally what it means to program in a functional style. So for me, Article I was both a great relief (finally, a book on category theory that is easy to understand!) and an interesting perspective on some very familiar ideas. While I have certainly defined many functions in my 20-year programming career, I have not always taken care to explicitly document the input set (“domain”) and output set (“codomain”) of every function I wrote (though HTDP and Hoogle have recently convinced me that it is an essential practice, so I am now trying to get into the habit). Lawvere and Schanuel clearly stress the importance of fixing a domain and a codomain for every function (programmers would call these the “input and output data types”). Lawvere and Rosebrugh even go so far as to say in *Sets for Mathematics* that “the additional century of experience since Cantor has shown the importance of emphasizing [that] each map needs both an explicit domain and an explicit codomain (not just a domain, as in previous formulations of set theory, and not just a codomain, as in type theory).” So they are not just throwing a definition of “function” out there willy-nilly—it’s based on a century of hard-won mathematical experience.
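For a programmer, the cleanest way to honor that advice is to write the domain and codomain into the code itself. Here is a minimal sketch (in Python rather than Lisp, with hypothetical names), where type annotations stand in for the explicit domain and codomain:

```python
# A map f: A -> B has three ingredients: a domain A, a codomain B, and a
# rule assigning to each element of A exactly one element of B. The type
# annotations below play the role of the explicit domain and codomain
# (the "input and output data types").

def length(word: str) -> int:
    """A map from the set of strings to the set of integers."""
    return len(word)

def is_even(n: int) -> bool:
    """A map from the set of integers to the set of Booleans."""
    return n % 2 == 0

def compose(g, f):
    """The composite map g after f, defined whenever the codomain of f
    matches the domain of g."""
    return lambda a: g(f(a))

# Since length lands in int and is_even departs from int, they compose:
word_has_even_length = compose(is_even, length)  # a map str -> bool
```

Note that it is exactly the explicit domains and codomains that tell us which compositions are legal: `compose(is_even, length)` type-checks, while `compose(length, is_even)` would not.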

While the term “function” is very familiar to programmers, “map” is probably less so, though there are some standard functions in Lisp that essentially compute internal diagrams like the ones in Article I, e.g. this one from The Wizard Book:

```scheme
(define (map proc items)
  (if (null? items)
      nil
      (cons (proc (car items))
            (map proc (cdr items)))))

(map abs (list -10 2.5 -11.6 17))
;; => (10 2.5 11.6 17)

(map (lambda (x) (* x x))
     (list 1 2 3 4))
;; => (1 4 9 16)
```

This could be interpreted as computing the lists of values (the images) of the absolute-value and squaring maps restricted to domains of four numbers each.
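Strictly speaking, what `map` hands back is the list of *values* the restricted map actually takes, i.e. its image, which need not fill up the declared codomain. A small sketch of that distinction (in Python, with a hypothetical `image` helper):

```python
def image(f, domain):
    """The set of values f actually takes on a finite domain.

    This is the image of f restricted to that domain; the declared
    codomain (say, all real numbers for abs) is typically bigger.
    """
    return {f(a) for a in domain}
```

For example, `image(abs, [-10, 2.5, -11.6, 17])` yields the four-element set `{10, 2.5, 11.6, 17}`, a small subset of any reasonable codomain for `abs`.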

I don’t remember hearing the term “map” used as a synonym for “function” before reading *Conceptual Mathematics*, so I had to think about why that word was used. Encyclopedia Britannica says it’s because mathematical maps are an abstraction of geographical maps, but how is that, exactly? To make a geographical map, you visit a place, and then make a mark on a piece of paper (or some pixels on a bitmap) to represent what you saw there. The idea is that every mark on the paper map *corresponds* to something in the place that is being mapped, so I guess the domain is like the paper map, and the codomain (which is usually bigger than the actual set of values of the map) is like the real world to which the map corresponds. However, I think we could also see the *co*domain as being like the paper map: everything in the domain being mapped corresponds to something on the map, and some details might get smushed (like a mountain range might be reduced to a single mountain symbol), and perhaps some things on the map don’t really correspond to anything in reality (like a fanciful illustration of a giant sea monster), but the general idea of correspondence/resemblance is still there.

Lawvere and Schanuel are careful to say that every map *f* of sets has three ingredients: a domain set *A*, a codomain set *B*, and “a rule (or process) for *f*, assigning to each element of the domain A exactly one element of the codomain B.” They don’t dwell on the distinction between a “rule” and a “process” because it’s not really important for mathematics, but for computer programmers, there is a crucial difference between these, as The Wizard Book explains:

> Procedures […] are much like ordinary mathematical functions. They specify a value that is determined by one or more parameters. But there is an important difference between mathematical functions and computer procedures. Procedures must be effective.

> As a case in point, consider the problem of computing square roots. We can define the square-root function as
>
> √x = the y such that y ≥ 0 and y² = x
>
> This describes a perfectly legitimate mathematical function. We could use it to recognize whether one number is the square root of another, or to derive facts about square roots in general. On the other hand, the definition does not describe a procedure. Indeed, it tells us almost nothing about how to actually find the square root of a given number. It will not help matters to rephrase this definition in pseudo-Lisp:

```scheme
(define (sqrt x)
  (the y (and (>= y 0)
              (= (square y) x))))
```

> This only begs the question.

> The contrast between function and procedure is a reflection of the general distinction between describing properties of things and describing how to do things, or, as it is sometimes referred to, the distinction between declarative knowledge and imperative knowledge. In mathematics we are usually concerned with declarative (what is) descriptions, whereas in computer science we are usually concerned with imperative (how to) descriptions.

Though I like the idea of “declarative” vs “imperative” knowledge, I think most workaday programmers would simply say that this illustrates the difference between a *specification* and an *implementation*. Mathematicians (and *infor*maticians doing high-level modelling) can work at the level of specifications, but when it comes time to actually *compute* something, nothing short of an effective implementation will do.
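The Wizard Book’s own resolution of this example is an *effective* procedure: successive approximation by Newton’s method. A sketch of that idea (in Python rather than Scheme; the tolerance is an arbitrary choice of mine):

```python
def sqrt(x, tolerance=1e-9):
    """Approximate the square root of a positive x by Newton's method:
    a guess y is repeatedly improved by averaging it with x/y, until
    y*y is close enough to x. Unlike the declarative definition, this
    tells us exactly HOW to find a square root: an implementation, not
    just a specification."""
    guess = 1.0
    while abs(guess * guess - x) > tolerance:
        guess = (guess + x / guess) / 2
    return guess
```

The declarative definition still earns its keep, though: it is precisely the specification against which we test that the procedure is correct, by checking that `sqrt(x) ** 2` is close to `x`.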

There is a tension between these two perspectives on what a function is, which Paul Taylor feels particularly strongly about:

> For Leonhard Euler (1748) and most mathematicians up to the end of the nineteenth century, a function was an expression formed using the arithmetical operations and transcendental operations such as log. The modern informatician would take a similar view, but would be more precise about the method of formation (algorithm). Two such functions are equal if this can be shown from the laws they are known to obey.
>
> However, during the twentieth century mathematics students have been taught that a function is a set of input-output pairs. The only condition is that for each input value there exists, somehow, an output value, which is *unique*. This is the *graph* of the function: plotting output values in the *y*-direction against arguments along the *x*-axis, forgetting the algorithm. Now two functions are equal if they have the same output value for each input. (This definition was proposed by Peter Lejeune Dirichlet in 1829, but until 1870 it was thought to be far too general to be useful.)
>
> These definitions capture the *intension* and the effect (*extension*) of a function. Evaluation takes us from the first to the second, but it doesn’t say what non-terminating programs do during their execution, and can’t distinguish between algorithms for the same function. But each view is both pragmatic and entrenched, so how can this basic philosophical clash ever be resolved? Chapter IV begins the construction of semantic models which recapture the intension extensionally, as part of our reconciliation of Formalism and Platonism.

He goes on to say in Chapter IV that “category theory unifies the symbolic (Formalist) and model-based (Platonist) views of mathematics. In particular it offers an agnostic solution to the question that we raised in Section 1.3 of whether a function is an algorithm or an input-output relation.” I think what he means is demonstrated by Lawvere and Schanuel in Article I and Session 2, where we see that the category of sets captures *both* the input-output values of a mapping (its “points”) *and* all the composite mappings which are equal to it.
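Taylor’s intension/extension distinction can be seen in a toy example (Python, hypothetical functions): two different algorithms, hence two intensions, that determine the same set of input-output pairs, and are therefore one and the same map extensionally:

```python
def sum_by_loop(n: int) -> int:
    """Sum 1..n by iterating: one algorithm (one intension)."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total

def sum_by_formula(n: int) -> int:
    """Sum 1..n by Gauss's closed form: a different algorithm."""
    return n * (n + 1) // 2

# As maps from natural numbers to natural numbers the two are
# extensionally equal: they agree on every input, even though nothing
# in the output values alone can recover which rule produced them.
```

One runs in time proportional to `n`, the other in constant time; evaluation forgets that difference, which is exactly Taylor’s point about what the graph of a function fails to record.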

## About this entry

You’re currently reading “Maps to Enlightenment,” an entry on Open Book Study

- Published: May 18, 2020 / 11:09 pm
- Category: Computer Sciences, Philosophy, Pure Mathematics
- Tags: Category Theory, Informatics, Mathematics, Philosophy
