22 October 2006

Negative Dimensions

Since I'm behind in my series of posts on fields, quantum or otherwise, I will instead talk today about some linear algebra, and not define most of my terms.

The category Vect of vector spaces (over a generic field \R = "real numbers") nicely generalizes the category Set of sets. Indeed, there is a "forgetful" functor in which each set forgets that it has a basis. Yes, that's the direction I mean. A vector space generalizes ("quantizes") in a natural way the notion of "set": rather than having definite discrete elements (two elements in a set either are or are not the same), a vector space allows superpositions of elements. A set is essentially "a vector space with a basis": morphisms of sets are morphisms of vector spaces that send basis elements to basis elements. So our forgetful functor takes each set X to the vector space Hom(X,\R) (Hom taken in the category of sets). But, I hear you complain, Hom(-,\R) is contravariant! Yes, but in this case, where I forgot to tell you that all sets are finite and all vector spaces finite-dimensional, we can make F = Hom(-,\R) covariant: for \phi: X \to Y, let F(\phi): f \mapsto g, where g(y) = \sum_{x\in X s.t. \phi(x)=y} f(x). Actually, of course, if I'm allowing infinite sets, then I should specify that I don't quite want X \to Hom(X,\R), but the subspace of functions that send cofinitely many points in X to zero.
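To make the covariant version of F concrete, here is a minimal sketch for finite sets, with functions stored as Python dicts. The names pushforward and compose and the sample data are my own illustrative choices, not anything canonical; the point is only to check F(\psi\phi) = F(\psi)F(\phi) on one example.

```python
def pushforward(phi, f, codomain):
    """Send f: X -> R to g: Y -> R with g(y) = sum of f(x) over x with phi(x) = y.

    phi:      dict mapping each element of X to an element of Y
    f:        dict mapping each element of X to a number
    codomain: iterable listing the elements of Y
    """
    g = {y: 0.0 for y in codomain}
    for x, y in phi.items():
        g[y] += f[x]
    return g

def compose(psi, phi):
    """Return the composite 'psi after phi' as a dict (phi: X -> Y, psi: Y -> Z)."""
    return {x: psi[y] for x, y in phi.items()}

# Hypothetical example data: X = {a, b, c}, Y = {1, 2, 3}, Z = {u, v}.
Y = [1, 2, 3]
Z = ["u", "v"]
phi = {"a": 1, "b": 1, "c": 2}
psi = {1: "u", 2: "u", 3: "v"}
f = {"a": 2.0, "b": 3.0, "c": 5.0}

g = pushforward(phi, f, Y)                   # F(phi)(f)
lhs = pushforward(compose(psi, phi), f, Z)   # F(psi phi)(f)
rhs = pushforward(psi, g, Z)                 # F(psi)(F(phi)(f))

# A basis element (the indicator function of "a") goes to a basis element.
delta_a = {"a": 1.0, "b": 0.0, "c": 0.0}
image_of_delta_a = pushforward(phi, delta_a, Y)
```

Here lhs and rhs agree, and the indicator function of a point x is sent to the indicator function of \phi(x), which is the sense in which a map of sets is a linear map sending basis elements to basis elements.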

Anyhoo, so Set has a terminal object 1 = {one element} and an initial object 0 = {empty set}, and well-defined (up to canonical isomorphism) addition and multiplication (respectively disjoint union and cartesian product). These generalize in Vect to 1 = \R and 0 = {0}, and to direct sum and tensor product; if we identify n = "\R^n" (bad notation, because it's really n\R; I want n-dimensional space with a standard basis, so the space of column vectors), then it's especially clear that sums and products are as they should be. So Vect is, well, not quite a rig (ring without negation), because nothing is defined uniquely, but some categorified version, where all I care is that everything be defined up to canonical isomorphism (so, generically, given by a universal property).

But I can do even better. To each vector space V is associated a dual space V^* = Hom_{Vect}(V,\R), and most of the time V^{**} = V. (I need to learn more linear algebra: I think that there are various kinds of vector spaces, e.g. finite-dim ones, for which this is true, and I think that there's something like V^* = V^{***}. If so, then I should certainly always pass immediately from V to V^{**}, or some such; I really want dualizing to be involutive.) By equals, of course, I always mean "up to a canonical isomorphism". Now, V\tensor V^* = Hom(V,V) is rather large, but there is a natural map Trace:Hom(V,V)\to\R, and this allows us to define a particular product "." which multiplies an element v\in V with w\in V^* by v.w = Tr(v\tensor w). Then . is multilinear, as a product ought to be, and we can thus consider V.V^* = \R. Indeed, we can imagine some object 1/V that looks like V^* (a physicist wouldn't be able to tell the difference, because their elements are the same) so that V \tensor 1/V = \R. (Up to canonical isomorphism. It's not, of course, clear which copy of V we should contract 1/V with in V\tensor V. But either choice is the same up to canonical isomorphism.) There is even a natural trace from, say, \Hom(2,4) \to 2 (take the trace of the two 2x2 squares that make up the 4x2 matrices), "proving" that 4/2 = 2.
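As a sanity check on the contraction v.w = Tr(v\tensor w), here is a small sketch in coordinates, using plain lists for vectors and matrices. The helper names (outer, trace, pair, block_trace) and the numbers are hypothetical, chosen only for illustration, and the block-trace reading of \Hom(2,4) \to 2 is my interpretation of "the two 2x2 squares".

```python
def outer(v, w):
    """Matrix of the rank-one operator v (x) w, with entries v_i * w_j."""
    return [[vi * wj for wj in w] for vi in v]

def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))

def pair(v, w):
    """The usual evaluation of the covector w on the vector v."""
    return sum(vi * wi for vi, wi in zip(v, w))

def block_trace(m):
    """Split a 4x2 matrix into its top and bottom 2x2 blocks and trace each."""
    top, bottom = m[:2], m[2:]
    return [trace(top), trace(bottom)]

# Hypothetical sample data.
v = [1.0, 2.0, 3.0]
w = [4.0, -1.0, 0.5]
contraction = trace(outer(v, w))   # Tr(v (x) w)
evaluation = pair(v, w)            # w(v)

m = [[1, 2], [3, 4], [5, 6], [7, 8]]   # an element of Hom(2, 4)
halves = block_trace(m)                # an element of "2"
```

The two numbers contraction and evaluation coincide, which is all the product "." is doing in a basis.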

So it seems that, well, Vect is not a division rig, but it naturally extends to one. But what about that n in "ring"? What about negative dimensions? This I don't know.

See, it's an important question. Because, consider the tensor algebra T^{.}(V) = \R + V + V\tensor V + ... — this is an \N-graded algebra of multilinear functions on V^*. This looks an awful lot like the uncategorified 1+x+x^2+..., which we know is equal to 1/(1-x). (Why? Because (1-x)(1+x+...) = 1-x+x-x^2+x^2-... = 1, since every term cancels except for the -x^\infty, which is way off the page.) Anyhoo, so we ought to write the tensor algebra as 1/(1-V).
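The cancellation in parentheses can be watched happening on a truncated series. The sketch below is only bookkeeping on coefficient lists (poly_mul, tensor_dims and the cutoff N are names I made up for the example): multiplying 1-x by 1+x+...+x^N leaves 1 and the single uncancelled term -x^{N+1}, and the graded pieces of the tensor algebra of an n-dimensional space have dimensions n^k.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def tensor_dims(n, top_degree):
    """Dimensions of the degree-0..top_degree pieces of T(V) when dim V = n."""
    return [n**k for k in range(top_degree + 1)]

N = 5                                    # hypothetical truncation order
geometric = [1] * (N + 1)                # 1 + x + ... + x^N
product = poly_mul([1, -1], geometric)   # (1 - x)(1 + x + ... + x^N)
dims = tensor_dims(2, N)                 # graded dimensions for dim V = 2
```

Every middle coefficient of product is zero; the lone -1 sits in degree N+1 and moves "off the page" as N grows.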

Which doesn't make any sense at all. 1-V? Well, we might as well define 1-V as dual to the tensor algebra: there should be a natural way to contract any element of 1-V with any multilinear function on V^*. But this has a much shorter algebraic expression, which ought to have Platonic meaning. So, what's a natural object that we can construct out of V that contracts (linearly) with all multilinear functions to give real-valued traces?

If we could answer this, then perhaps we could find out what -V is. How? Not, certainly, by subtracting 1=\R from 1-V. No, I suggest that whatever our proposal be, we then try it on 1-2V = (T^.(V+V))^* = 1/(\R + V+V + (V+V)\tensor(V+V) + ...), and compare. What ought to happen is that there should be some natural object W such that 1-2V = W + 1-V, and it should turn out that 1-V = 1 + W. Whatever the case is, there should be a natural operation that "behaves like +" such that 1-V + V = 1. It's certainly not standard direct sum, just like how V \times 1/V is not the standard tensor product. But it should be like it in some appropriate sense. Most necessarily, it should satisfy linearity: if v_1,v_2\in V and w_1,w_2\in W, then v_1+w_1 and v_2+w_2 \in V+W should sum to (v_1+v_2)+(w_1+w_2). And, of course, if you have the right definition, then all the rest of arithmetic should work out: 1/(-V) = -(1/V), -V = -\R \times V, (-V)\times W = -(V\times W), and, most importantly, --V = V (up to canonical isomorphism).

One can go further with such symbolic manipulation. You've certainly met the symmetric tensor algebra S^{.}(V) of multilinear symmetric tensors, and you've probably defined each graded component S^{n}(V) as V^{\tensor n} / S_n, where by "/ S_n" I mean "modulo the S_n action that permutes the components in the n-times tensor product." (If you are a physicist, you probably defined the symmetric tensors as a _subspace_ of all tensors, rather than a quotient space, but this is ok, because the S_n identification generates a projection operator Sym: \omega \mapsto (1/n!)\sum_{\pi\in S_n} \pi(\omega), and so the subspace is equal to the quotient. At least when the characteristic of the ground field is 0.) Well, S_n looks an awful lot like n!, so the symmetric algebra really looks like 1 + V + V^2/2! + ... = e^V. Which is reasonable: we can naturally identify S^{.}(V+W) = S^{.}V\tensor S^{.}W.

It's not quite perfect, though. The dimension of S^{.}V, if dim V = n, is not e^n, but 1 + n + n(n+1)/2 + n(n+1)(n+2)/6 + ..., which is only correct in the limit n\to\infty. Well, so why is that the dimension? When we symmetrize v\tensor w to 1/2(vw+wv), we generically identify different tensors. But v^2 symmetrizes to itself. Baez, though, says how to think about this: when a group action does not act freely, we should think of points like v^2 as only going to "half" points. So, for example, the group 2 can act on the vector space \R in a trivial way; we should think of \R/2 as consisting of only "half a dimension".
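The mismatch between the true dimensions and the e^n guess is easy to tabulate. In this sketch, sym_dim uses the standard count C(n+k-1, k) for the degree-k symmetric power of an n-dimensional space (which reproduces n(n+1)/2 and n(n+1)(n+2)/6 above), and naive_dim is the "divide by k!" guess n^k/k!; both function names are mine.

```python
from math import comb, factorial

def sym_dim(n, k):
    """Dimension of the k-th symmetric power of an n-dimensional space."""
    return comb(n + k - 1, k)

def naive_dim(n, k):
    """The guess n^k / k! suggested by reading 'mod S_k' as 'divide by k!'."""
    return n**k / factorial(k)

# Degree 2: n(n+1)/2 versus n^2/2. The ratio is (n+1)/n, which tends to 1.
small_ratio = sym_dim(3, 2) / naive_dim(3, 2)        # 6 / 4.5
large_ratio = sym_dim(1000, 2) / naive_dim(1000, 2)  # 500500 / 500000
```

For fixed degree the ratio tends to 1 as n grows, which is the sense in which e^n is "only correct in the limit".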

Anyway, the point is that we can divide by groups, and this is similar to our division by (dual) vector spaces: in either case, we are identifying, in a linear way, equivalence classes (either orbits or preimages).

Now, though, it becomes very obvious that we need to extend what kinds of spaces we're considering. Groups can act linearly in lots of ways, and it's rare that the quotient space is in fact a vector space. Perhaps the physicists are smart to confuse fixed subspaces and quotients: it restricts them just to projection operators. But, for instance, if we mod out \C by 2 = complex conjugation (which is real-linear, although not complex-linear), do we get \R or some more complicated orbifold? Is there a sense in which \R/2 + \R/2 = \R, where 2 acts by negation? \R/2 is the ray, so perhaps the direct sum model works, but you don't naturally get \R, just a one-dim space? To give interesting physics, it would be nice if these operations really did act on the constituent parts of each space. And what about dividing by 3? Every field of characteristic other than 2 has a nontrivial square root of 1 (namely -1), but of \R and \C only \C has nontrivial nth roots of 1 for every n. So perhaps we really should just work with Vect of \C-linear spaces. Then we can always mod out by cyclic actions, but we don't normally get vector spaces.

Of course, part of the fun of categorifying is that there are multiple categorical interpretations of any arithmetic object: 6 may be the cyclic group C_6 = C_2 \times C_3, but 3! is the symmetric group S_3, and the groups 4 and 2x2 are also unequal. But if we come up with a coherent-enough theory, we ought to be able to make interesting discussion of things like square roots: there's an important sense in which the square root of a Lorentz vector is a spinor, and we should be able to say (1+V)^{1/2} = 1 + (1/2)V + (1/2)(-1/2)V^2/2 + (1/2)(-1/2)(-3/2)V^3/3! + ....
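At the uncategorified level, the square-root series can at least be checked with exact rational arithmetic: squaring the truncated series should give back 1 + x up to the truncation order. This is a minimal sketch (binomial_coeffs and truncated_square are illustrative names, and six terms is an arbitrary cutoff); it says nothing about what V^n/n! should mean for actual vector spaces.

```python
from fractions import Fraction

def binomial_coeffs(alpha, terms):
    """First `terms` coefficients of (1 + x)^alpha: alpha(alpha-1)...(alpha-k+1)/k!."""
    coeffs = [Fraction(1)]
    for k in range(1, terms):
        coeffs.append(coeffs[-1] * (alpha - (k - 1)) / k)
    return coeffs

def truncated_square(c):
    """Coefficients of the square of the series, kept up to the same order."""
    return [sum(c[i] * c[k - i] for i in range(k + 1)) for k in range(len(c))]

half = binomial_coeffs(Fraction(1, 2), 6)
# half starts 1, 1/2, -1/8, 1/16, ... matching the series in the text.
square = truncated_square(half)
# square is 1 + x with zeros in degrees 2 through 5.
```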

Overall, the move from Set to Vect deserves to be called "quantization" — well, really quantization doesn't yield vector spaces but (complex) Hilbert spaces, so really it should be the forgetful functor Set \to Hilb. If we have a coherent theory of how to categorify from numbers to Set, then it should match our theory of how to categorify from numbers to Hilb. And, ultimately, we should be able to understand all of linear algebra as even more trivial than how we already understand it: linear algebra is simply properly-categorified arithmetic.

10 October 2006

Tangent vectors and their fields

Voice-over: "Last time, on Blogging My Classes, Blogging My Fields,"
Screen flashes with images of surfaces and atlases. Main character says something cliche (but stunning because of the background music) about the definition of the manifold. Then screen switches to the final scene: The Scalar Bundle.
Voice-over: "And now, the continuation."

Classically, the tangent bundle T(M) to a manifold M was defined by taking equivalence classes of (parameterized) curves at each point, equivalent if they're tangent there. Slightly more universally, we can take our atlas of patches, consider on each patch the (locally trivial) bundle of tangent spaces to \R^n, and then mod out by the transition functions between patches. But there is a better, more algebraic way to develop tangent vectors, directly from the sheaf of differentiable functions.

Within the space of linear functionals on C^\infty(M), consider those that are "derivations at the point p": l:C^\infty(M)\to\R should satisfy, for all f,g, l(fg) = f(p)l(g) + g(p)l(f). Of course, a derivation at a point sends constant functions to zero, and one can check that derivations at points don't care about the value of the function outside a nbhd of the point, by considering bump functions. Given a coordinate patch x^i, the n derivations \d/\d x^i |_p (derivative in the i'th direction, evaluated at p) are examples, and it turns out that these form a basis for the (linear) space of derivations at p. (This is not entirely obvious. In coordinates, it comes from the fact that I can write any f(x):\R^n\to\R as f(x) = f(0) + \sum g_i(x) x^i (in a small nbhd of 0), by letting h_x(t) = f(xt), so that f(x) - f(0) = \int_0^1 h_x'(t) dt, and thus g_i(x) = \int_0^1 (\d f/\d x^i)(xt) dt.) So we have, given an n-dimensional manifold, n dimensions worth of derivations at each point.
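For the record, here is the computation behind that parenthetical, written out as a sketch in coordinates centered at p = 0. I am assuming f is smooth on a star-shaped nbhd of 0, so that the segment from 0 to x stays in the domain, and I write \partial for the \d above.

```latex
\begin{align*}
f(x) - f(0) &= h_x(1) - h_x(0) = \int_0^1 h_x'(t)\,dt
  = \int_0^1 \sum_i x^i \,\frac{\partial f}{\partial x^i}(tx)\,dt \\
  &= \sum_i x^i g_i(x),
  \qquad g_i(x) = \int_0^1 \frac{\partial f}{\partial x^i}(tx)\,dt,
  \qquad g_i(0) = \frac{\partial f}{\partial x^i}(0). \\
\intertext{Applying a derivation $l$ at $0$, and using that $l$ kills constants and that $x^i(0) = 0$:}
l(f) &= l\bigl(f(0)\bigr) + \sum_i \Bigl( g_i(0)\, l(x^i) + x^i(0)\, l(g_i) \Bigr)
  = \sum_i l(x^i)\, \frac{\partial f}{\partial x^i}(0).
\end{align*}
```

So l is the linear combination of the \d/\d x^i |_0 with coefficients l(x^i), which is the claim that the partials span the derivations at the point.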

Now, intuitively, any tangent vector gives a derivation at its basepoint, by differentiating the function "in the direction of the vector". And, intuitively, there are n dimensions worth of tangent vectors. So we can define a tangent vector at p to be a derivation at p.

Thus, it's clear that a vector field is exactly a derivation: a field worth of derivations, one at each point. Indeed, any derivation — an algebraically-defined object, exactly a linear operator L from C^\infty(M) to itself that satisfies the Leibniz rule L(fg) = L(f)g + fL(g) — gives us a derivation at each point: L_p(f) = L(f)(p). (And, by chasing definitions, two derivations agree iff they agree at each point.) More generally, we can talk about the sheaf of (tangent) vector fields, by asking about derivations of functions defined only on various open sets.
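A toy version of this on the line, with polynomials standing in for C^\infty functions: the operator L = a(x) d/dx is a derivation, and evaluating at a point p gives a derivation at p. This is only a sketch under those simplifying assumptions; polynomials are coefficient lists (lowest degree first), and every helper name here is made up for the example.

```python
def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    """Sum of two polynomials."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def ddx(p):
    """Ordinary derivative of a polynomial."""
    return trim([k * c for k, c in enumerate(p)][1:] or [0])

def evaluate(p, x):
    """Value of the polynomial p at the number x."""
    return sum(c * x**k for k, c in enumerate(p))

def vector_field(a):
    """The derivation L = a(x) d/dx acting on polynomials."""
    return lambda f: mul(a, ddx(f))

def at_point(L, p):
    """The derivation at p induced by L: f |-> L(f)(p)."""
    return lambda f: evaluate(L(f), p)

# Hypothetical sample data: L = x d/dx, f = 1 + 2x + x^3, g = 3x + x^2.
L = vector_field([0, 1])
f = [1, 2, 0, 1]
g = [0, 3, 1]

leibniz_lhs = L(mul(f, g))                   # L(fg)
leibniz_rhs = add(mul(L(f), g), mul(f, L(g)))  # L(f)g + fL(g)

L2 = at_point(L, 2)                          # the derivation at p = 2
pointwise_lhs = L2(mul(f, g))
pointwise_rhs = evaluate(f, 2) * L2(g) + evaluate(g, 2) * L2(f)
```

Both the global Leibniz rule and the pointwise rule l(fg) = f(p)l(g) + g(p)l(f) hold in this example, as they must.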

It's worth now mentioning cotangent vectors, and specifying some notation. Of course, for any vector space (e.g. T_p(M), the tangent vectors at p), we can define the dual space (of linear functionals). By linear algebra, if dim(V)<\infty, then the dual space has the same dimension; given a basis, we can construct a dual basis. Working now with manifolds, given any function f, I can get a cotangent field df defined by df(v) = v[f], where we think of v as a derivation. In particular, by the claim I made above about being able to write f in some local normal form, given a coordinate system x = {x^i} on a nbhd U, it's clear that the {dx^i} are a basis for the space of sections of T^*(U) as a module over C^\infty(U). (Similarly, the partials \d/\d x^i are a basis of {sections of T(U)} as a module over functions.)

Following the physicists' convention, I will usually just write p_i for the cotangent field p_i dx^i, and similarly I will usually just write q^i for the vector field q^i \d/\d x^i. (Continuing the summation convention.) This works, because dx^i \d/\d x^j = \delta^i_j, so (p_i dx^i)(q^j \d/\d x^j) = p_i q^j \delta^i_j = p_i q^i, so the dot-products work out right. This is only because I happen to be using a basis and its dual basis. Eventually, I may redefine the index conventions truly coordinate-independently, but for now let's maintain the convention that whenever we interpret our formulas in terms of coordinates, we always use dual bases for T and T^*.

Next time, I'd like to talk more about tensors, metrics, and similar structures. In particular, I'd like to define the Lorentz group and classify its representations.

09 October 2006

A new class of entries

I think I might like to spend some time thinking about definitions in mathematical physics. What is a quantum field, for instance? Physicists usually give a slightly incoherent answer: a quantum field is a quantum particle at every point, just like a field is a number at every point. You ask them to unpack this a bit, and some might remember that there may be global — what the physicists call "topological" — issues with such a definition, but for now let's only be concerned with the local definition, where a field is a function. So what should a quantum field be?

Conveniently, I'm taking three classes right now on related questions: Differential Geometry, Geometric Methods to ODEs, and Quantum Field Theory. I would like to start a series of entries blogging those classes, and relating them back to such foundational questions. I hope to get to answers involving infinitesimals: Robinson's "Non-standard Analysis", or Kock's "Synthetic Differential Geometry". I don't have the answers yet.

What's most important about fields is their geometric nature. Like the physicists and the classical differential geometers, I may from time to time refer to coordinates, but ultimately I'd like a coordinate-invariant picture, indeed one without coordinates at all. I also hope to ask and answer questions about how to regularize our fields, by which I mean "how continuous should they be?" This is an extremely non-trivial question: not only is it extremely unclear how to demand that two "nearby" "quantum particles" be "similar" (we can demand as much of classical fields: for any epsilon, there should be a delta at each point so that within the delta ball at that point the fields don't vary more than epsilon; perhaps we should find the right metric on Schrodinger-quantized particles?), but the physicists don't even want to be stuck with, say, C^\infty fields. They want \delta functions to work within their formalism. And yet they adamantly refuse to consider "pathological" fields that are too "wildly varying".

Eventually, it would be nice also to understand the Lagrangian and Hamiltonian, and this almost-symmetry between position and momentum. For now, I'd like to end this entry with some basic definitions.

Manifolds: There are many equivalent definitions of a manifold. Since the physicists and classical geometers like to work with coordinates (replacing geometry-defined, invariant objects with coordinate-defined, covariant objects), I'll use the definition that mentions coordinates explicitly. A manifold is a (metrizable) topological space M with a maximal atlas: to each "small" open set U in M we assign a collection of "coordinate patches" \phi:U\to\R^n, which should be homeomorphisms onto open subsets of \R^n, subject to some regularity condition: if \phi:U\to\R^n and \psi:V\to\R^n, then \phi\psi^{-1} should be, say, smooth wherever it's defined. In general, modifying the word manifold modifies the condition on \phi\psi^{-1}: a C^\infty manifold has that all the \phi\psi^{-1}'s are C^\infty, for example. I will generally be interested only in C^\infty (aka "smooth") manifolds, although once we understand what kinds of functions the physicists are ok with, we may change that restriction. For a manifold, I demand that the atlas be maximal in the sense that it list all possible coordinatizations consistent with the smoothness condition. It is, of course, sufficient to simply cover our space with (coherent) patches, defining the rest as all other possibilities.

So that we can generalize this definition if we need to, it would be nice to reword this definition in the language of sheaves. The god-given structure on a smooth manifold is exactly enough to tell which functions are differentiable: a sheaf is a topological space along with a ring of "smooth functions" on each open set, so that the function rings align coherently. (In full glory, a sheaf is a contravariant functor from the category of open sets in the space to the category of commutative \R-algebras, or of whatever your sheaf is of, along with some "local" axioms, which ultimately say that to know a function I need exactly to know it on an open cover.) I probably won't use this description, largely because I don't know what other conditions I would want to put on my sheaf in order to make it into something like a smooth manifold. Clearly every manifold generates a sheaf, and I have it on good authority that if two manifolds have the same sheaf, then they are the same manifold.

So what about our most-important of objects: a field? A field is a "section" of a "bundle".

Let's start with the latter of those undefined words. To each point p\in M, we associate a (for now) vector space V_p, called the "fiber at p". And let's (for now) demand some isotropy: V_p should be isomorphic to V_q for any given p and q in M, although not necessarily canonically so. (When we move to the realm of infinite-dimensional fibers, we may demand only that the fibers be somehow "smoothly varying" — I'm not sure yet how to define this. So long as everything is finite-dimensional, the isomorphism class of a fiber is determined by an integer, and integers cannot smoothly vary, so it suffices to consider bundles where the dimension of the fibers is constant.)

There should be some sort of association between nearby fibers: locally (on small neighborhoods U) the bundle should look like U\times V. So I ought to demand that the bundle be equipped with a manifold structure, which aligns coherently with M: a bundle E is a manifold along with a projection map \pi_E : E\to M, such that the inverse image of each point is a vector space. This is the same as saying that among the coordinate patches in E's atlas, there are some of the form \Phi: \pi^{-1}(U) \to \R^{n+k} (where, of course, n is the dimension of M and k is the dimension of each fiber), so that \Phi = (\phi,\alpha), where \phi is a coordinate patch on M and \alpha is linear on each fiber. We can naturally embed M\into E by identifying each point p\in M with (p,0) in E (where 0 is the origin of the fiber at p).

I will soon make like a physicist and forget about global issues, but I do want to provide one example of why global issues are important: the cylinder and the Möbius strip are both one-dimensional ("line") bundles over the circle. The latter has a "twist" in it: as you go around the circle, you come back with an extra factor of -1.

So what's a section of a bundle? A (global) section is a map s:M\to E so that \pi s:M\to M is the identity, i.e. a section picks out one vector from each fiber. We will for now think of our sections as being C^\infty.

The most important kinds of fields are "scalar" fields, by which I mean exactly functions, i.e. a number at every point. I want to do this because I want to consider other spaces of fields as modules over the ring of scalar fields, so I need to be able to multiply. Of course, there are many times when I don't want a full-fledged scalar field. The potential energy, for instance, is only defined up to a constant: I will eventually need my formalism to accommodate objects that have fields as derivatives, but aren't fields themselves. Since potentials don't care about constants, we could imagine that after going around a circle we measure a different potential energy than we had to begin with, but that we never picked up any force. The string theorists, in fact, need similar objects: locally, string theory looks like (conformal) field theory on the string's worldsheet. But perhaps the string wraps around a small extra dimension? This is why in the previous paragraph I refer to "global" sections: I really ought to allow myself a whole sheaf of fields, understanding that sometimes I want to work with fields that are only defined in a local area. But the physicists are generally clever about this type of problem, so, at the risk of saying things that we might think generalize but actually don't, I'm going to restrict my attention to scalar fields.

In which case, yes, by "scalar field" I mean "function from M \to \R". "A section of M\times\R". "A number at each point". Those who prefer to start with the sheaf of scalar fields will be happy to know that, when I define tangent vectors and their relatives in the next entry, I will start with these scalar fields.