09 November 2006

Tensors and Hamiltonians

I seem to have fallen way behind in writing about my classes. In particular, it may be a while yet before I do any quantum mechanics; I'm more excited by my classical geometry. But perhaps I will move into the quantum world soon. I almost understand it.

In the last few weeks, my classes have defined forms and fields, integration, chains, Lie groups, and Riemannian manifolds; quantum fields, fermions, SUSY, and Feynman diagrams; structural stability, Anosov flow, and a world's worth of material in Yasha Eliashberg's class. Yasha pointed out in one lecture, "It is impossible to be lost in my class, because I keep changing topics every two minutes."

But I'm trying to provide a unified account of such material in these pages, and I last left you only with fields of tangent vectors. So, today, tensors, differential forms, and Hamiltonian mechanics.

Remember where we were: we have a smooth manifold M with local coordinates x^i, over which we can build two extremely important bundles, called T(M) and T^*(M). T(M) is the space of "derivations at a point": on each fiber we have a basis \d/\dx^i and coordinates \dot{x}^i. T^*(M) is dual to T(M) fiberwise: its basis is dx^i and its coordinates are p_i. But from these we can build all sorts of tensor bundles.

I touched on tensors in my last post, but hardly defined them. They are, however, a straightforward construction. A tensor is two vectors set next to each other. Or almost. If I have two vectors v\in V and w\in W, I can take their tensor product v\tensor w: I define \tensor to be bilinear, and that's all. V\tensor W is then generated by all the possible v\tensor w. More precisely, V\tensor W is the universal object through which bilinear maps from V\times W factor: there's a canonical bilinear map V\times W \to V\tensor W so that any bilinear map from V\times W to \R factors as this map followed by some linear map V\tensor W \to \R.

If you haven't seen tensors before, this definition probably only made things worse, so let me say some other words about tensors. (i) The product \tensor is bilinear: (av)\tensor w = a(v\tensor w) = v\tensor(aw), and (v_1 + v_2)\tensor w = v_1\tensor w + v_2\tensor w, and the same on the other side. Thus \R\tensor V is canonically isomorphic to V. (ii) If {v_i} and {w_j} are bases for V and W, then {v_i\tensor w_j} is a basis for V\tensor W. It is in this way that \tensor correctly generalizes \times from Set to Vect.
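For finite-dimensional spaces, all of this is concrete enough to check by machine. Here's a quick sketch (the numbers and names are mine, not part of the mathematics), modeling \tensor with numpy's Kronecker product:

```python
# A sketch (numbers mine): the tensor product of finite-dimensional spaces,
# modeled concretely with numpy's Kronecker product.
import numpy as np

v = np.array([1.0, 2.0])          # v in V, dim V = 2
w = np.array([3.0, 5.0, 7.0])     # w in W, dim W = 3
vw = np.kron(v, w)                # v tensor w, in a (2*3)-dimensional space

# (i) bilinearity: (a v) tensor w = a (v tensor w) = v tensor (a w)
a = 4.0
assert np.allclose(np.kron(a * v, w), a * vw)
assert np.allclose(np.kron(v, a * w), a * vw)

# (ii) {v_i tensor w_j} is a basis: the 6 products of basis vectors span R^6
e, f = np.eye(2), np.eye(3)
basis = np.array([np.kron(e[i], f[j]) for i in range(2) for j in range(3)])
assert np.linalg.matrix_rank(basis) == 6
```

Note that a generic element of the 6-dimensional space is a sum of such products, not a single product, exactly as in the abstract definition.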

We will be primarily interested in tensors built only from V and V^*, i.e. from vectors and dual vectors. Even more, when we move to bundles, we will be interested only in tensors over T(M) and T^*(M). Of course, we can canonically commute V^* past V, so all our tensors might as well live in V \tensor ... \tensor V \tensor V^* \tensor ... \tensor V^*, for various (possibly 0) numbers of Vs and V^*s. Some notation: if there are n Vs and m V^*s, I will write this (tensor) product as \T^n_m(V). \T^0_0 = \R; \T^1_0 = V.

How should you write these? As birdtracks, a name which Penrose even seems to be adopting. For these, and in my (and many physicists') notation, draw vectors with upward-pointing "arms" (since we write them with raised x^i) and dual vectors with downward-pointing "legs" (indices are lowered). The order of the arms matters, as does the order of the legs, but the canonical commutation referred to in the previous paragraph is explicit. To multiply two vectors, just draw them next to each other; in general, any tensor is the sum of products of vectors, but not necessarily just a product, so in general tensors are just shapes with arms and legs.

Birdtracks are an exquisite way to keep track of what's called "tensor contraction". See, what's important about dual vectors is that they can "eat" vectors and "spit out" numbers: there is a canonical pairing from V\tensor V* = \T^1_1 \to \R. We can generalize this contraction to any tensors, if we just say which arm eats which leg. In these notes, drawing birdtracks is hard; I will use the almost-as-good notation of raised and lowered indices. We can define \T^{-1} as being basically the same as \T_1, except that it automatically contracts in tensor products; this breaks associativity.
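In index notation, contraction is just summation over a shared raised/lowered index, which numpy's einsum expresses directly. A small sketch (the example tensors are mine):

```python
# A sketch (example tensors mine): contraction as summation over a shared
# index, via numpy's einsum.
import numpy as np

rng = np.random.default_rng(0)

# the canonical pairing V tensor V* -> R: p_i v^i
v = rng.standard_normal(3)            # v^i, an "arm"
p = rng.standard_normal(3)            # p_i, a "leg"
assert np.isclose(np.einsum('i,i->', p, v), p @ v)

# a mixed tensor T^{ij}_k in T^2_1: contract the second arm with the leg
T = rng.standard_normal((3, 3, 3))
S = np.einsum('ijj->i', T)            # S^i = T^{ij}_j, a tensor in T^1_0
assert S.shape == (3,)
```

The einsum string plays exactly the role of the birdtrack diagram: it says which arm eats which leg.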

So, our basic objects are sections of \T^n_m(T(M)), by which I mean fields of tensors. A few flavors of tensors deserve special mention.

On \T^n(V) we can impose various relationships. In particular, there are two important projection operators, Sym and Ant. There is a canonical action of S_n on \T^n(V): \pi\in S_n sends a basis element e_{i_1}\tensor...\tensor e_{i_n} to e_{i_{\pi(1)}}\tensor...\tensor e_{i_{\pi(n)}}. Extending this action linearly, we can construct Sym and Ant by
Sym(\omega) = (1/n!) \sum_{\pi\in S_n} \pi(\omega)
Ant(\omega) = (1/n!) \sum_{\pi\in S_n} \sgn(\pi) \pi(\omega)

where \sgn(\pi) is 1 if \pi is an even permutation, -1 if it is odd. These definitions, of course, also work for \T_n. These are projection operators (Ant^2 = Ant and Sym^2 = Sym), so we can either quotient by their kernels or just work in their images, it doesn't matter. Define \S^n = Sym(\T^n) and \A^n = Ant(\T^n), and similarly for lowered indices. We have symmetric and "wedge" (antisymmetric) multiplication given by, e.g., \alpha\wedge\beta = Ant(\alpha\tensor\beta); each is associative. One can check that, if dim(V)=k, then dim(\S^n(V)) = \choose{k+n-1}{n} and dim(\A^n(V)) = \choose{k}{n}; of course, dim(\T^n) = k^n. Of particular importance: \T^2 = \S^2 + \A^2, where I will always use "+" between vector spaces to simply mean "direct sum" (which correctly generalizes disjoint union of bases).
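These formulas are easy to implement and check. Here's a sketch (my own illustration), storing a tensor in \T^n(\R^k) as a numpy array with one axis per slot, so that \pi\in S_n acts by permuting axes:

```python
# A sketch (illustration mine): Sym and Ant on T^n(V), with tensors stored
# as numpy arrays with n axes; pi in S_n acts by permuting the axes.
import numpy as np
from itertools import permutations
from math import factorial

def sgn(p):
    # sign of a permutation, by counting inversions
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def sym(T):
    n = T.ndim
    return sum(np.transpose(T, p) for p in permutations(range(n))) / factorial(n)

def ant(T):
    n = T.ndim
    return sum(sgn(p) * np.transpose(T, p) for p in permutations(range(n))) / factorial(n)

rng = np.random.default_rng(1)
T = rng.standard_normal((4, 4))          # a 2-tensor over V = R^4

assert np.allclose(sym(sym(T)), sym(T))  # Sym^2 = Sym
assert np.allclose(ant(ant(T)), ant(T))  # Ant^2 = Ant
assert np.allclose(sym(T) + ant(T), T)   # T^2 = S^2 + A^2 (special to n = 2)
```

The last assertion is special to n = 2; for n > 2 the symmetric and antisymmetric pieces don't exhaust \T^n, which is where the other S_n representations live.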

From now on, V will always be T(M), and I'm now interested in fields. I will start using \T, \A, and \S to refer to the spaces of fields. We will from time to time call on tensor fields in \S_n, but those in \A_n end up being more important: we can understand them as differential forms. I may in a future entry try to understand differential forms better; for now, the following discussion suffices.

To each function f\in C^\infty(M) = \T_0 = \A_0 we can associate a canonical "differential": remembering that v\in \T^1 acts as a differential operator, we can let df\in\T_1 be the dual vector (field) so that df_i v^i = v(f). Even more simply, f:M\to\R, so it has a ("matrix") derivative Df: TM\to T\R. But T\R is trivial, and indeed we can canonically identify each fiber just with \R. So df = Df composed with this projection T\R \to \R along the base (preserving only the fiber). In coordinates, df = \df/\dx^i dx^i. It's tempting to follow the physicists and write d_i = \d/\dx^i, since the set of these "coordinate vector fields" "transforms as a dual vector".
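In coordinates, the defining property df_i v^i = v(f) is a one-liner to verify symbolically. A sketch with sympy (the particular f and v are made up):

```python
# A sketch with sympy (f and v made up): the dual vector df = (df/dx^i) dx^i
# pairs with a vector field v = v^i d/dx^i to give v(f).
import sympy as sp

x, y = sp.symbols('x y')
coords = (x, y)

f = x**2 * sp.sin(y)                       # a function on M
df = [sp.diff(f, c) for c in coords]       # components df_i
v = [y, x]                                 # components v^i of a vector field

vf = sum(vi * sp.diff(f, c) for vi, c in zip(v, coords))  # v acting on f
pairing = sum(dfi * vi for dfi, vi in zip(df, v))         # df_i v^i
assert sp.simplify(vf - pairing) == 0
```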

From this, we can build the "exterior derivative" d:\A_r\to\A_{r+1} by declaring that d on functions is the differential above, that d(df) = 0, and that if \alpha\in\A_r and \beta\in\A_s, then d(\alpha\wedge\beta) = d\alpha \wedge \beta + (-1)^r \alpha \wedge d\beta. This will not be important for the rest of my discussion today; I may revisit it. But I may decide that I prefer the physicists' \d_i = \d/\dx^i, which acts on tensors of any stripe and satisfies the normal Leibniz rule. We'll see.
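The fact hiding in d(df) = 0 is just the symmetry of mixed partials: the components of d(df) are the antisymmetrized second derivatives \d_i\d_j f - \d_j\d_i f. A quick sympy check (the example function is mine):

```python
# A quick sympy check (example function mine): the components of d(df) are
# d_i d_j f - d_j d_i f, which vanish for smooth f by equality of mixed partials.
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)
f = sp.exp(x * y) + z * sp.cos(x)

ddf = [[sp.diff(f, a, b) - sp.diff(f, b, a) for b in coords] for a in coords]
assert all(sp.simplify(c) == 0 for row in ddf for c in row)
```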

So, what can we do with tensors? We could pick a nondegenerate positive definite element of \S_2; such a critter is called a "Riemannian metric". I won't talk about those right now (I may come back to them in a later entry). Instead, I'm interested in their cousins, symplectic forms: nondegenerate fields \omega in \A_2. By nondegenerate, I mean that for any non-zero vector y there's a vector x so that \omega(x,y)\neq 0. Thus symplectic forms, like metrics, give isomorphisms between V and V^*, and so \omega_{ij} has an inverse form \omega^{ij}\in\A^2 so that \omega_{ij}\omega^{jk} = \delta_i^k \in \T^1_1.

Some observations about symplectic forms are immediate, and follow just from linear algebra. (i) Symplectic forms only live in even dimensions. (ii) We can find local coordinates p_i and q^i so that \omega = dp_i\wedge dq^i. (Strictly, linear algebra only gives this at a single point; getting it in a whole neighborhood is Darboux's theorem, which uses the closedness condition mentioned in the Edit below.) (iii) In 2n-dimensional space, the n-fold wedge power \omega^n \in \A_{2n} is a nowhere-zero volume form, so symplectic forms only live on orientable manifolds.
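Observation (i) is pure linear algebra: if A is the matrix of an antisymmetric form in dimension k, then det(A) = det(-A^T) = (-1)^k det(A), so for odd k the determinant vanishes and the form is degenerate. A numerical sanity check (example mine):

```python
# A numerical sanity check (example mine): an antisymmetric matrix in odd
# dimension has determinant zero, so no symplectic form can exist there.
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
A = B - B.T                                 # antisymmetric: A^T = -A
assert np.isclose(np.linalg.det(A), 0.0)    # degenerate in odd dimension
```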

It's observation (ii) that gives us our first way of tackling symplectic forms, because we have a canonical example: the cotangent bundle of a manifold. If dim(M) = n, then T^*M is 2n-dimensional and has a canonical form \omega = dp_i \wedge dq^i, where the q^i are "position" coordinates in M and the p_i are the "conjugate momentum" coordinates in the fibers. You can show that this coordinate formula for \omega transforms covariantly under changes of coordinates, so the form is well-defined; better is an invariant geometric meaning. And we can construct one. Let v be a vector in T(T^*M), based at some point (x,y)\in T^*M, i.e. x\in M and y a covector at x. Then \pi: T^*M \to M projects down along fibers, so we can push forward v to \pi_*(v) \in TM. Now let y act on this tangent vector. This defines a form \alpha\in T^*(T^*M) = \A_1(T^*M): \alpha(v_{x,y}) = y.\pi_*(v). In local coordinates, \alpha = p_i dq^i. Then we can differentiate \alpha to get \omega = d\alpha \in \A_2; one can show that d\alpha is everywhere nondegenerate.

Why do we care? Well, let's say we're solving a mechanics problem of the following form: we have a system, in which the position of a state is given by some generalized coordinates q^i, and momentum by p_i. Then the total phase space is T^*(position space). And to each state, let's assign a total energy H = p^2/2m + U(q). We're thinking of p^2/2m as the "kinetic energy" and U(q) as the "potential energy". More generally, we could imagine replacing p^2/2m by any (positive definite symmetric) a^{ij} p_i p_j / 2; of course, we ought to pick coordinates to diagonalize a^{ij}, but c'est la vie. (And, of course, a^{ij} may depend on q.) Then our dynamics should be given by
\dot{q} = p/m = \dH/\dp
\dot{p} = -\dU/\dq = -\dH/\dq
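These two equations can be integrated numerically. Here's a sketch for the harmonic oscillator U(q) = kq^2/2, using a leapfrog step (all parameter values are made up), with conservation of H as a sanity check:

```python
# A sketch (parameter values made up): integrating Hamilton's equations for
# the harmonic oscillator, H = p^2/2m + k q^2/2, with a leapfrog step.
m, k, dt = 1.0, 1.0, 0.01
q, p = 1.0, 0.0                       # start displaced, at rest

def H(q, p):
    return p**2 / (2 * m) + k * q**2 / 2

E0 = H(q, p)
for _ in range(10_000):
    p -= 0.5 * dt * k * q             # half-kick:  pdot = -dH/dq = -k q
    q += dt * p / m                   # drift:      qdot =  dH/dp = p/m
    p -= 0.5 * dt * k * q             # half-kick

assert abs(H(q, p) - E0) < 1e-4       # the flow (nearly) conserves H
```

The leapfrog splitting is itself a symplectic map, which is why the energy error stays bounded instead of drifting.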

How can we express this in more universal language? A coordinate-bound physicist might be content writing \dot{q}^i = a^{ij}p_j and \dot{p}_i = -\dU/\dq^i. But what's the geometry? Well, we want this vector field X_H = (\dot{q},\dot{p})\in \T^1 = \A^1, and we have the derivative dH \in \A_1 = \T_1. It turns out that the relationship is exactly that for any other vector field Y, dH.Y = \omega(Y,X_H). I.e. dH is X_H contracted with \omega (into the second slot; since \omega is antisymmetric, the slot matters by a sign). For the physicists, letting \mu index the coordinates in T^*M, we have (X_H)^\mu = \omega^{\mu\nu} \dH/\dx^\nu.
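The physicists' recipe is concrete enough to check numerically: write down the matrix of \omega = dp\wedge dq in coordinates (q,p), invert it, and contract with dH. A numpy sketch (the Hamiltonian and the sample point are made up; sign conventions for \omega vary across texts, so I fix the one that reproduces Hamilton's equations above):

```python
# A sketch (example mine): recover Hamilton's equations from
# (X_H)^mu = omega^{mu nu} dH_nu, for H = (q^2 + p^2)/2.
import numpy as np

# coordinates x = (q, p); matrix of omega = dp ^ dq, so omega(d_q, d_p) = -1
Omega = np.array([[0.0, -1.0],
                  [1.0,  0.0]])
Omega_inv = np.linalg.inv(Omega)        # the inverse form omega^{mu nu}

def X_H(x):
    q, p = x
    dH = np.array([q, p])               # dH = (dH/dq, dH/dp) for H = (q^2 + p^2)/2
    return Omega_inv @ dH               # contract dH with the inverse form

q, p = 0.7, -0.3
assert np.allclose(X_H((q, p)), [p, -q])   # qdot = dH/dp, pdot = -dH/dq
```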

This is an important principle: the Hamiltonian, which knows only the total energy of a state, actually knows everything about the dynamics of the system (provided that the system knows which momenta are conjugate to which positions).

From here, we could go in a number of directions. One physics-ish question that deeply interests me is why our universe has this physics. The conventional story is that God created a position space, and that inherent to the notion of "position" is the notion of "conjugate momentum", and thus it would be natural to create a dynamics like we have. But what's entirely unclear to me is why our physics should be second-order at all. Wouldn't it be much more natural for God to start with just a manifold of states and a vector-field of dynamics? Perhaps God felt that a function is simpler than a vector field. But that's no matter: God could have just picked a manifold with a function and any nondegenerate element of \T^2, with which to convert d(the function) into the appropriate field. For instance, we could have had a Riemannian manifold with dynamics given by "flow in the direction of the gradient of some function".

No, instead God started with a symplectic manifold. Well, that's fine; I don't begrudge the choice of anti-symmetry. But then there's a really deep mystery: why, if fundamentally our state is given by a point in some symplectic manifold, do we distinguish between position and momentum? Certainly we can always find local coordinates in which \omega = dp\wedge dq, but for God there's some rotation between p- and q-coordinates that we don't see. The symmetry is broken.

Another direction we could go is to discuss this physics from the opposite direction, and I certainly intend to do so in a future entry. As many of you probably know, alternative to the Hamiltonian formalism is a similarly useful Lagrangian formalism for mechanics. Along with its "least action" principle, Lagrangian mechanics is a natural setting in which to understand Noether's Theorem and Feynman's Sum-Over-Paths QFT. At the very least, sometime soon I will try to explain the relationship between Lagrangians and Hamiltonians; it will draw deeply on such far-afield ideas as projective geometry.

But I think that in my next entry I will pursue yet another direction. I'd like to talk about Liouville's Theorem, which tells you how to solve Hamiltonian diffeqs, by introducing a bracket between Hamiltonian functions. I hope that I will be able to then explain how this relates to QFT and its canonical commutation relations. This last step I don't fully understand yet, but I hope to: I think it's the last bit I need to know before I can understand what it means to "quantize" a classical theory.

Edit: In addition to being nondegenerate and antisymmetric, symplectic forms must also be closed: their exterior derivatives must vanish. Most immediately, this fact assures that symplectic forms are all locally equivalent (Darboux's theorem); more basically, it allows us to talk about the (co)homology class of a symplectic form. In a more advanced setting, closedness will translate into Jacobi's identity.