11 November 2006

Liouville's Theorem

A long time ago, in a 2n-dimensional symplectic manifold M, with form \omega_{\mu\nu} (and dual form \omega^{\mu\nu}), far, far away...

Prologue: Our hero, a young Hamiltonian H: M\to\R, defines a "hamiltonian flow" via a vector field (X_H)^\nu = dH_\mu \omega^{\mu\nu}. We can understand H as, for instance, the total energy, and M as the phase space. H has a friend, G, which is preserved by the hamiltonian flow (e.g. momentum in some direction). This happens exactly when X_H[G] = (dG).(X_H) = 0 (thinking of X_H in the first expression as a differential operator). But dG.X_H = (dG)_\nu (X_H)^\nu = (X_G)^\mu \omega_{\mu\nu} (X_H)^\nu = (dG)_\nu \omega^{\mu\nu} (dH)_\mu. Since \omega^{\mu\nu} is antisymmetric, swapping G and H only flips the sign of this expression. So H's flow preserves G if and only if G's flow preserves H: being friends is a symmetric relationship.

Following the classical mechanists, we say that H and G are "in involution" if indeed \omega(X_H,X_G) = 0. More generally, we can define the "Poisson Bracket" \{H,G\} = \omega(X_H,X_G) = (dH)_\mu (dG)_\nu \omega^{\mu\nu}. Then clearly \{H,G\} = -\{G,H\}, and in particular H preserves itself (energy is conserved). Indeed, \{,\} behaves as a Lie bracket ought: it satisfies the Jacobi identity \{\{G,H\},K\} + \{\{H,K\},G\} + \{\{K,G\},H\} = 0, and \{,\} is \R-linear in each argument. (Thus C^\infty(M) is naturally a Lie algebra; the corresponding Lie group is the space of "symplectomorphisms", or diffeomorphisms on M that preserve \omega.) Moreover, X_{\{G,H\}} = [X_G,X_H] where [,] is the (canonical) Lie bracket on vector fields. (The Hamiltonian fields X_H are exactly the differentials of symplectomorphisms, hence the identification in the previous parenthetical.)
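The bracket identities above can be checked by hand on polynomials. What follows is only a minimal sketch of mine, not part of the story: it assumes M = \R^2 with coordinates (q,p) and the sign convention \{f,g\} = \df/\dq \dg/\dp - \df/\dp \dg/\dq, which may differ by an overall sign from the index formula above. Antisymmetry and the Jacobi identity do not care about that sign. The helper names (add, mul, diff, bracket) are hypothetical.

```python
from collections import defaultdict

# A polynomial in (q, p) is a dict {(a, b): c}, meaning the sum of c * q**a * p**b.

def clean(f):
    """Drop zero coefficients so equal polynomials compare equal as dicts."""
    return {k: c for k, c in f.items() if c != 0}

def add(f, g, scale=1):
    """Return f + scale * g."""
    out = defaultdict(int)
    for k, c in f.items():
        out[k] += c
    for k, c in g.items():
        out[k] += scale * c
    return clean(out)

def mul(f, g):
    """Return the product f * g."""
    out = defaultdict(int)
    for (a, b), c in f.items():
        for (s, t), d in g.items():
            out[(a + s, b + t)] += c * d
    return clean(out)

def diff(f, var):
    """Partial derivative with respect to q (var=0) or p (var=1)."""
    out = defaultdict(int)
    for k, c in f.items():
        if k[var] > 0:
            new = list(k)
            new[var] -= 1
            out[tuple(new)] += c * k[var]
    return clean(out)

def bracket(f, g):
    """{f, g} = df/dq * dg/dp - df/dp * dg/dq (an assumed sign convention)."""
    return add(mul(diff(f, 0), diff(g, 1)),
               mul(diff(f, 1), diff(g, 0)), scale=-1)

q = {(1, 0): 1}
p = {(0, 1): 1}
H = {(0, 2): 1, (2, 0): 1}   # p^2 + q^2
G = {(1, 1): 1}              # q p
K = {(3, 0): 2, (0, 1): 5}   # 2 q^3 + 5 p

# Cyclic sum from the Jacobi identity; it should be the zero polynomial.
jacobi = add(add(bracket(bracket(G, H), K), bracket(bracket(H, K), G)),
             bracket(bracket(K, G), H))

print(bracket(q, p))   # the constant polynomial 1
print(bracket(H, G))   # 2 q^2 - 2 p^2
print(jacobi)          # the zero polynomial, i.e. an empty dict
```

Integer coefficients keep everything exact, so the Jacobi check is an identity of polynomials rather than a floating-point approximation.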

((Actually, it's not C^\infty(M) that's tangent to the symplecto group of M, but C^\infty(M) / \R, when M is connected. Our \{,\} depends only on the differential of Hamiltonian functions, and so ignores constant terms: there are \R possible constant terms for each connected component of M. We can, of course, equip \R with the trivial Lie algebra, and then C^\infty(M) is, as a Lie algebra, T(symplectomorphisms) \times \R. The physicists would say this by observing that energy is defined only up to a total constant; this constant cannot affect our physics because it appears only in commutators. The physicists try to use this observation to justify introducing infinite constants into their expressions.))

One day our hero H met another function F, but this one unpreserved by H's flow. How does F change? In our setup, where H and F have no explicit time dependence, and we're just flowing via \dot{x} = X_H, we have that dF/dt = X_H[F] = \{F,H\}.

When H and G are buddies (in involution), then each of H and G is preserved by X_H: the flow stays in the common level set H = H(0) and G = G(0). Assuming that H and G are independent, in the sense that dH and dG are linearly independent (so we're not in the G = H^2 case, for instance), this common level set is (2n-2)-dimensional.

The story: As a young and attractive Hamiltonian, our hero H was particularly popular: there were n-1 other Hamiltonians H_2,...,H_n so that, along with H_1=H, all were pairwise in involution (\{H_i,H_j\} = 0), and all independent (the set of dH_i is linearly independent at each point in M, or at least at each point in some common level set, and so in a neighborhood of that level set). This is the most friends any Hamiltonian can have: the common level set is n-dimensional, and the tangent space contains n independent vectors X_i = X_{H_i}. Because the X_i spanned each tangent space, and because \omega of any two was zero, the common level set was, for all to see, a Lagrangian submanifold.

Being very good friends, the H_i never got in each other's way. X_i could flow, and X_j could flow, and because of the relationship between Poisson and Lie brackets, their flows always commuted. The gang used this to great effect: by giving each friend an amount to flow, the crowd defined an \R^n action on the common level set. The friends set out to explore this countryside: Hamiltonian flow is volume-preserving, since it preserves the symplectic form (whose nth power is a volume form), and a volume-preserving \R^n-action is onto connected components.

The friends, returning home by some element of the stabilizer subgroup, understood the landscape: the only discrete subgroups of \R^n are lattices, and so the common level set was necessarily a torus (in the compact case). Picking standard coordinates q^j for a torus, the friends observed an isotropy: at every point, X_i = a_i^j \d/\dq^j with a constant "frequency" matrix a.

Our hero's life was solved. If all the frequencies a_1^j were rational multiples of each other, what the Greeks called "commensurate", H's paths were closed. Otherwise, H's flow would be dense in some subtorus, and either way, physics was simple. Indeed, because every Lagrangian submanifold has a neighborhood symplectomorphic to a neighborhood of the zero section in its cotangent bundle, there were "momentum" coordinates p_i conjugate to the angular position coordinates q^i, and these p_i depended only on the H_j. Indeed, in p,q coordinates, X_i was by definition -\dH_i/\dp_j \d/\dq^j + \dH_i/\dq^j \d/\dp_j, so the H_i knew themselves: H_i = -a_i^j p_j + const.
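The claim about commensurate frequencies is easy to try out. Here is a small sketch of mine, assuming a 2-torus \R^2/\Z^2 with constant rational frequencies (exact arithmetic only works for rationals); the names flow and period are hypothetical and not from the story.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def flow(q0, freqs, t):
    """Position on the torus R^n / Z^n after time t at constant frequencies."""
    return tuple((q + a * t) % 1 for q, a in zip(q0, freqs))

def period(freqs):
    """Smallest T > 0 with a*T an integer for every frequency a.

    Assumes the frequencies are Fractions and not all zero.
    """
    common = reduce(lcm, (a.denominator for a in freqs))
    numerators = [int(a * common) for a in freqs]
    return Fraction(common, reduce(gcd, numerators))

freqs = (Fraction(2, 3), Fraction(1, 2))   # commensurate: ratio 4/3
q0 = (Fraction(1, 4), Fraction(1, 3))

T = period(freqs)
print(T)                          # time after which the path closes up
print(flow(q0, freqs, T) == q0)   # back at the starting point
print(flow(q0, freqs, T / 2) == q0)
```

With an irrational ratio no such T exists, and the path winds densely around the torus instead; that case cannot be checked with exact rational arithmetic, so it is left out of the sketch.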

It was in this way that our hero the Hamiltonian understood how to flow not only in the level set, but in some neighborhood. The friends lived happily ever after.

Epilogue: Sadly, not all Hamiltonians can have as nice a life as our hero, because many do not have so many friends. It has been shown that the three-body problem is not Liouville-integrable, as this property of having enough mutual friends (and hence admitting Lagrangian tori) came to be called. Much analysis has gone into studying perturbations of Liouville systems — weakly-interacting gravitating bodies, for instance — but I do not know this material, and so will not exposit on it here. In my next entry, I hope to speak more about the Poisson bracket, and how it turns classical into quantum systems.

Edit: The matrix a_i^j may depend, of course, on H, or equivalently on p. What is actually true is that, up to a constant, H_i(p) = -\int_0^{p} a_i^j(p') dp'_j. It is by solving this equation that one may find the conjugate p coordinates. That H_i = -a_i^j p_j + const. is true only to first order, and to first order we cannot know whether, for instance, entries in a_i^j remain commensurate, and so whether paths stay closed as the momentum changes. Generically, the orbits of a Hamiltonian flow are not closed, and instead a single path is dense in the entire torus. In the general three-body problem, the flow is dense in a space of dimension greater than that of any Lagrangian submanifold.

09 November 2006

Tensors and Hamiltonians

I seem to have fallen way behind in writing about my classes. In particular, it may be a while yet before I do any quantum mechanics; I'm more excited by my classical geometry. But perhaps I will move into the quantum world soon. I almost understand it.

In the last few weeks, my classes have defined forms and fields, integration, chains, Lie groups, and Riemannian manifolds; quantum fields, fermions, SUSY, and Feynman diagrams; structural stability, Anosov flow, and a world's worth of material in Yasha Eliashberg's class. Yasha pointed out in one lecture, "It is impossible to be lost in my class, because I keep changing topics every two minutes."

But I'm trying to provide a unified account of such material in these pages, and I last left you only with fields of tangent vectors. So, today, tensors, differential forms, and Hamiltonian mechanics.

Remember where we were: we have a smooth manifold M with local coordinates x^i, over which we can build two extremely important bundles, called T(M) and T^*(M). T(M) is the space of "derivations at a point": on each fiber we have a basis \d/\dx^i and coordinates \dot{x}^i. T^*(M) is dual to T(M) fiberwise: its basis is dx^i and its coordinates are p_i. But from these we can build all sorts of tensor bundles.

I touched on tensors in my last post, but hardly defined them. They are, however, a straightforward construction. A tensor is two vectors set next to each other. Or almost. If I have two vectors v\in V and w\in W, I can take their tensor product v\tensor w: I define \tensor to be bilinear, and that's all. V\tensor W is then generated by all the possible v\tensor w. More precisely, V\tensor W is the universal object so that bilinear maps from V\times W factor through it: there's a canonical bilinear map V\times W \to V\tensor W so that any bilinear map from V\times W to \R factors through this map and some linear map V\tensor W \to \R.

If you haven't seen tensors before, this definition probably only made things worse, so let me say some other words about tensors. (i) The product \tensor is multilinear: (av)\tensor w = a(v\tensor w) = v\tensor(aw), and (v_1 + v_2)\tensor w = v_1\tensor w + v_2\tensor w, and the same on the other side. Thus \R\tensor V is canonically isomorphic to V. (ii) if {v_i} and {w_j} are bases for V and W, then {v_i\tensor w_j} is a basis for V\tensor W. It is in this way that \tensor correctly generalizes \times from Set to Vect.

We will be primarily interested in tensors composed only of V and V^*, i.e. of vectors and dual vectors. Even more, when we move to bundles, we will be interested only in tensors over T(M) and T^*(M). Of course, we can canonically commute V^* past V, so all our tensors might as well live in V \tensor ... \tensor V \tensor V^* \tensor ... \tensor V^*, for various (possibly 0) numbers of Vs and V^*s. Some notation: if there are n Vs and m V^*s, I will write this (tensor) product as \T^n_m(V). \T^0_0 = \R; \T^1_0 = V.

How should you write these? As birdtracks, a name which Penrose even seems to be adopting. For these, and in my (and many physicists') notation, draw vectors with upward-pointing "arms" (since we write them with raised x^i) and dual vectors with downward-pointing "legs" (indices are lowered). The order of the arms matters, as does the order of the legs, but the canonical commutation referred to in the previous paragraph is explicit. To multiply two vectors, just draw them next to each other; in general, any tensor is the sum of products of vectors, but not necessarily just a product, so in general tensors are just shapes with arms and legs.

Birdtracks are an exquisite way to keep track of what's called "tensor contraction". See, what's important about dual vectors is that they can "eat" vectors and "spit out" numbers: there is a canonical pairing from V\tensor V* = \T^1_1 \to \R. We can generalize this contraction to any tensors, if we just say which arm eats which leg. In these notes, drawing birdtracks is hard; I will use the almost-as-good notation of raised and lowered indices. We can define \T^{-1} as being basically the same as \T_1, except that it automatically contracts in tensor products; this breaks associativity.

So, our basic objects are sections of \T^n_m(T(M)), by which I mean fields of tensors. A few flavors of tensors deserve special mention.

On \T^n(V) we can impose various relationships. In particular, there are two important projection operators, Sym and Ant. There is a canonical action of S_n on \T^n(V): \pi\in S_n sends a basis element e_{i_1}\tensor...\tensor e_{i_n} to e_{i_{\pi(1)}}\tensor...\tensor e_{i_{\pi(n)}}, i.e. it permutes the tensor factors. Extending this action linearly, we can construct Sym and Ant by
Sym(\omega) = (1/n!) \sum_{\pi\in S_n} \pi(\omega)
Ant(\omega) = (1/n!) \sum_{\pi\in S_n} \sgn(\pi) \pi(\omega)

where \sgn(\pi) is 1 if \pi is an even permutation, -1 if it is odd. These definitions, of course, also work for \T_n. These are projection operators (Ant^2 = Ant and Sym^2 = Sym), so we can either quotient by their kernels or just work in their images; it doesn't matter which. Define \S^n = Sym(\T^n) and \A^n = Ant(\T^n), and similarly for lowered indices. We have symmetric and "wedge" (antisymmetric) multiplication by, e.g., \alpha\wedge\beta = Ant(\alpha\tensor\beta); each is associative. One can immediately see that, if dim(V)=k, then dim(\S^n(V)) = \choose{k+n-1}{n} and dim(\A^n(V)) = \choose{k}{n}; of course, dim(\T^n) = k^n. Of particular importance: \T^2 = \S^2 + \A^2, where I will always use "+" between vector spaces to simply mean "direct sum" (which correctly generalizes disjoint union of bases).
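For the skeptical, here is a rough sketch of mine that builds Sym and Ant by brute force and counts dimensions. It assumes a tensor in \T^n(\R^k) is stored as a dict from index tuples to coefficients; the names project and image_dimension are hypothetical. The dimension is counted by noting that the images of distinct basis elements are either equal up to sign or have disjoint supports. The counts come out to \choose{k+n-1}{n} for the symmetric part and \choose{k}{n} for the antisymmetric part.

```python
from fractions import Fraction
from itertools import permutations, product
from math import comb, factorial

def sign(perm):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def project(tensor, n, signed):
    """Average a rank-n tensor over S_n: Ant if signed is True, else Sym."""
    out = {}
    for perm in permutations(range(n)):
        s = sign(perm) if signed else 1
        for idx, c in tensor.items():
            new = tuple(idx[perm[r]] for r in range(n))  # permute the factors
            out[new] = out.get(new, 0) + Fraction(s) * c / factorial(n)
    return {k: c for k, c in out.items() if c != 0}

def image_dimension(k, n, signed):
    """Dimension of the image of Sym or Ant on T^n(R^k).

    Images of basis tensors with different supports are linearly
    independent, so counting distinct nonzero supports gives the dimension.
    """
    supports = set()
    for idx in product(range(k), repeat=n):
        image = project({idx: 1}, n, signed)
        if image:
            supports.add(frozenset(image))
    return len(supports)

t = {(0, 1, 2): 1, (0, 0, 1): 2, (2, 1, 1): -3}
ant_t = project(t, 3, signed=True)
print(project(ant_t, 3, signed=True) == ant_t)   # Ant is idempotent on t

for k, n in [(3, 2), (3, 3), (4, 2)]:
    print(k, n,
          image_dimension(k, n, signed=False), comb(k + n - 1, n),
          image_dimension(k, n, signed=True), comb(k, n))
```

For n = 2 the two counts add up to k^2, which is the statement \T^2 = \S^2 + \A^2; for n = 3 they do not, since \T^3 has pieces that are neither symmetric nor antisymmetric.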

From now on, V will always be T(M), and I'm now interested in fields. I will start using \T, \A, and \S to refer to the spaces of fields. We will from time to time call on tensor fields in \S_n, but those in \A_n end up being more important: we can understand them as differential forms. I may in a future entry try to understand differential forms better; for now, the following discussion suffices.

To each function f\in C^\infty(M) = \T_0 = \A_0 we can associate a canonical "differential": remembering that v\in \T^1 acts as a differential operator, we can let df\in\T_1 be the dual vector (field) so that df_i v^i = v(f). Even more simply, f:M\to\R, so it has a ("matrix") derivative Df: TM\to T\R. But T\R is trivial, and indeed we can canonically identify each fiber just with \R. So df = Df composed with this projection T\R \to \R along the base (preserving only the fiber). In coordinates, df = \df/\dx^i dx^i. It's tempting to follow the physicists and write \d_i = \d/\dx^i, since the set of these "coordinate vector fields" "transforms as a dual vector".

From this, we can build the "exterior derivative" d:\A_i\to\A_{i+1} by declaring that d(df) = 0, and that if \alpha\in\A_r and \beta\in\A_s, then d(\alpha\wedge\beta) = d\alpha \wedge \beta + (-1)^r \alpha \wedge d\beta. This will not be important for the rest of my discussion today; I may revisit it. But I may decide that I prefer the physicists' \d_i = \d/\dx^i which acts on tensors of any stripe and satisfies the normal Leibniz rule. We'll see.
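To see the two rules for d in action, here is a short worked example of mine, assuming M = \R^2 with coordinates (x,y) and a 1-form \alpha = f dx + g dy. It uses only d(dx) = d(dy) = 0, the Leibniz rule with f and g as 0-forms, and antisymmetry of the wedge.

```latex
\begin{align*}
d\alpha &= d(f\,dx) + d(g\,dy) \\
        &= df \wedge dx + f\,d(dx) + dg \wedge dy + g\,d(dy) \\
        &= \left(\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy\right)\wedge dx
         + \left(\frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy\right)\wedge dy \\
        &= \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) dx \wedge dy .
\end{align*}
```

The last step uses dx\wedge dx = dy\wedge dy = 0 and dy\wedge dx = -dx\wedge dy. In particular, if \alpha = dh then the coefficient is the difference of the two mixed partials of h, which vanishes, as d(dh) = 0 demands.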

So, what can we do with tensors? We could pick a nondegenerate positive definite element of \S_2; such a critter is called a "Riemannian metric". I won't talk about those right now (I may come back to them in a later entry). Instead, I'm interested in their cousins, symplectic forms: nondegenerate fields \omega in \A_2. By nondegenerate, I mean that for any non-zero vector y there's a vector x so that \omega(x,y)\neq 0. Thus, symplectic forms, like metrics, give isomorphisms between V and V*, and so \omega_{ij} has an inverse form \omega^{ij}\in\A^2 so that \omega_{ij}\omega^{jk} = \delta_i^k \in \A_1^1 = \T_1^1.

Some observations about symplectic forms are immediate, and follow just from linear algebra. (i) Symplectic forms only live in even dimensions. (ii) We can find local coordinates p_i and q^i so that \omega = dp_i\wedge dq^i. (iii) In 2n-dimensional space, the form (\omega)^n \in \A_{2n} is a nowhere-zero volume form on space, so symplectic forms only live in orientable manifolds.

It's observation (ii) that gives us our first way of tackling symplectic forms, because we have a canonical example: the cotangent bundle of a manifold. If dim(M) = n, then T*M is 2n-dimensional and has a canonical form \omega = dp_i \wedge dq^i, where the q^i are "position" coordinates in M and the p_i are the "conjugate momentum" coordinates in the fibers. You can show that this coordinate formula for \omega transforms covariantly with changes of coordinates, so the form is well-defined; better is an invariant geometric meaning. And we can construct one. Let v be a vector in T(T*M), based at some point (x,y)\in T*M, i.e. x\in M and y is a covector at x. Then \pi: T*M \to M projects down along fibers, so we can push v forward to \pi_*(v) \in TM. Now let y act on this tangent vector. This defines a form \alpha\in T*(T*M) = \A_1 (T*M): \alpha(v_{x,y}) = y.\pi_*(v). In local coordinates, \alpha = p_i dq^i. Then we can differentiate \alpha to get \omega = d\alpha \in \A_2; one can show that d\alpha is everywhere nondegenerate.

Why do we care? Well, let's say we're solving a mechanics problem of the following form: we have a system, in which the position of a state is given by some generalized coordinates q^i, and momentum by p_i. Then the total phase space is T*(position space). And to each state, let's assign a total energy H = p^2/2m + U(q). We're thinking of p^2/2m as the "kinetic energy" and U(q) as the "potential energy". More generally, we could imagine replacing p^2/2m by any (positive definite symmetric) a^{ij} p_i p_j / 2; of course, we ought to pick coordinates to diagonalize a^{ij}, but c'est la vie. (And, of course, a^{ij} may depend on q.) Then our dynamics should be given by
\dot{q} = p/m = \dH/\dp
\dot{p} = -\dU/\dq = -\dH/\dq

How can we express this in more universal language? A coordinate-bound physicist might be content writing \dot{q}^i = a^{ij}p_j and \dot{p}_i = -\dU/\dq^i. But what's the geometry? Well, we want this vector field X_H = (\dot{p},\dot{q})\in \T^1 = \A^1, and we have the derivative dH \in \A_1 = \T_1. It turns out that the relationship is exactly that for any other v-field Y, dH.Y = \omega(X_H,Y). I.e. dH is X_H contracted with \omega. For the physicists, letting \mu index the coordinates in T*M, we have (X_H)^\nu = \omega^{\mu\nu} \dH/\dx^\mu.

This is an important principle: the Hamiltonian, which knows only the total energy of a state, actually knows everything about the dynamics of the system (provided that the system knows which momenta are conjugate to which positions).
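To watch the Hamiltonian do all the work, here is a minimal numerical sketch of mine. It assumes one degree of freedom, H = p^2/2m + U(q) with a harmonic U(q) = k q^2/2, and the standard leapfrog (kick-drift-kick) integrator, which I am swapping in as a convenient scheme for separable Hamiltonians; nothing above prescribes it. The names leapfrog and energy are hypothetical.

```python
import math

def leapfrog(q, p, dU_dq, mass, dt, steps):
    """Integrate dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -dU/dq."""
    for _ in range(steps):
        p -= 0.5 * dt * dU_dq(q)   # half kick from the potential
        q += dt * p / mass         # full drift at the updated momentum
        p -= 0.5 * dt * dU_dq(q)   # second half kick
    return q, p

def energy(q, p, mass, U):
    """Total energy H = p^2 / 2m + U(q)."""
    return p * p / (2 * mass) + U(q)

mass, k = 1.0, 1.0
U = lambda q: 0.5 * k * q * q      # harmonic potential (an assumed example)
dU_dq = lambda q: k * q

q0, p0 = 1.0, 0.0
steps = 10_000
dt = 2 * math.pi * math.sqrt(mass / k) / steps   # one full period in total
q1, p1 = leapfrog(q0, p0, dU_dq, mass, dt, steps)

print(q1, p1)   # close to the starting state after one period
print(energy(q1, p1, mass, U) - energy(q0, p0, mass, U))   # small drift
```

The only inputs are the two partial derivatives of H, which is the point of the paragraph above: once the pairing of q with p is fixed, the energy function determines the motion.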

From here, we could go in a number of directions. One physics-ish question that deeply interests me is why our universe has this physics. The conventional story is that God created a position space, and that inherent to the notion of "position" is the notion of "conjugate momentum", and thus it would be natural to create a dynamics like we have. But what's entirely unclear to me is why our physics should be second-order at all. Wouldn't it be much more natural for God to start with just a manifold of states and a vector-field of dynamics? Perhaps God felt that a function is simpler than a vector field. But that's no matter: God could have just picked a manifold with a function and any nondegenerate element of \T^2, with which to convert d(the function) into the appropriate field. For instance, we could have had a Riemannian manifold with dynamics given by "flow in the direction of the gradient of some function".

No, instead God started with a symplectic manifold. Well, that's fine; I don't begrudge the choice of anti-symmetry. But then there's a really deep mystery: why, if fundamentally our state is given by a point in some symplectic manifold, do we distinguish between position and momentum? Certainly we can always find local coordinates in which \omega = dp\wedge dq, but for God there's some rotation between p- and q-coordinates that we don't see. The symmetry is broken.

Another direction we could go is to discuss this physics from the opposite direction, and I certainly intend to do so in a future entry. As many of you probably know, alternate to the Hamiltonian formalism is a similarly useful Lagrangian formalism for mechanics. Along with its "least action" principle, Lagrangian mechanics is a natural setting in which to understand Noether's Theorem and Feynman's Sum-Over-Paths QFT. At the very least, sometime soon I will try to explain the relationship between Lagrangians and Hamiltonians; it will draw deeply on such far-afield ideas as projective geometry.

But I think that in my next entry I will pursue yet another direction. I'd like to talk about Liouville's Theorem, which tells you how to solve Hamiltonian diffeqs, by introducing a bracket between Hamiltonian functions. I hope that I will be able to then explain how this relates to QFT and its canonical commutation relations. This last step I don't fully understand yet, but I hope to: I think it's the last bit I need to know before I can understand what it means to "quantize" a classical theory.

Edit: In addition to symplectic forms being nondegenerate antisymmetric, they must also be closed: their (exterior) derivatives should be zero. Most immediately, this fact assures that all symplectic forms are locally equivalent, and, most basically, this allows us to talk about the (co)homology class of a symplectic form. In a more advanced setting, closedness will translate into Jacobi's identity.