07 June 2009

In which we find out who uses RSS

This blog, as you probably have gathered, has been unused for some time. It will probably return to being unused soon. But for those of you who have it in your RSS feed, so that you are automatically alerted to new posts, I wanted to give a brief update.

I am happily in graduate school at UC Berkeley, studying quantum field theory. My quals are in less than a week, so I am reading and reviewing. This term I discovered that my favorite way to study is to write copious amounts. If you like to read, I invite you to check out the following:

  1. Lie Groups and Lie Algebras (pdf). These are edited lecture notes from a one-semester class last fall. They should be fairly complete and accurate, but please let me know about any errors: typos are easy to fix, and mathematical errors should be corrected for morality's sake.
  2. Poisson Lie linear algebra in the graphical language (pdf). This article outlines all the definitions from the theory of Lie bialgebras, but does it using only Penrose's "birdtrack" notation, championed by, among others, Cvitanović.
  3. Some notes on Quantum Groups (pdf). These are not particularly edited, are rather idiosyncratic, and include the graphical language article as a chapter.


I hope to say a few words here about some of this material. In particular, I'm thinking of writing about the following: the Weyl character formula; symplectic leaves of Poisson manifolds; what is quantization. Whether I actually write such entries is still up in the air, of course. Currently, I primarily post content directly to my website, but that doesn't have an RSS feature. If there are still people who watch this space, leave a comment, and I'll get into the habit of writing a blog entry any time I post something new to my website.

03 March 2008

From products of infinite sums to subtraction and division

In some of the research I'm doing, I have occasion to write expressions like

A + ABA + ABABA + ABABABA + \dots = (A^{-1} - B)^{-1}

with no consideration of convergence.
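
Even without convergence, every finite truncation of this sum satisfies an exact identity: multiplying the first K terms by A^{-1} - B telescopes to 1 - (BA)^K. Here is a minimal Python sketch of that check, using exact rational 2×2 matrices. The particular matrices and all the helper names are made up for illustration; nothing here is specific to the research setting.

```python
from fractions import Fraction as F

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(P, Q):
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

def matsub(P, Q):
    return [[P[i][j] - Q[i][j] for j in range(2)] for i in range(2)]

def inverse(P):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det],
            [-P[1][0] / det, P[0][0] / det]]

IDENTITY = [[F(1), F(0)], [F(0), F(1)]]

def power(P, K):
    """K-th power of a 2x2 matrix by repeated multiplication."""
    out = IDENTITY
    for _ in range(K):
        out = matmul(out, P)
    return out

def truncated_sum(A, B, K):
    """First K terms of A + ABA + ABABA + ..., i.e. the sum of A (BA)^k for k < K."""
    total = [[F(0), F(0)], [F(0), F(0)]]
    term = A
    for _ in range(K):
        total = matadd(total, term)
        term = matmul(matmul(term, B), A)  # append one more "BA"
    return total

def residual(A, B, K):
    """(A^{-1} - B) times the truncated sum; telescopes to 1 - (BA)^K."""
    return matmul(matsub(inverse(A), B), truncated_sum(A, B, K))

# Hypothetical example matrices, chosen only so that A is invertible.
A = [[F(1), F(2)], [F(0), F(1)]]
B = [[F(1, 3), F(0)], [F(1), F(-1, 2)]]

for K in range(1, 6):
    assert residual(A, B, K) == matsub(IDENTITY, power(matmul(B, A), K))
```

Whether the leftover (BA)^K "goes away" is exactly the convergence question being ignored; the finite identity holds regardless.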

Although in my setting I have (multiplicative) inverses and (additive) negatives, it nonetheless becomes interesting to ask what sorts of operations we can do with just plus and times, provided that we can do such things infinitely.

For example, with just positive integers and some questionable summations, we can get all negative fractions:

-1 = 1 + 2 + 4 + 8 + 16 + \dots
-m/n = m + (n+1)\times m + (n+1)^2\times m + \dots

and some irrationals:

\sqrt{2\pi} = 1 \times 2 \times 3 \times 4 \times 5 \times \dots

The first two sums are from the fact that 1+r+r^2+\dots=1/(1-r). Indeed, this comes from multiplying and shifting. If, e.g., we add 1 to the first equation, and then telescope the sum, we have

1+1 + 2 + 4 + 8 + 16 + \dots = 2+2 + 4 + 8 + 16 + \dots = 4+4 + 8 + 16 + \dots = 8+8 + 16 + \dots = etc.
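
One way to read this telescoping is that the partial sums miss the claimed value by an ever higher power of the base, which is the sense in which the sums converge p-adically. The following Python sketch checks this for both sums; the function names and the sample values m = 3, n = 4 are illustrative choices.

```python
def partial_sum(m, base, K):
    """Sum of m * base**k for k = 0, ..., K-1."""
    return sum(m * base**k for k in range(K))

def gap_to_minus_one(K):
    """Difference between the K-term partial sum of 1 + 2 + 4 + ... and -1."""
    return partial_sum(1, 2, K) - (-1)

def gap_to_minus_m_over_n(m, n, K):
    """n * (K-term partial sum of m + (n+1)m + (n+1)^2 m + ...) + m.

    This would vanish if the partial sum were exactly -m/n; for the
    truncated sum it equals m * (n+1)**K instead.
    """
    return n * partial_sum(m, n + 1, K) + m

for K in range(1, 8):
    assert gap_to_minus_one(K) == 2**K
    assert gap_to_minus_m_over_n(3, 4, K) == 3 * 5**K
```

Adding 1 to the K-term partial sum leaves only the single term 2^K, which is the "pushed off to infinity" remainder in the telescoped line above.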

The product is from \zeta-function regularization: the logarithm of the right hand side is \sum \log n, which is formally -d\zeta(s)/ds at s=0, and -\zeta'(0) = (1/2)\log(2\pi).

Since we can multiply power series, we can get all fractions; for any given fraction, there are lots of ways of writing it as a series. For example,

1 = (-1)^2 = (1+2+4+8+\dots)^2 = 1 + (2+2) + (4+4+4) + (8+8+8+8) + \dots
= \sum (n+1)2^n.

Since \sum 2^n = -1, we clearly have \sum n2^{n-1} = 1 as well, which can be checked directly (it is d[\sum 2^n]/d2 = d[1/(1-2)]/d2 = 1/(1-2)^2).
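
The grouping used in the squared series is the usual Cauchy product, and it can be checked on truncations. This Python sketch (names are illustrative) confirms that the n-th grouped term is (n+1)2^n and that the partial sums are congruent to 1 modulo growing powers of 2, which is what the sum being 1 means 2-adically.

```python
def cauchy_square(coeffs):
    """Terms of the Cauchy product of a series with itself, truncated to len(coeffs)."""
    N = len(coeffs)
    return [sum(coeffs[i] * coeffs[n - i] for i in range(n + 1)) for n in range(N)]

N = 12
powers_of_two = [2**n for n in range(N)]
squared = cauchy_square(powers_of_two)

# Each grouped term is (n+1) * 2**n, matching 1 + (2+2) + (4+4+4) + ...
matches_formula = all(squared[n] == (n + 1) * 2**n for n in range(N))

# The first K grouped terms sum to 1 modulo 2**K.
residues = [sum(squared[:K]) % 2**K for K in range(1, N + 1)]
```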

How about inverses of variables? Especially if you're not sure of the 2-adic sum, perhaps you'd rather use

-1/X = 1 + (1+X) + (1+X)^2 + (1+X)^3 + \dots

to get your negative and inverse Xs. This formula is hard to verify, since it still requires the same shifting tricks:

1 + ( X + X(1+X) + X(1+X)^2 + \dots ) = 1 + X + X(1+X) + X(1+X)^2 + \dots
= (1+X) + X(1+X) + X(1+X)^2 + \dots
= (1+X)(1+X) + X(1+X)^2 + \dots
= (1+X)(1+X)^2 + \dots
= etc.

See, infinite sums are not associative --- the distance between terms matters --- but these manipulations are justified since infinite sums are finitely associative.
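
The finite version of this regrouping is an honest identity: one plus X times the first K terms of the series is exactly (1+X)^K. Here is a small Python check with exact fractions; the sample values of X are arbitrary and the function names are illustrative.

```python
from fractions import Fraction

def truncated_series(X, K):
    """1 + (1+X) + (1+X)^2 + ... + (1+X)^(K-1)."""
    return sum((1 + X) ** k for k in range(K))

def shifted(X, K):
    """1 + X * (truncated series), the quantity being regrouped above."""
    return 1 + X * truncated_series(X, K)

# Hypothetical sample values; any exact number type would do.
for X in (Fraction(3), Fraction(-1, 2), Fraction(-7, 5)):
    for K in range(1, 6):
        assert shifted(X, K) == (1 + X) ** K
```

When |1+X| < 1, for instance X = -1/2, the leftover (1+X)^K really does tend to zero and the series converges to -1/X in the ordinary sense; otherwise the formula is only formal.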

We get, either by squaring or differentiating,

1/X^2 = 1 + 2(1+X) + 3(1+X)^2 + 4(1+X)^3 + \dots

and hence all powers of X. Well, we get \pm 1/X^n; for, say, -X, the shortest is to use X^2(-1/X) = X^2 + X^2(1+X) + X^2(1+X)^2 + X^2(1+X)^3 + \dots, which telescopes in the same way.
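
The squared series has a similar finite identity: with r = 1+X, one minus X^2 times the first K terms equals (K+1)r^K - K r^{K+1}, which follows from differentiating the finite geometric sum. A Python sketch of the check, with illustrative names and arbitrary sample values:

```python
from fractions import Fraction

def truncated_square_series(X, K):
    """1 + 2(1+X) + 3(1+X)^2 + ... + K(1+X)^(K-1)."""
    return sum((k + 1) * (1 + X) ** k for k in range(K))

def remainder(X, K):
    """1 - X^2 * (truncated series); zero would mean the sum is exactly 1/X^2."""
    return 1 - X**2 * truncated_square_series(X, K)

def predicted_remainder(X, K):
    """Closed form of the remainder: (K+1) r^K - K r^(K+1), with r = 1+X."""
    r = 1 + X
    return (K + 1) * r**K - K * r ** (K + 1)

# Hypothetical sample values of X.
for X in (Fraction(2), Fraction(-1, 3), Fraction(-1, 2)):
    for K in range(1, 8):
        assert remainder(X, K) == predicted_remainder(X, K)
```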

These manipulations require that there be a number 1 which multiplies with X to X and which we can add to X. This is true if X is, say, a matrix, or a CW complex, but not if X is a two-zero-tensor. If X is such a tensor, then X^{-1} is the tensor which convolves on both sides to the identity matrix. This is the setting of my original line (A is two-zero, and B is zero-two). In this situation, I don't have a good construction of X^{-1}.

12 February 2008

New Blog

If you do not subscribe to the RSS feed for this blog, then you have by now stopped checking it for updates. I intend to post occasionally, but school, rather than writing, is absorbing my mathematical thoughts.

Since I have no extra time, I have started a new blog, sister to this one. Whereas Orange Juice Files is primarily for mathematical (and nonmathematical) essays, Local Seasoning will be a repository of recipes and meals from my kitchen. Enjoy.

08 January 2008

A composition law for discrete-time QM

One of the best ways to understand quantum mechanics (bear with me) is as a one-dimensional quantum field theory. No, it's not backwards, and we really should think of QM as inherently one-dimensional: there's one dimension of time. The configuration space of a particle is finite-dimensional; the size of the space of paths the particle could take (and Feynman says that a particle takes every possible path) is largely determined by the number of "time" dimensions in the problem, since a path has a point in configuration space for every moment in time.

In any case, I'm going to think of it that way. And then I'm going to think about path integration: the transition amplitude between two configurations is some poorly-defined integral over the infinite-dimensional space of paths connecting those configurations. Rather than trying to compute in infinite dimensions, physicists since Feynman have formally expanded the integral asymptotically, and interpreted the coefficients combinatorially as diagrams. (And interpreted the diagrams as describing actual events, which is a matter for them physicists rather than for us mathematicians to discuss.)

These formal power series should — indeed, must, if the formalism is to make sense — satisfy a particular gluing axiom. But it seems that no one has gone to the trouble to verify that in fact they do; there is no proof available in the literature. In his fall-semester QFT class, Nicolai Reshetikhin suggested that someone try to fix this, and I volunteered. Silly me.

Long story short, after making progress on a related issue, and then avoiding the project for more than a month, I've worked the 0-dimensional analogue, where we approximate the time interval by a sequence of (equally-spaced) discrete points. If you would like to read about it, the paper is available here.

Update: The second, and likely final, installment is now available. It concludes with the gluing rules for arbitrary perturbative quantum field theories.

27 December 2007

On Presidential Primaries

One of the crimes of our current electoral system (the United States picks its President by about as undemocratic a method as imaginable) is that the same four million people pick the candidates for both major parties every election cycle.

Wikipedia lists the fifty states' and six non-state U.S. territories' populations, based on official estimates by the U.S. Census Bureau. In light of next month's high-profile primary elections, the numbers are remarkable.

Elections are very expensive: candidates for primaries must go door-to-door, and raise money for media buys. It would be highly undemocratic for California to be the first primary. Eventually, the nominee must win California, but s/he should not have to compete there until most of the candidates have been culled from the game, and fundraising is funneled to only a few folks who've had lots of free media. Indeed, if California were the first primary, you can be sure that only the candidates who come into the game with big names and lots of personal money would be competitive. (Perhaps, of course, this is better? I wouldn't mind a system that rewarded career politicians who worked their way to the top after long terms in the Senate, Cabinet, and governors' mansions.) So we must, if we are to have some semblance of a meritocratic democracy, begin the nominating contest in small states.

With that said, the current choices of Iowa, New Hampshire, and South Carolina cannot be justified.

Iowa, with just shy of three million people, is not large. It has five members of the House of Representatives, and its population is largely white, pro-farm, and anti-immigrant. New Hampshire is legitimately small: it has only 1.3 million people and two representatives in Congress. South Carolina, on the other hand, houses more than four million people, and is the only early primary with a sizable non-white population.

But Iowa is larger than twenty other states (and the District of Columbia), any of which would make a reasonable first-in-the-nation primary. From smallest to largest, they are:

Wyoming, DC, Vermont, North Dakota, Alaska, South Dakota, Delaware, Montana, Rhode Island, Hawaii, New Hampshire, Maine, Idaho, Nebraska, West Virginia, New Mexico, Nevada, Utah, Kansas, Arkansas, and Mississippi.

These twenty-one voting districts represent the full range of American demographics. DC is urban, Black, and overwhelmingly Democratic. Hawaii is majority Asian; New Mexico has a large Mexican immigrant population. Many of the small states are rural and conservative, while others, like Rhode Island and Delaware, easily represent the metropolitan Northeast. Vermont has a bizarre politics all its own.

The 2008 election cycle will begin with the same four million people that have started the game for the last twenty years. Four million people whom the candidates, when they are in their most pandering mode, call "uniquely qualified" to pick a candidate. Are the rest of us stupid and uneducated? Are North Dakotans or Hawaiians somehow less able to consider the candidates' experiences and abilities? A democratic system should not enfranchise only those who are educated, or moneyed, or otherwise "qualified" to vote.

In a better system, a bipartisan, independent commission, taking input from major party leaders, would select the Primary order each cycle. Their selections should, by law or policy, begin always with a couple small states — say, two states from the twenty smallest. Criteria should include some complementarity: one liberal, urban population, with one rural conservative one, say. And, most importantly, their selections should be different from year to year. In one cycle, Vermont and Idaho start off; in another, Utah and Delaware.

There have been many complaints this year about the election starting too early. It has been an expensive year (see, for instance, the listings at opensecrets.org): Clinton and Obama have each raised close to one hundred million dollars and spent forty million of it; Romney has raised and spent around sixty million. The general election will probably cost each party around a billion. And it's exhausting, and a distraction from the important work of running the country. Oregon's Governor, for instance, is out in Iowa stumping for Clinton, rather than doing his gubernatorial work. Will the 2012 election begin similarly early? It wouldn't if the Primary committee waited to announce the order, and perhaps was empowered also to announce the dates.

A system like this — or, indeed, any proposal to reform the primaries — would most likely require, in order to go into effect, the commitment of both parties as well as Congressional involvement. That's a tough order, but a necessary one if we are going to have a truly democratic democracy.

13 December 2007

Linear Differential Equations

In the calculus class I'm TAing, we spent some time learning how "the method of undetermined coefficients" could be used to solve linear differential equations. I have never taken a first-year differential equations class, so although I'd solved many differential equations this way, I had never really thought about such methods with any real theory. My goal in this entry is to describe the method and explain why it works, using more sophisticated ideas than in a first-year class, but still remaining very elementary. I had hoped to have this written and posted weeks ago; as it is, I'm writing it while proctoring the final exam for the class.

First, let me remind you of the set-up of the problem. We are trying to solve a non-homogeneous differential equation with constant coefficients:

$$ a_n y^{(n)} + \dots + a_1 y' + a_0 y = g(x) $$

I will assume that $a_n$ is not 0; then the left-hand side defines a linear operator $D$ on the space of functions, with $n$-dimensional kernel.

We can diagonalize this operator by Fourier-transforming: if $y = e^{rx}$, then $D[y] = a(r) e^{rx}$, where $a(r)$ is the polynomial $a_n r^n + \dots + a_1 r + a_0$. If $a(r)$ has no repeated roots, then we can immediately read off a basis for the kernel as $e^{rx}$ for $r$ ranging over the $n$ roots. If there is a repeated root, then $a'(r)$ and $a(r)$ share a common factor; $a'(r)$ corresponds to the operator

$$ E[y] = a_n n y^{(n-1)} + \dots + 2 a_2 y' + a_1 y $$

Then, since $\frac{d^k}{dx^k} [x y] = x y^{(k)} + k y^{(k-1)}$, we see that

$$ D[x y] = x D[y] + E[y] $$

so $r$ is a repeated root of $a(r)$ if and only if $e^{rx}$ and $x e^{rx}$ are zeros of $D$.

More generally, linear differential operators with constant coefficients satisfy a Leibniz rule:

$$ D[p(x) e^{rx}] = \left( p(x) a(r) + p'(x) a'(r) + \frac{1}{2!} p''(x) a''(r) + \dots + \frac{1}{n!} p^{(n)}(x) a^{(n)}(r) \right) e^{rx} $$

for polynomial $p(x)$.
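
This rule is easy to test by machine. The Python sketch below applies $D$ to $p(x) e^{rx}$ in two ways: directly, using $\frac{d}{dx}[p e^{rx}] = (p' + rp) e^{rx}$, and through the Taylor-type sum $\sum_k \frac{1}{k!} a^{(k)}(r) p^{(k)}(x)$, with the factors $1/k!$ written out. Polynomials are stored as coefficient lists (constant term first); that representation, the helper names, and the sample operator are all illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [u + v for u, v in zip(p, q)]

def poly_scale(c, p):
    return [c * u for u in p]

def poly_diff(p):
    """Derivative of p[0] + p[1] x + p[2] x^2 + ..."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def poly_eval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def apply_directly(a, r, p):
    """Polynomial q with D[p e^{rx}] = q e^{rx}, for D = sum of a[j] d^j/dx^j."""
    result, current = [0], list(p)
    for coeff in a:
        result = poly_add(result, poly_scale(coeff, current))
        # one more derivative of (current) e^{rx}
        current = poly_add(poly_diff(current), poly_scale(r, current))
    return trim(result)

def apply_by_leibniz(a, r, p):
    """The same polynomial via the sum over k of a^{(k)}(r) p^{(k)}(x) / k!."""
    result, a_k, p_k = [0], list(a), list(p)
    for k in range(len(a)):
        weight = Fraction(poly_eval(a_k, r), factorial(k))
        result = poly_add(result, poly_scale(weight, p_k))
        a_k, p_k = poly_diff(a_k), poly_diff(p_k)
    return trim(result)

# Hypothetical example: D[y] = y''' + 2y'' + y, r = 1/2, p(x) = 1 + x^3.
a = [1, 0, 2, 1]
r = Fraction(1, 2)
p = [Fraction(1), Fraction(0), Fraction(0), Fraction(1)]
assert apply_directly(a, r, p) == apply_by_leibniz(a, r, p)
```

The case $D = d^2/dx^2$, $r = 0$, $p = x^2$ shows why the factorials matter: the direct answer is 2, and $p''(x) a''(0) = 4$ needs the factor $1/2!$.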

Thus, our ability to solve linear homogeneous differential equations with constant coefficients depends exactly on our ability to factor polynomials; for example, we can always solve second-order equations, by using the quadratic formula.

But, now, what if $g(x) \neq 0$? I will assume that, through whatever conniving factorization methods we use, we have found the entire kernel of $D$. Then our problem will be solved if we can find one solution to $D[y] = g(x)$; the set of all solutions is the affine space through this function parallel to the kernel. In calculus, we write this observation as "$y_g = y_c + y_p$" where g, c, and p stand for "general", "[c]homogeneous", and "particular", respectively.

For a general $g$, we can continue to work in the Fourier basis: Fourier transform, sending $D[y] \mapsto a(r)$, then divide and integrate to transform back. This may miss some solutions at the poles, and is computationally difficult: integrating is as hard as factoring polynomials. For second-order equations, we can get to "just an integral" via an alternative method, by judiciously choosing how to represent functions in terms of the basis of solutions for $D[y]=0$.

But for many physical problems, $g(x)$ is an especially simple function (and, of course, it can always be Fourier-transformed into one).

In particular, let's say that $g(x)$ is, for example, a sum of products of exponential (and sinusoidal) and polynomial functions. I.e. let's say that $g(x)$ is a solution to some homogeneous linear constant-coefficient equation $C[g(x)] = 0$. Another way of saying this: let's say that the space spanned by all the derivatives $g$, $g'$, $g''$, etc., is finite-dimensional. If it has dimension $m$, then $g^{(m)}(x)$ is a linear combination of lower-order derivatives, and thus I can find a minimal (order) $C$ of degree $m$ so that $C[g] = 0$. By the Leibniz rule, functions with this property form a ring. When I add two functions, the dimensions of their derivative spaces no more than add; when I multiply, the dimensions no worse than multiply. Indeed, by the earlier discussion, we have an exact description of such functions: they are precisely linear combinations of functions of the form $x^s e^{rx}$ ($s$ a non-negative integer).

In any case, let's say $g$ is of this form, i.e. we have $C$ (with degree $m$) minimal such that $C[g] = 0$, and first let's assume that the kernels of $C$ and $D$ do not intersect. Then $D$ acts as a one-to-one linear operator on the finite-dimensional space $\ker C$, which by construction is spanned by the derivatives of $g$, and so must be onto. I.e. there is a unique point $y_p(x) = b_0 g(x) + b_1 g'(x) + \dots + b_{m-1} g^{(m-1)}(x) \in \ker C$ so that $D[y_p] = g$. Finding it requires only solving the system of linear equations in the $b_i$.
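
To make the linear system concrete, here is a small Python sketch for one example of my own choosing (not from the class): $y'' - 3y' + 2y = x e^{3x}$. Here $\ker C$ is spanned by $g$ and $g'$, and $r = 3$ is not a root of $a(r) = r^2 - 3r + 2$, so the kernels do not intersect. Functions $p(x) e^{rx}$ are stored as coefficient lists for $p$; that representation and the helper names are assumptions of the sketch.

```python
from fractions import Fraction

def diff_times_exp(p, r):
    """Coefficients of q where d/dx [p(x) e^{rx}] = q(x) e^{rx} (same length as p)."""
    shifted = [k * p[k] for k in range(1, len(p))] + [0]
    return [d + r * c for d, c in zip(shifted, p)]

def apply_operator(a, p, r):
    """Coefficients of D[p e^{rx}] / e^{rx} for D = sum of a[j] d^j/dx^j."""
    total, current = [Fraction(0)] * len(p), list(p)
    for coeff in a:
        total = [t + coeff * c for t, c in zip(total, current)]
        current = diff_times_exp(current, r)
    return total

def solve(columns, target):
    """Solve sum_i b[i] * columns[i] = target by Gauss-Jordan elimination.

    Assumes the square system is invertible (the non-resonant case).
    """
    n = len(target)
    rows = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for i in range(n):
            if i != col:
                factor = rows[i][col]
                rows[i] = [v - factor * w for v, w in zip(rows[i], rows[col])]
    return [rows[i][n] for i in range(n)]

a = [2, -3, 1]                                  # a_0, a_1, a_2
r = 3
g = [Fraction(0), Fraction(1)]                  # g(x) = (0 + 1*x) e^{3x}
derivatives = [g, diff_times_exp(g, r)]         # g and g' span ker C
images = [apply_operator(a, d, r) for d in derivatives]
b = solve(images, g)                            # y_p = b[0] g + b[1] g'
y_p = [sum(b[i] * derivatives[i][k] for i in range(2)) for k in range(2)]
```

In this example the system gives $b_0 = 11/4$ and $b_1 = -3/4$, i.e. $y_p = (x/2 - 3/4) e^{3x}$, which can be confirmed by substituting back into the equation.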

If, however, $\ker C$ and $\ker D$ intersect, then we will not generically be able to solve this system of linear equations. Because $C$ is minimal, $g$ is not a linear combination of fewer than $m$ of its derivatives; if $D$ sends (linear combinations of) some of those derivatives to 0, we will never be able to get a map onto $g$. Let me say this again. If $D$ does not act one-to-one on $\ker C$, but $g$ is in the range of this matrix, then $g$ is in a smaller-than-$m$-dimensional space closed under a differential operator; thus, there is a differential operator of lower-than-$m$ degree that annihilates $g$.

How can we then solve the equation? By the Leibniz rule, we observed earlier, $(xg)' = g + x g'$, and so

$$\frac{d^k}{dx^k}[x g(x)] = x g^{(k)}(x) + k g^{(k-1)}(x)$$

Then $C[xg]$ is a linear combination of derivatives of $g$; i.e. $C[xg] \in \ker C$. If we take the space spanned by the derivatives of $x g(x)$, it is one dimension larger than $\ker C$. We can repeat this trick ---

$$ \frac{d^k}{dx^k}[ x^p g(x) ] = \sum_{i=0}^k \frac{k!p!}{i!(k-i)!(p-k+i)!} x^{p-k+i} g^{(i)}(x) $$

--- and eventually get a space that's $n+m$ dimensional, containing, among other things, $g$, and closed under differentiation. The kernel of $D$ in this larger space is at most $n$ dimensional (since $n = \dim \ker D$ in the space of all functions), and so the range is $m$-dimensional, and must contain $g$: the system of linear equations is solvable.
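
Here is the smallest resonant example I can think of, as a Python sketch: $y'' - 3y' + 2y = e^{x}$, where $r = 1$ is a root of $a(r) = r^2 - 3r + 2$, so $D$ kills $g$ itself. Enlarging the space by the single function $x e^{x}$ is already enough. The example, the coefficient-list representation of $p(x) e^{rx}$, and the helper names are illustrative assumptions.

```python
from fractions import Fraction

def diff_times_exp(p, r):
    """Coefficients of q where d/dx [p(x) e^{rx}] = q(x) e^{rx} (same length as p)."""
    shifted = [k * p[k] for k in range(1, len(p))] + [0]
    return [d + r * c for d, c in zip(shifted, p)]

def apply_operator(a, p, r):
    """Coefficients of D[p e^{rx}] / e^{rx} for D = sum of a[j] d^j/dx^j."""
    total, current = [Fraction(0)] * len(p), list(p)
    for coeff in a:
        total = [t + coeff * c for t, c in zip(total, current)]
        current = diff_times_exp(current, r)
    return total

a, r = [2, -3, 1], 1                       # y'' - 3y' + 2y, with g(x) = e^{x}
g = [Fraction(1), Fraction(0)]             # coefficients of (1 + 0*x) e^{x}

image_of_g = apply_operator(a, g, r)       # D[e^x] is zero, so b_0 g cannot work
x_g = [Fraction(0), Fraction(1)]           # x e^{x}, the one extra dimension
image_of_x_g = apply_operator(a, x_g, r)   # D[x e^x] = a'(1) e^x = -e^x

# Solve c * D[x e^x] = g for the single unknown c.
c = g[0] / image_of_x_g[0]
y_p = [c * v for v in x_g]                 # y_p = -x e^{x}
```

This is the familiar "multiply your guess by $x$" rule from the calculus class, seen as enlarging $\ker C$ until $D$ maps onto $g$.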

Of course, we often can stop before getting all the way to $n$ extra dimensions. But so long as we are only interested in functions that are zeros of constant linear differential operators, then we can always solve differential equations. For example, every linear equation from a physics class, and almost every one in first-year calculus, is solvable with this method.

One final remark:

Composition of differential operators follows matrix multiplication, and hence yields differential operators. If $g$ satisfies $C[g] = 0$, and if we're trying to solve $D[y]=g$, then we might decide to solve more generally $CD[y] = 0$. The operator $CD$ has order $n+m$, so its kernel is $(n+m)$-dimensional, and if we're truly gifted at factoring polynomials, then we can solve it directly. Then the solutions to our original equation must be in this kernel.

14 October 2007

Divergent Series take 1

The following talk is significantly too long. Some parenthetical remarks are easy enough to excise, but what else should I drop?

The talk is available here (pdf). I will give it on Thursday at "Many Cheerful Facts", a brown-bag student-organized talk series in which different graduate students present general-audience material: if you know the subject already, you won't learn anything in the talk. Someone bakes something tasty each week.

Abstract: Whereas modern physicists write down divergent series all the time, mathematicians through the ages have been variously terrified or only mildly scared of such sums. In this talk, I will survey the most important methods of summing divergent series, and make general vague remarks about them. I will quote many results, but will studiously avoid proving anything.