One of the crimes of our current electoral system — the United States picks its President in about as undemocratic a method imaginable — is that the same four million people pick the candidates for both major parties every year.

Wikipedia lists the fifty states' and six non-state U.S. territories' populations, based on official estimates by the U.S. Census Bureau. In light of next month's high-profile primary elections, the numbers are remarkable.

Elections are very expensive: candidates for primaries must go door-to-door, and raise money for media buys. It would be highly undemocratic for California to be the first primary. Eventually, the nominee must win California, but s/he should not have to compete there until most of the candidates have been culled from the game, and fundraising is funneled to only a few folks who've had lots of free media. Indeed, if California was the first primary, you can be sure that only the candidates who come into the game with big names and lots of personal money would be competitive. (Perhaps, of course, this is better? I wouldn't mind a system that rewarded career politicians who worked their way to the top after long terms in the Senate, Cabinet, and state Governor mansions.) So we must, if we are to have some semblance of a meritocratic democracy, begin the nominating contest in small states.

With that said, the current choices of Iowa, New Hampshire, and South Carolina cannot be justified.

Iowa, with just shy of three million people, is not large. It has five members of the House of Representatives, and its population is largely white, pro-farm, and anti-immigrant. New Hampshire is legitimately small: it has only 1.3 million people and two representative in Congress. South Carolina, on the other hand, houses more than four million people, and is the only early primary with a sizable non-white population.

But Iowa is larger than twenty other states (and the District of Columbia), any of whom would make a reasonable first-in-the-nation primary. From smallest to largest, they are:

Wyoming, DC, Vermont, North Dakota, Alaska, South Dakota, Delaware, Montana, Rhode Island, Hawaii, New Hampshire, Maine, Idaho, Nebraska, West Virginia, New Mexico, Nevada, Utah, Kansas, Arkansas, and Mississippi.

These twenty one voting districts represent the full range of American demographics. DC is urban, Black, and overwhelmingly Democratic. Hawaii is majority Asian; New Mexico has a large Mexican immigrant population. Many of the small states are rural and conservative, while others, like Rhode Island and Delaware, easily represent the metropolitan North East. Vermont has a bizarre politics all to its own.

The 2008 election cycle will begin with the same four million people that have started the game for the last twenty years. Four million people whom the candidates, when they are in their most pandering mode, call "uniquely qualified" to pick a candidate. Are the rest of us stupid and uneducated? Are North Dakotans or Hawaiians somehow less able to consider the candidates' experiences and abilities? A democratic system should not enfranchise only those who are educated, or moneyed, or otherwise "qualified" to vote.

In a better system, a bipartisan, independent commission, taking input from major party leaders, would select the Primary order each cycle. Their selections should, by law or policy, begin always with a couple small states — say, two states from the twenty smallest. Criteria should include some complementarity: one liberal, urban population, with one rural conservative one, say. And, most importantly, their selections should be different from year to year. In one cycle, Vermont and Idaho start off; in another, Utah and Delaware.

There have been many complaints this year about the election starting too early. It has been an expensive year (see, for instance, the listings at opensecrets.org): Clinton and Obama have each raised close to one hundred million dollars and spend forty million of it; Romney has raised and spent around sixty million. The general election will probably cost each party around a billion. And it's exhausting, and a distraction from the important work of running the country. Oregon's Governor, for instance, is out in Iowa stumping for Clinton, rather than doing his gubernatorial work. Will the 2012 election begin similarly early? It wouldn't if the Primary committee waited to announce the order, and perhaps was empowered also to announce the dates.

A system like this — or, indeed, any proposal to reform the primaries — would most likely require, in order to go into effect, the commitment of both parties as well as Congressional involvement. That's a tough order, but a necessary one if we are going to have a truly democratic democracy.

## 27 December 2007

## 13 December 2007

### Linear Differential Equations

In the calculus class I'm TAing, we spent some time learning how "the method of undetermined coefficients" could be used to solve linear differential equations. I have never taken a first-year differential equations class, so although I'd solved many differential equations this way, I had never really though about such methods with any real theory. My goal in this entry is to describe the method and explain why it works, using more sophisticated ideas than in a first-year class, but still remaining very elementary. I had hoped to have this written and posted weeks ago; as it is, I'm writing it while proctoring the final exam for the class.

First, let me remind you of the set-up of the problem. We are trying to solve a non-homogeneous differential equation with constant coefficients:

$$ a_n y^{(n)} + \dots + a_1 y' + a_0 y = g(x) $$

I will assume that $a_n$ is not 0; then the left-hand side defines a linear operator $D$ of on the space of functions, with $n$-dimensional kernel.

We can diagonalize this operator by Fourier-transforming: if $y = e^{rx}$, then $D[y] = a(r) e^{rx}$, where $a(r)$ is the polynomial $a_n r^n + \dots + a_1 r + a_0$. If $a(r)$ has no repeated roots, then we can immediately read off a basis for the kernel as $e^{rx}$ for $r$ ranging over the $n$ roots. If there is a repeated root, then $a'(r)$ and $a(r)$ share a common factor; $a'(r)$ corresponds to the operator

$$ E[y] = a_n n y^{(n-1)} + \dots + a_1 $$

Then, since $\frac{d^k}{dx^k} [x y] = x y^{(k)} + k y^{(k-1)}$, we see that

$$ D[x y] = x D[y] + E[y] $$

so $r$ is a repeated root of $a(r)$ if and only if $e^{rx}$ and $x e^{rx}$ are zeros of $D$.

More generally, linear differential operators with constant coefficients satisfy a Leibniz rule:

$$ D[p(x) e^{rx}] = \left( p(x) a(r) + p'(x) a'(r) + \dots + p^{(n)}(x) a^{(n)}(r) \right) e^{rx} $$

for polynomial $p(x)$.

Thus, our ability to solve linear homogeneous differential equations with constant coefficients depends exactly on our ability to factor polynomials; for example, we can always solve second-order equations, by using the quadratic formula.

But, now, what if $g(x) \neq 0$? I will assume that, through whatever conniving factorization methods we use, we have found the entire kernel of $D$. Then our problem will be solved if we can find one solution to $D[y] = g(x)$; all other solutions are the space through this function parallel to the kernel. In calculus, we write this observation as "$y_g = y_c + y_p$" where g, c, and p stand for "general", "[c]homogenous", and "particular", respectively.

For a general $g$, we can continue to work in the Fourier basis: Fourier transform, sending $D[y] \mapsto a(r)$, then divide and integrate to transform back. This may miss some solutions at the poles, and is computationally difficult: integrating is as hard as factoring polynomials. For second-order equations, we can get to "just an integral" via an alternative method, by judiciously choosing how to represent functions in terms of the basis of solutions for $D[y]=0$.

But for many physical problems, $g(x)$ is an especially simple function (and, of course, it can always be Fourier-transformed into one).

In particular, let's say that $g(x)$ is, for example, a sum of products of exponential (and sinosoidal) and polynomial functions. I.e. let's say that $g(x)$ is a solution to some homogeneous linear constant-coefficient $C[g(x)] = 0$. Another way of saying this: let's say that the space spanned by all the derivatives $g$, $g'$, $g''$, etc., is finite-dimensional. If it has dimension $= m$, then $g^{(m)}(x)$ is a linear combination of lower-order derivatives, and thus I can find a minimal (order) $C$ of degree $m$ so that $C[g] = 0$. By the Leibniz rule, functions with this property form a ring. When I add two functions, the dimensions of their derivative spaces no more than add; when I multiply, the dimensions no worse than multiply. Indeed, by the earlier discussion, we have an exact description of such functions: they are precisely sums of products of $x^s e^{rx}$ ($s$ an non-negative integer).

In any case, let's say $g$ is of this form, i.e. we have $C$ (with degree $m$) minimal such that $C[g] = 0$, and first let's assume that the kernels of $C$ and $D$ do not intersect. Then $D$ acts as a one-to-one linear operator on finite-dimensional space $\ker C$, which by construction is spanned by the derivatives of $g$, and so must be onto. I.e. there is a unique point $y_p(x) = b_0 g(x) + b_1 g'(x) + \dots + b_{m-1} g^{(m-1)}(x) \in \ker C$ so that $D[y_p] = g$. Finding it requires only solving the system of linear equations in the $b_i$.

If, however, $\ker C$ and $\ker D$ intersect, then we will not generically be able to solve this system of linear equations. Because $C$ is minimal, $g$ is not a linear combination of fewer than $m$ of its derivatives; if $D$ sends (linear combinations of) some of those derivatives to 0, we will never be able to get a map onto $g$. Let me say this again. If $D$ does not act one-to-one on $\ker C$, but $g$ is in the range of this matrix, then $g$ is in a smaller-than-$m$-dimensional space closed under a differential operator; thus, there is a differential operator of lower-than-$m$ degree that annihilates $g$.

How can we then solve the equation? By the Leibniz rule, we observed earlier, $(xg)' = g + x g'$, and so

$$\frac{d^k}{dx^k}[x g(x)] = x g^{(k)}(x) + k g^{(k-1)}(x)$$

Then $C[xg]$ is a linear combination of derivatives of $g$; i.e. $C[xg] \in \ker C$. If we take the space spanned by the derivatives of $x g(x)$, it is one dimension larger than $\ker C$. We can repeat this trick ---

$$ \frac{d^k}{dx^k}[ x^p g(x) ] = \sum_{i=0}^k \frac{k!p!}{i!(k-i)!(p-k+i)!} x^{p-k+i} g^{(i)}(x) $$

--- and eventually get a space that's $n+m$ dimensional, containing, among other things, $g$, and closed under differentiation. The kernel of $D$ in this larger space is at most $n$ dimensional (since $n = \dim \ker D$ in the space of all functions), and so the range is $m$-dimensional, and must contain $g$: the system of linear equations is solvable.

Of course, we often can stop before getting all the way to $n$ extra dimensions. But so long as we are only interested in functions that are zeros of constant linear differential operators, then we can always solve differential equations. For example, every linear equation from a physics class, and almost every one in first-year calculus, is solvable with this method.

One final remark:

Composition of differential operators follows matrix multiplication, and hence yields differential operators. If $g$ satisfies $C[g] = 0$, and if we're trying to solve $D[y]=g$, then we might decide to solve more generally $CD[y] = 0$. The left-hand-side is $n+m$ dimensional, and if we're truly gifted at factoring polynomials, then we can solve it directly. Then the solutions to our original equation must be in this kernel.

First, let me remind you of the set-up of the problem. We are trying to solve a non-homogeneous differential equation with constant coefficients:

$$ a_n y^{(n)} + \dots + a_1 y' + a_0 y = g(x) $$

I will assume that $a_n$ is not 0; then the left-hand side defines a linear operator $D$ of on the space of functions, with $n$-dimensional kernel.

We can diagonalize this operator by Fourier-transforming: if $y = e^{rx}$, then $D[y] = a(r) e^{rx}$, where $a(r)$ is the polynomial $a_n r^n + \dots + a_1 r + a_0$. If $a(r)$ has no repeated roots, then we can immediately read off a basis for the kernel as $e^{rx}$ for $r$ ranging over the $n$ roots. If there is a repeated root, then $a'(r)$ and $a(r)$ share a common factor; $a'(r)$ corresponds to the operator

$$ E[y] = a_n n y^{(n-1)} + \dots + a_1 $$

Then, since $\frac{d^k}{dx^k} [x y] = x y^{(k)} + k y^{(k-1)}$, we see that

$$ D[x y] = x D[y] + E[y] $$

so $r$ is a repeated root of $a(r)$ if and only if $e^{rx}$ and $x e^{rx}$ are zeros of $D$.

More generally, linear differential operators with constant coefficients satisfy a Leibniz rule:

$$ D[p(x) e^{rx}] = \left( p(x) a(r) + p'(x) a'(r) + \dots + p^{(n)}(x) a^{(n)}(r) \right) e^{rx} $$

for polynomial $p(x)$.

Thus, our ability to solve linear homogeneous differential equations with constant coefficients depends exactly on our ability to factor polynomials; for example, we can always solve second-order equations, by using the quadratic formula.

But, now, what if $g(x) \neq 0$? I will assume that, through whatever conniving factorization methods we use, we have found the entire kernel of $D$. Then our problem will be solved if we can find one solution to $D[y] = g(x)$; all other solutions are the space through this function parallel to the kernel. In calculus, we write this observation as "$y_g = y_c + y_p$" where g, c, and p stand for "general", "[c]homogenous", and "particular", respectively.

For a general $g$, we can continue to work in the Fourier basis: Fourier transform, sending $D[y] \mapsto a(r)$, then divide and integrate to transform back. This may miss some solutions at the poles, and is computationally difficult: integrating is as hard as factoring polynomials. For second-order equations, we can get to "just an integral" via an alternative method, by judiciously choosing how to represent functions in terms of the basis of solutions for $D[y]=0$.

But for many physical problems, $g(x)$ is an especially simple function (and, of course, it can always be Fourier-transformed into one).

In particular, let's say that $g(x)$ is, for example, a sum of products of exponential (and sinosoidal) and polynomial functions. I.e. let's say that $g(x)$ is a solution to some homogeneous linear constant-coefficient $C[g(x)] = 0$. Another way of saying this: let's say that the space spanned by all the derivatives $g$, $g'$, $g''$, etc., is finite-dimensional. If it has dimension $= m$, then $g^{(m)}(x)$ is a linear combination of lower-order derivatives, and thus I can find a minimal (order) $C$ of degree $m$ so that $C[g] = 0$. By the Leibniz rule, functions with this property form a ring. When I add two functions, the dimensions of their derivative spaces no more than add; when I multiply, the dimensions no worse than multiply. Indeed, by the earlier discussion, we have an exact description of such functions: they are precisely sums of products of $x^s e^{rx}$ ($s$ an non-negative integer).

In any case, let's say $g$ is of this form, i.e. we have $C$ (with degree $m$) minimal such that $C[g] = 0$, and first let's assume that the kernels of $C$ and $D$ do not intersect. Then $D$ acts as a one-to-one linear operator on finite-dimensional space $\ker C$, which by construction is spanned by the derivatives of $g$, and so must be onto. I.e. there is a unique point $y_p(x) = b_0 g(x) + b_1 g'(x) + \dots + b_{m-1} g^{(m-1)}(x) \in \ker C$ so that $D[y_p] = g$. Finding it requires only solving the system of linear equations in the $b_i$.

If, however, $\ker C$ and $\ker D$ intersect, then we will not generically be able to solve this system of linear equations. Because $C$ is minimal, $g$ is not a linear combination of fewer than $m$ of its derivatives; if $D$ sends (linear combinations of) some of those derivatives to 0, we will never be able to get a map onto $g$. Let me say this again. If $D$ does not act one-to-one on $\ker C$, but $g$ is in the range of this matrix, then $g$ is in a smaller-than-$m$-dimensional space closed under a differential operator; thus, there is a differential operator of lower-than-$m$ degree that annihilates $g$.

How can we then solve the equation? By the Leibniz rule, we observed earlier, $(xg)' = g + x g'$, and so

$$\frac{d^k}{dx^k}[x g(x)] = x g^{(k)}(x) + k g^{(k-1)}(x)$$

Then $C[xg]$ is a linear combination of derivatives of $g$; i.e. $C[xg] \in \ker C$. If we take the space spanned by the derivatives of $x g(x)$, it is one dimension larger than $\ker C$. We can repeat this trick ---

$$ \frac{d^k}{dx^k}[ x^p g(x) ] = \sum_{i=0}^k \frac{k!p!}{i!(k-i)!(p-k+i)!} x^{p-k+i} g^{(i)}(x) $$

--- and eventually get a space that's $n+m$ dimensional, containing, among other things, $g$, and closed under differentiation. The kernel of $D$ in this larger space is at most $n$ dimensional (since $n = \dim \ker D$ in the space of all functions), and so the range is $m$-dimensional, and must contain $g$: the system of linear equations is solvable.

Of course, we often can stop before getting all the way to $n$ extra dimensions. But so long as we are only interested in functions that are zeros of constant linear differential operators, then we can always solve differential equations. For example, every linear equation from a physics class, and almost every one in first-year calculus, is solvable with this method.

One final remark:

Composition of differential operators follows matrix multiplication, and hence yields differential operators. If $g$ satisfies $C[g] = 0$, and if we're trying to solve $D[y]=g$, then we might decide to solve more generally $CD[y] = 0$. The left-hand-side is $n+m$ dimensional, and if we're truly gifted at factoring polynomials, then we can solve it directly. Then the solutions to our original equation must be in this kernel.

Subscribe to:
Posts (Atom)