The Unapologetic Mathematician

Mathematics for the interested outsider

Cramer’s Rule

We’re trying to invert a function f:X\rightarrow\mathbb{R}^n which is continuously differentiable on some region X\subseteq\mathbb{R}^n. That is, we know that if a is a point where J_f(a)\neq0, then there is a ball N around a on which f is one-to-one, mapping N onto some neighborhood f(N) of f(a). Then if y is a point in f(N), we’ve got a system of equations

\displaystyle f^j(x^1,\dots,x^n)=y^j

that we want to solve for all the x^i.

We know how to handle this if f is defined by a linear transformation, represented by a matrix A=\left(a_i^j\right):

\displaystyle\begin{aligned}f^j(x^1,\dots,x^n)=a_i^jx^i&=y^j\\Ax&=y\end{aligned}

In this case, the differential of f is just the linear transformation f itself, and so the Jacobian determinant \det\left(a_i^j\right) is nonzero if and only if the matrix A is invertible. And so our solution depends on finding the inverse A^{-1} and solving

\displaystyle\begin{aligned}Ax&=y\\A^{-1}Ax&=A^{-1}y\\x&=A^{-1}y\end{aligned}

This is the approach we’d like to generalize. But to do so, we need a more explicit formula for the inverse.
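As a quick concrete illustration of the linear case, here’s a minimal numerical sketch in Python with NumPy; the matrix A and vector y are made-up values, and np.linalg.solve stands in for computing x=A^{-1}y:

```python
import numpy as np

# An invertible matrix A and a right-hand side y (illustrative values).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
y = np.array([3.0, 5.0, 3.0])

# Since det(A) != 0, the system Ax = y has the unique solution x = A^{-1}y.
assert abs(np.linalg.det(A)) > 1e-12
x = np.linalg.solve(A, y)  # numerically preferable to explicitly forming A^{-1}
assert np.allclose(A @ x, y)
```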

This is where Cramer’s rule comes in, and it starts by analyzing the way we calculate the determinant of a matrix A. This formula

\displaystyle\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)a_1^{\pi(1)}\dots a_n^{\pi(n)}

involves a sum over all the permutations \pi\in S_n, and we want to group these terms conveniently. If we fix an index i, every term contains exactly one entry from the ith column, and we can factor out each such entry:

\displaystyle\sum\limits_{j=1}^na_i^j\sum\limits_{\substack{\pi\in S_n\\\pi(i)=j}}\mathrm{sgn}(\pi)a_1^{\pi(1)}\dots\widehat{a_i^j}\dots a_n^{\pi(n)}

where the hat indicates that we omit the ith term in the product. For a given value of j, we can consider the restricted sum

\displaystyle A_j^i=\sum\limits_{\substack{\pi\in S_n\\\pi(i)=j}}\mathrm{sgn}(\pi)a_1^{\pi(1)}\dots\widehat{a_i^j}\dots a_n^{\pi(n)}

which is (-1)^{i+j} times the determinant of the i-j “minor” of the matrix A. That is, if we strike out the row and column of A which contain a_i^j and take the determinant of the remaining (n-1)\times(n-1) matrix, we multiply this by (-1)^{i+j} to get A_j^i. These are the entries in the “adjugate” matrix \mathrm{adj}(A).
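This description translates almost directly into code. Here’s a minimal Python sketch (0-indexed rows and columns; meant to mirror the definition, not to be efficient):

```python
import numpy as np

def cofactor(A, i, j):
    # (-1)**(i+j) times the determinant of the minor of A obtained by
    # striking out row i and column j (0-indexed).
    M = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(M)

def adjugate(A):
    # The adjugate is the transpose of the matrix of cofactors.
    n = A.shape[0]
    C = np.array([[cofactor(A, i, j) for j in range(n)] for i in range(n)])
    return C.T
```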

What we’ve shown is that

\displaystyle A_j^ia_i^j=\det(A)

(summing over the repeated index j, but not over i). It’s not hard to show, however, that if we use a different row from the adjugate matrix we find

\displaystyle\sum\limits_{j=1}^nA_j^ka_i^j=\det(A)\delta_i^k

That is, the adjugate times the original matrix is the determinant of A times the identity matrix. And so if \det(A)\neq0 we find

\displaystyle A^{-1}=\frac{1}{\det(A)}\mathrm{adj}(A)
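Continuing the sketch above, we can check both identities numerically for the same illustrative matrix:

```python
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
d = np.linalg.det(A)

# adj(A) A = det(A) I, and hence A^{-1} = adj(A)/det(A).
assert np.allclose(adjugate(A) @ A, d * np.eye(3))
assert np.allclose(adjugate(A) / d, np.linalg.inv(A))
```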

So what does this mean for our system of equations? We can write

\displaystyle\begin{aligned}x&=\frac{1}{\det(A)}\mathrm{adj}(A)y\\x^i&=\frac{1}{\det(A)}A_j^iy^j\end{aligned}

But how does this sum A_j^iy^j differ from the one A_j^ia_i^j we used before (without summing on i) to calculate the determinant of A? We’ve replaced the ith column of A by the column vector y, and so this is just another determinant, taken after performing this replacement!
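This column-replacement description is Cramer’s rule, and it translates directly into a solver. Here’s a sketch, continuing with the A and y from above (fine for small systems, though far too expensive and numerically delicate for large ones):

```python
def cramer_solve(A, y):
    # Solve Ax = y by Cramer's rule: x[i] is det(A with its i-th column
    # replaced by y), divided by det(A).
    d = np.linalg.det(A)
    if abs(d) < 1e-12:
        raise ValueError("matrix is (numerically) singular")
    n = A.shape[0]
    x = np.empty(n)
    for i in range(n):
        Ai = A.copy()
        Ai[:, i] = y           # replace the i-th column by y
        x[i] = np.linalg.det(Ai) / d
    return x

assert np.allclose(cramer_solve(A, y), np.linalg.solve(A, y))
```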

Here’s an example. Let’s say we’ve got a system written in matrix form

\displaystyle\begin{pmatrix}a&b\\c&d\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}u\\v\end{pmatrix}

The entry in the ith row and jth column of the adjugate matrix is calculated by striking out the ith column and jth row of our original matrix, taking the determinant of the remaining matrix, and multiplying by (-1)^{i+j}. We get

\displaystyle\begin{pmatrix}d&-b\\-c&a\end{pmatrix}

and thus we find

\displaystyle\begin{pmatrix}x\\y\end{pmatrix}=\frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix}\begin{pmatrix}u\\v\end{pmatrix}=\frac{1}{ad-bc}\begin{pmatrix}ud-bv\\av-uc\end{pmatrix}

where we note that

\displaystyle\begin{aligned}ud-bv&=\det\begin{pmatrix}u&b\\v&d\end{pmatrix}\\av-uc&=\det\begin{pmatrix}a&u\\c&v\end{pmatrix}\end{aligned}

In other words, our solution is given by ratios of determinants:

\displaystyle\begin{aligned}x&=\frac{\det\begin{pmatrix}u&b\\v&d\end{pmatrix}}{\det\begin{pmatrix}a&b\\c&d\end{pmatrix}}\\y&=\frac{\det\begin{pmatrix}a&u\\c&v\end{pmatrix}}{\det\begin{pmatrix}a&b\\c&d\end{pmatrix}}\end{aligned}

and similar formulae hold for larger systems of equations.
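As a sanity check, the solver sketched above reproduces these 2\times2 formulas for made-up values of a, b, c, d, u, v:

```python
a, b, c, d, u, v = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0
A2 = np.array([[a, b], [c, d]])
x, y = cramer_solve(A2, np.array([u, v]))
assert np.isclose(x, (u * d - b * v) / (a * d - b * c))
assert np.isclose(y, (a * v - u * c) / (a * d - b * c))
```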

November 17, 2009 Posted by | Algebra, Linear Algebra | 8 Comments

Another Lemma on Nonzero Jacobians

Sorry for the late post. I didn’t get a chance to get it up this morning before my flight.

Brace yourself. Just like last time, we’ve got a messy technical lemma about what happens when the Jacobian determinant of a function is nonzero.

This time we’ll assume that f:X\rightarrow\mathbb{R}^n is not merely continuous, but continuously differentiable on a region X\subseteq\mathbb{R}^n. We also assume that the Jacobian determinant J_f(a)\neq0 at some point a\in X. Then I say that there is some neighborhood N of a on which f is injective.

First, we take n points \{z_i\}_{i=1}^n in X and make a function of them

\displaystyle h(z_1,\dots,z_n)=\det\left(\frac{\partial f^i}{\partial x^j}\bigg\vert_{x=z_i}\right)

That is, we take the jth partial derivative of the ith component function and evaluate it at the ith sample point to get the (i,j) entry of a matrix, and then we take the determinant of this matrix. As a particular value, we have

\displaystyle h(a,\dots,a)=J_f(a)\neq0

Since each partial derivative is continuous, and the determinant is a polynomial in the matrix entries, this function is continuous where it’s defined. And so there’s some ball N around a so that if all the z_i are in N we have h(z_1,\dots,z_n)\neq0. We want to show that f is injective on N.
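To see the construction in action, here’s a small numerical sketch in Python; the map f, the base point a, and the sample points are all made-up choices for illustration, with the partial derivatives worked out by hand:

```python
import numpy as np

# Illustrative map f: R^2 -> R^2 with f(x1, x2) = (x1 + x2**2, x1*x2).
def grad_f(i, z):
    # Row of partial derivatives of the i-th component of f at the point z.
    x1, x2 = z
    return np.array([1.0, 2.0 * x2]) if i == 0 else np.array([x2, x1])

def h(samples):
    # Determinant of the matrix whose i-th row is the gradient of f^i
    # evaluated at the i-th sample point.
    return np.linalg.det(np.array([grad_f(i, z) for i, z in enumerate(samples)]))

a = np.array([1.0, 1.0])
print(h([a, a]))         # h(a, a) = J_f(a) = -1, which is nonzero
print(h([a + 0.01, a]))  # still nonzero for sample points near a
```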

So, let’s take two points x and y in N so that f(x)=f(y). Since the ball is convex, the line segment [x,y] is completely contained within N\subseteq X, and so we can bring the mean value theorem to bear. For each component function we can write

\displaystyle0=f^i(y)-f^i(x)=df^i(\xi_i)(y-x)=\frac{\partial f^i}{\partial x^j}\bigg\vert_{\xi_i}(y^j-x^j)

for some \xi_i in [x,y]\subseteq N (no summation here on i, though we do sum over j). But just like last time, we now have a linear system of equations, in the unknowns y^j-x^j, described by an invertible matrix. Here the matrix has determinant

\displaystyle\det\left(\frac{\partial f^i}{\partial x^j}\bigg\vert_{\xi_i}\right)=h(\xi_1,\dots,\xi_n)\neq0

which is nonzero because all the \xi_i are inside the ball N. Thus the only possible solution to this homogeneous system is y^j-x^j=0 for every j. And so if f(x)=f(y) for points within the ball N, we must have x=y, and thus f is injective on N.

November 17, 2009 Posted by | Analysis, Calculus | 2 Comments
