# The Unapologetic Mathematician

## Cramer’s Rule

We’re trying to invert a function $f:X\rightarrow\mathbb{R}^n$ which is continuously differentiable on some region $X\subseteq\mathbb{R}^n$. That is we know that if $a$ is a point where $J_f(a)\neq0$, then there is a ball $N$ around $a$ where $f$ is one-to-one onto some neighborhood $f(N)$ around $f(a)$. Then if $y$ is a point in $f(N)$, we’ve got a system of equations

$\displaystyle f^j(x^1,\dots,x^n)=y^j$

that we want to solve for all the $x^i$.

We know how to handle this if $f$ is defined by a linear transformation, represented by a matrix $A=\left(a_i^j\right)$:

\displaystyle\begin{aligned}f^j(x^1,\dots,x^n)=a_i^jx^i&=y^j\\Ax&=y\end{aligned}

In this case, the Jacobian transformation is just the function $f$ itself, and so the Jacobian determinant $\det\left(a_i^j\right)$ is nonzero if and only if the matrix $A$ is invertible. And so our solution depends on finding the inverse $A^{-1}$ and solving

\displaystyle\begin{aligned}Ax&=y\\A^{-1}Ax&=A^{-1}y\\x&=A^{-1}y\end{aligned}

This is the approach we’d like to generalize. But to do so, we need a more specific method of finding the inverse.

This is where Cramer’s rule comes in, and it starts by analyzing the way we calculate the determinant of a matrix $A$. This formula

$\displaystyle\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)a_1^{\pi(1)}\dots a_n^{\pi(n)}$

involves a sum over all the permutations $\pi\in S_n$, and we want to consider the order in which we add up these terms. If we fix an index $i$, we can factor out each matrix entry in the $i$th column:

$\displaystyle\sum\limits_{j=1}^na_i^j\sum\limits_{\substack{\pi\in S_n\\\pi(i)=j}}\mathrm{sgn}(\pi)a_1^{\pi(1)}\dots\widehat{a_i^j}\dots a_n^{\pi(n)}$

where the hat indicates that we omit the $i$th term in the product. For a given value of $j$, we can consider the restricted sum

$\displaystyle A_j^i=\sum\limits_{\substack{\pi\in S_n\\\pi(i)=j}}\mathrm{sgn}(\pi)a_1^{\pi(1)}\dots\widehat{a_i^j}\dots a_n^{\pi(n)}$

which is $(-1)^{i+j}$ times the determinant of the $i$-$j$ “minor” of the matrix $A$. That is, if we strike out the row and column of $A$ which contain $a_i^j$ and take the determinant of the remaining $(n-1)\times(n-1)$ matrix, we multiply this by $(-1)^{i+j}$ to get $A_j^i$. These are the entries in the “adjugate” matrix $\mathrm{adj}(A)$.

What we’ve shown is that

$\displaystyle A_j^ia_i^j=\det(A)$

(no summation on $i$). It’s not hard to show, however, that if we use a different row from the adjugate matrix we find

$\displaystyle\sum\limits_{j=1}^nA_j^ka_i^j=\det(A)\delta_i^k$

That is, the adjugate times the original matrix is the determinant of $A$ times the identity matrix. And so if $\det(A)\neq0$ we find

$\displaystyle A^{-1}=\frac{1}{\det(A)}\mathrm{adj}(A)$

So what does this mean for our system of equations? We can write

\displaystyle\begin{aligned}x&=\frac{1}{\det(A)}\mathrm{adj}(A)y\\x^i&=\frac{1}{\det(A)}A_j^iy^j\end{aligned}

But how does this sum $A_j^iy^j$ differ from the one $A_j^ia_i^j$ we used before (without summing on $i$) to calculate the determinant of $A$? We’ve replaced the $i$th column of $A$ by the column vector $y$, and so this is just another determinant, taken after performing this replacement!

Here’s an example. Let’s say we’ve got a system written in matrix form

$\displaystyle\begin{pmatrix}a&b\\c&d\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}u\\v\end{pmatrix}$

The entry in the $i$th row and $j$th column of the adjugate matrix is calculated by striking out the $i$th column and $j$th row of our original matrix, taking the determinant of the remaining matrix, and multiplying by $(-1)^{i+j}$. We get

$\displaystyle\begin{pmatrix}d&-b\\-c&a\end{pmatrix}$

and thus we find

$\displaystyle\begin{pmatrix}x\\y\end{pmatrix}=\frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix}\begin{pmatrix}u\\v\end{pmatrix}=\frac{1}{ad-bc}\begin{pmatrix}ud-bv\\av-uc\end{pmatrix}$

where we note that

\displaystyle\begin{aligned}ud-bv&=\det\begin{pmatrix}u&b\\v&d\end{pmatrix}\\av-uc&=\det\begin{pmatrix}a&u\\c&v\end{pmatrix}\end{aligned}

In other words, our solution is given by ratios of determinants:

\displaystyle\begin{aligned}x&=\frac{\det\begin{pmatrix}u&b\\v&d\end{pmatrix}}{\det\begin{pmatrix}a&b\\c&d\end{pmatrix}}\\y&=\frac{\det\begin{pmatrix}a&u\\c&v\end{pmatrix}}{\det\begin{pmatrix}a&b\\c&d\end{pmatrix}}\end{aligned}

and similar formulae hold for larger systems of equations.

November 17, 2009 Posted by | Algebra, Linear Algebra | 8 Comments

## Another Lemma on Nonzero Jacobians

Sorry for the late post. I didn’t get a chance to get it up this morning before my flight.

Brace yourself. Just like last time we’ve got a messy technical lemma about what happens when the Jacobian determinant of a function is nonzero.

This time we’ll assume that $f:X\rightarrow\mathbb{R}^n$ is not only continuous, but continuously differentiable on a region $X\subseteq\mathbb{R}^n$. We also assume that the Jacobian $J_f(a)\neq0$ at some point $a\in X$. Then I say that there is some neighborhood $N$ of $a$ so that $f$ is injective on $N$.

First, we take $n$ points $\{z_i\}_{i=1}^n$ in $X$ and make a function of them

$\displaystyle h(z_1,\dots,z_n)=\det\left(\frac{\partial f^i}{\partial x^j}\bigg\vert_{x=z_i}\right)$

That is, we take the $j$th partial derivative of the $i$th component function and evaluate it at the $i$th sample point to make a matrix $\left(a_{ij}\right)$, and then we take the determinant of this matrix. As a particular value, we have

$\displaystyle h(a,\dots,a)=J_f(a)\neq0$

Since each partial derivative is continuous, and the determinant is a polynomial in its entries, this function is continuous where it’s defined. And so there’s some ball $N$ of $a$ so that if all the $z_i$ are in $N$ we have $h(z_1,\dots,z_n)\neq0$. We want to show that $f$ is injective on $N$.

So, let’s take two points $x$ and $y$ in $N$ so that $f(x)=f(y)$. Since the ball is convex, the line segment $[x,y]$ is completely contained within $N\subseteq X$, and so we can bring the mean value theorem to bear. For each component function we can write

$\displaystyle0=f^i(y)-f^i(x)=df^i(\xi_i)(y-x)=\frac{\partial f^i}{\partial x^j}\bigg\vert_{\xi_i}(y^j-x^j)$

for some $\xi_i$ in $[x,y]\subseteq N$ (no summation here on $i$). But like last time we now have a linear system of equations described by an invertible matrix. Here the matrix has determinant

$\displaystyle\det\left(\frac{\partial f^i}{\partial x^j}\bigg\vert_{\xi_i}\right)=h(\xi_1,\dots,\xi_n)\neq0$

which is nonzero because all the $\xi_i$ are inside the ball $N$. Thus the only possible solution to the system of equations is $x^i=y^i$. And so if $f(x)=f(y)$ for points within the ball $N$, we must have $x=y$, and thus $f$ is injective.

November 17, 2009 Posted by | Analysis, Calculus | 2 Comments