We established that if we restrict to upper shears we can generate all upper-unipotent matrices. On the other hand, if we use all shears and scalings we can generate any invertible matrix we want (since swaps can be built from shears and scalings). We clearly can’t build any matrix whatsoever from shears alone, since every shear has determinant $1$ and so must any product of shears. But it turns out that we can use shears to generate any matrix of determinant $1$: those in the special linear group.
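First of all, let’s consider the following matrix equations, which should be easy to verify. Here $a$ is the amount of a shear and $x \neq 0$ is a scaling factor:
$$\begin{pmatrix}1&a&0\\0&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&1&0\\0&0&x\end{pmatrix}=\begin{pmatrix}1&0&0\\0&1&0\\0&0&x\end{pmatrix}\begin{pmatrix}1&a&0\\0&1&0\\0&0&1\end{pmatrix}$$
$$\begin{pmatrix}1&0&0\\a&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&1&0\\0&0&x\end{pmatrix}=\begin{pmatrix}1&0&0\\0&1&0\\0&0&x\end{pmatrix}\begin{pmatrix}1&0&0\\a&1&0\\0&0&1\end{pmatrix}$$
$$\begin{pmatrix}1&a\\0&1\end{pmatrix}\begin{pmatrix}x&0\\0&1\end{pmatrix}=\begin{pmatrix}x&0\\0&1\end{pmatrix}\begin{pmatrix}1&\frac{a}{x}\\0&1\end{pmatrix}$$
$$\begin{pmatrix}1&a\\0&1\end{pmatrix}\begin{pmatrix}1&0\\0&x\end{pmatrix}=\begin{pmatrix}1&0\\0&x\end{pmatrix}\begin{pmatrix}1&ax\\0&1\end{pmatrix}$$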
These show that we can always pull a scaling to the left past a shear. In the first two cases, the scaling and the shear commute if the row and column the scaling acts on are uninvolved in the shear. In the last two cases, we have to modify the shear in the process, but we end up with the scaling written to the left of a shear instead of to the right. We can use these toy examples to see that we can always pull a scaling from the right to the left of a shear, possibly changing the shear in the process.
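As a quick sanity check, here is a minimal Python sketch of this rewriting rule using exact rational arithmetic. The helper names (`shear`, `scaling`, `pull_scaling_left`) and the zero-based row indices are my own choices for illustration. The rule it encodes is that a shear with amount `a` in entry `(i, j)` becomes `a / x` when the scaling acts on row `i`, becomes `a * x` when it acts on row `j`, and is unchanged otherwise.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

def shear(n, i, j, a):
    """Identity matrix plus `a` in the off-diagonal entry (i, j)."""
    m = identity(n)
    m[i][j] = Fraction(a)
    return m

def scaling(n, k, x):
    """Identity matrix with the diagonal entry in row k replaced by x."""
    m = identity(n)
    m[k][k] = Fraction(x)
    return m

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]

def pull_scaling_left(n, i, j, a, k, x):
    """Rewrite shear(i, j, a) * scaling(k, x) as scaling(k, x) * shear(i, j, new_a)."""
    a, x = Fraction(a), Fraction(x)
    if k == i:
        new_a = a / x   # scaling acts on the row the shear adds into
    elif k == j:
        new_a = a * x   # scaling acts on the row the shear adds from
    else:
        new_a = a       # scaling is uninvolved: the two matrices commute
    return scaling(n, k, x), shear(n, i, j, new_a)

# Check all three situations for an upper shear in entry (0, 1) of a 3x3 matrix.
for k in range(3):
    before = matmul(shear(3, 0, 1, 5), scaling(3, k, 2))
    D, H = pull_scaling_left(3, 0, 1, 5, k, 2)
    assert before == matmul(D, H)
```

The same function handles lower shears, since nothing in it assumes `i < j`.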
What does this mean? When we take a matrix and write it out in terms of elementary matrices, we can always modify this expression so that all the scalings are to the left of all the shears. Then we have a diagonal matrix to the left of a long product of shears, since the product of a bunch of scalings is a diagonal matrix. But now the determinant of each shear is $1$, and the determinant of the diagonal matrix must be the product of the diagonal entries, which are the scaling factors. And so the product of the scaling factors is the determinant of our original matrix.
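In symbols, using names of my own for illustration: write $A$ for the original $n\times n$ matrix, $D$ for the diagonal matrix with entries $d_1,\dots,d_n$, and $S_1,\dots,S_k$ for the shears. Then the rearranged expression and its determinant read

$$A = D\,S_1 S_2\cdots S_k,\qquad \det(A)=\det(D)\prod_{j=1}^{k}\det(S_j)=\det(D)=d_1 d_2\cdots d_n.$$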
We’re specifically concerned with matrices of determinant $1$, meaning the product of all the diagonal entries must come out to be $1$. I’m going to use this fact to write the diagonal matrix as a product of scalings in a very particular way. Let’s say the diagonal entry in row $i$ is $\lambda_i$. Then I’m going to start by writing down
$$\begin{pmatrix}\lambda_1&&&\\&1&&\\&&\ddots&\\&&&1\end{pmatrix}\begin{pmatrix}1&&&\\&\lambda_1^{-1}&&\\&&\ddots&\\&&&1\end{pmatrix}$$
I’ve scaled the first row by the right amount, and then scaled the second row by the inverse amount so the product of the two scaling factors is $1$. Then I write down
$$\begin{pmatrix}1&&&&\\&\lambda_1\lambda_2&&&\\&&1&&\\&&&\ddots&\\&&&&1\end{pmatrix}\begin{pmatrix}1&&&&\\&1&&&\\&&(\lambda_1\lambda_2)^{-1}&&\\&&&\ddots&\\&&&&1\end{pmatrix}$$
The product of the two scalings of the second row ends up scaling it by $\lambda_1^{-1}\cdot\lambda_1\lambda_2=\lambda_2$, and we scale the third row to compensate. We continue this way, scaling each row to the right amount, and the next one by the inverse factor. Once we scale the next-to-last row we’re done, since the scaling factor for the last row must be exactly what we need to make the total product of all the scaling factors come out to $1$. That is, as long as the total scaling factor is $1$, we can write the diagonal matrix as the product of these pairs of scalings with inverse scaling factors.
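The bookkeeping here can be sketched in a few lines of Python. The function names `scaling_pairs` and `rebuild_diagonal` are hypothetical, and rows are indexed from zero. Each pair `(i, p)` stands for "scale row `i` by `p` and row `i + 1` by `1 / p`", and the factor `p` is simply the running product of the diagonal entries seen so far.

```python
from fractions import Fraction

def scaling_pairs(diagonal):
    """Split diagonal entries with product 1 into pairs of inverse scalings."""
    entries = [Fraction(d) for d in diagonal]
    total = Fraction(1)
    for d in entries:
        total *= d
    if total != 1:
        raise ValueError("diagonal entries must multiply to 1")
    pairs = []
    running = Fraction(1)
    for i in range(len(entries) - 1):
        running *= entries[i]        # product of the first i + 1 entries
        pairs.append((i, running))   # scale row i by running, row i + 1 by 1 / running
    return pairs

def rebuild_diagonal(n, pairs):
    """Multiply the pairs of scalings back together into a diagonal."""
    diagonal = [Fraction(1)] * n
    for i, p in pairs:
        diagonal[i] *= p
        diagonal[i + 1] /= p
    return diagonal

pairs = scaling_pairs([2, 3, Fraction(1, 6)])
assert rebuild_diagonal(3, pairs) == [2, 3, Fraction(1, 6)]
```

No pair is needed for the last row: its factor is forced by the requirement that the total product be 1.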
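Now let’s take four shears, alternating upper and lower, since two upper shears in a row are the same as a single upper shear, and similarly for lower shears. We want it to come out to one of these pairs of scalings.
$$\begin{pmatrix}1&a\\0&1\end{pmatrix}\begin{pmatrix}1&0\\b&1\end{pmatrix}\begin{pmatrix}1&c\\0&1\end{pmatrix}\begin{pmatrix}1&0\\d&1\end{pmatrix}=\begin{pmatrix}x&0\\0&\frac{1}{x}\end{pmatrix}$$
Multiplying out the left side and comparing entries gives us four equations to solve
$$\begin{aligned}1+ab+ad+cd+abcd&=x\\a+c+abc&=0\\b+d+bcd&=0\\1+bc&=\frac{1}{x}\end{aligned}$$
Substituting the second equation into the first, and the fourth into the third, these quickly simplify to
$$\begin{aligned}1+ab&=x\\a+cx&=0\\b+\frac{d}{x}&=0\\1+bc&=\frac{1}{x}\end{aligned}$$
Which can be solved, for any nonzero $a$, to find
$$b=\frac{x-1}{a}\qquad c=-\frac{a}{x}\qquad d=\frac{x-x^2}{a}$$
So we could pick $a=1$ and for any scaling factor $x$ write
$$\begin{pmatrix}1&1\\0&1\end{pmatrix}\begin{pmatrix}1&0\\x-1&1\end{pmatrix}\begin{pmatrix}1&-\frac{1}{x}\\0&1\end{pmatrix}\begin{pmatrix}1&0\\x-x^2&1\end{pmatrix}=\begin{pmatrix}x&0\\0&\frac{1}{x}\end{pmatrix}$$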
And so we can write such a pair of scalings with inverse scaling factors as a product of four shears. Since in the case at hand we can write the diagonal part of our elementary matrix decomposition with such pairs of scalings, we can translate them all into shears. And at the end of the day, we can write any special linear transformation as a product of a bunch of shears.
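To check the four-shear factorization numerically, here is a small Python sketch with exact fractions. The function name `four_shears` is my own, and I am assuming the shears are ordered upper, lower, upper, lower with amounts `a`, `b`, `c`, `d`. With `x` the scaling factor and `a` any nonzero choice, the other three amounts are `b = (x - 1) / a`, `c = -a / x` and `d = -b * x`.

```python
from fractions import Fraction

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[r][t] * B[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]

def four_shears(x, a=1):
    """Shears (upper, lower, upper, lower) whose product is diag(x, 1/x)."""
    x, a = Fraction(x), Fraction(a)
    if x == 0 or a == 0:
        raise ValueError("x and a must be nonzero")
    b = (x - 1) / a
    c = -a / x
    d = -b * x
    one, zero = Fraction(1), Fraction(0)
    return [
        [[one, a], [zero, one]],   # upper shear by a
        [[one, zero], [b, one]],   # lower shear by b
        [[one, c], [zero, one]],   # upper shear by c
        [[one, zero], [d, one]],   # lower shear by d
    ]

def product(matrices):
    result = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for m in matrices:
        result = matmul2(result, m)
    return result

assert product(four_shears(3)) == [[3, 0], [0, Fraction(1, 3)]]
```

Any nonzero `a` works equally well; `a = 1` just keeps the entries simple.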