# The Unapologetic Mathematician

## The Inclusion-Exclusion Principle

In combinatorics we have a method of determining the cardinality of unions in terms of the sets in question and their intersection: the Inclusion-Exclusion Principle. Predictably enough, this formula is reflected in the subspaces of a vector space. We could argue directly in terms of bases, but it’s much more interesting to use the linear algebra we have at hand.

Let’s start with just two subspaces $V$ and $W$ of some larger vector space. We’ll never really need that space, so we don’t need to give it a name. The thing to remember is that $V$ and $W$ might have a nontrivial intersection — their sum may not be direct.

Now we consider the sum of the subspaces. Every vector in $V+W$ is the sum of a vector in $V$ and a vector in $W$. This tells us that there’s a linear map $f_1:V\oplus W\rightarrow V+W$ defined by $f_1(v,w)=v+w$. Note carefully that this is linear as a function of the pair $(v,w)$. It’s not bilinear — linear in each of $v$ and $w$ separately — which would mean bringing in the tensor product.

This function is always surjective, but it may fail to be injective. Specifically, if $V$ and $W$ have a nontrivial intersection then it gives us a nontrivial kernel. If $x\in V\cap W$ then $f_1(x,-x)=0$. But rather than talk about elements, let’s use the intersection to parametrize the kernel! We’ve got another linear map $f_2:V\cap W\rightarrow V\oplus W$, defined by $f_2(x)=(x,-x)$.

Now it’s clear that $f_2$ has a trivial kernel. On the other hand, its image is precisely the set of pairs that $f_1$ kills off. That means we’ve got an exact sequence:

$\mathbf{0}\rightarrow V\cap W\xrightarrow{f_2}V\oplus W\xrightarrow{f_1}V+W\rightarrow\mathbf{0}$

Taking the Euler characteristic we find that

$\dim\left(V\cap W\right)-\dim\left(V\oplus W\right)+\dim\left(V+W\right)=0$

Some simple juggling turns this into

$\dim\left(V+W\right)=\dim\left(V\right)+\dim\left(W\right)-\dim\left(V\cap W\right)$

which is the inclusion-exclusion principle for two subspaces.

What about three subspaces $U$, $V$, and $W$? We start off again with

$U\oplus V\oplus W\xrightarrow{f_1}U+V+W\rightarrow\mathbf{0}$

defined by $f_1(u,v,w)=u+v+w$. Then the kernel contains triples like $(0,x,-x)$, $(-y,0,y)$, and $(z,-z,0)$. We use the pairwise intersections to cover the kernel by the map

$\left(V\cap W\right)\oplus\left(W\cap U\right)\oplus\left(U\cap V\right)\xrightarrow{f_2}U\oplus V\oplus W$

defined by $(x,y,z)=(z-y,x-z,y-x)$. The image of $f_2$ is exactly the kernel of $f_1$, but this time it has a nontrivial kernel itself! Now we need another map

$\mathbf{0}\rightarrow U\cap V\cap W\xrightarrow{f_3}\left(V\cap W\right)\oplus\left(W\cap U\right)\oplus\left(U\cap V\right)$

defined by $f_3(t)=(t,t,t)$. Now $f_3$ is injective, and its image is exactly the kernel of $f_2$. We stick these maps all together to have the long exact sequence

\begin{aligned}\mathbf{0}\rightarrow U\cap V\cap W\xrightarrow{f_3}\left(V\cap W\right)\oplus\left(W\cap U\right)\oplus\left(U\cap V\right)\\\xrightarrow{f_2}U\oplus V\oplus W\xrightarrow{f_1}U+V+W\rightarrow\mathbf{0}\end{aligned}

Taking the Euler characteristic and juggling again we find

\begin{aligned}\dim\left(U+V+W\right)=\dim\left(U\right)+\dim\left(V\right)+\dim\left(W\right)\\-\dim\left(U\cap V\right)-\dim\left(U\cap W\right)-\dim\left(V\cap W\right)\\+\dim\left(U\cap V\cap W\right)\end{aligned}

We include the dimensions of the individual subspaces, exclude the dimensions of the pairwise intersections, and include back in the dimension of the triple intersection.

For larger numbers of subspaces we can construct longer and longer exact sequences. The $k$th term consists of the direct sum of the intersections of the subspaces, taken $k$ at a time. The $k$th map consists of including each $k$-fold intersection into each of the $(k-1)$-fold intersections it lies in, with positive and negative signs judiciously sprinkled throughout to make the sequence exact.

The result is that the dimension of a sum of subspaces is the alternating sum of the dimensions of the $k$-fold intersections. First add the dimension of each subspace, then subtract the dimension of each pairwise intersection, then add back in the dimension of each triple intersection, and so on.

From here it’s the work of a moment to derive the combinatorial inclusion-exclusion principle. Given a collection of sets, just construct the free vector space on each of them. These free vector spaces will have nontrivial intersections corresponding to the nonempty intersections of the sets, and we’ve got canonical basis elements floating around to work with. Then all these dimensions we’re talking about are cardinalities of various subsets of the sets we started with, and the combinatorial inclusion-exclusion principle follows. Of course, as I said before we could have proved it from the other side, but that would require a lot of messy hands-on work with bases and such.

The upshot is that the combinatorial inclusion-exclusion principle is equivalent to the statement that exact sequences of vector spaces have trivial Euler characteristic! This little formula that we teach in primary or secondary school turns out to be intimately connected with one of the fundamental tools of homological algebra and the abstract approach to linear algebra. Neat!

July 24, 2008 - Posted by | Algebra, Linear Algebra

1. Beyond neat! Satisfies the true purpose of Mathematics: enlightenment.

Comment by Jonathan Vos Post | July 24, 2008 | Reply

2. What is also good about this observation is that this form of inclusion-exclusion holds in lattices more general than distributive lattices (which were the types of lattice emphasized in the post you linked to) — subspace lattices satisfy the weaker property of modularity.

Comment by Todd Trimble | July 24, 2008 | Reply

3. Indeed they do. That’s why I said as much last time when I talked about subspace sums. ;)

Comment by John Armstrong | July 24, 2008 | Reply

4. Sorry, I don’t know sequences, but it seems either you have a major error or I completely don’t understand your post.

dim(U+V+W)=dim(U)+dim(V)+dim(W)-dim(UV)-dim(UW)-dim(WV)+dim(UVW)

Consider R^2 as a vector space over R and let:
U={(x,y):x=0}
V={(x,y):y=0}
W={(x,y):x=y}

These are clearly subspaces. U+V=R^2 and their intersection is {0}. Same for U,W and V,W.

So the left side is dim(U+V+W)=dim(R^2)=2 and the right side is dim(U)+dim(V)+dim(W)=3 because the intersections vanish.

The problem is that in general (U+V) intersection W is not equal to (U intersection W) + (V intersection W). Take these 3 mentioned spaces as a counterexample. Once my class received a task to determine whether PIE is valid for n vector spaces. Of course everybody gave a “proof” of it.:)

Best, –k.g.

(Feel free to add TeX or correct my English; don’t know how to use it)

Comment by k.g. | July 26, 2008 | Reply

5. You’re right.. I’d implicitly assumed that the subspaces were in what we topologists like to call “general position”. I can see two remedies.

First, turn your reason why my proof failed into an algebraic statement about subspaces and their maps, then establish a new sequence (or show how to modify existing ones) to handle this degeneracy.

Alternately, scrap the given sequences altogether and give new ones which handle the degenerate case as well as the generic.

I’m not in a position to type either solution out myself on my iPhone from a restaurant in Meriden, MS. But good catch there.

Comment by John Armstrong | July 27, 2008 | Reply

• Dear Professor John:
I am very eager to know what you mean by “general position”,
Can you email me the answer through Email yangming198411@163.com ?

Comment by mingming | March 10, 2010 | Reply

• Essentially, if you wiggle the subspaces a bit, the dimensions of their intersections don’t change.

Comment by John Armstrong | March 10, 2010 | Reply

6. [...] inclusion-exclusion principle tells us [...]

Pingback by The Multiplicity of an Eigenvalue « The Unapologetic Mathematician | February 19, 2009 | Reply

7. [...] inclusion-exclusion principle tells us [...]

Pingback by The Multiplicity of an Eigenpair « The Unapologetic Mathematician | April 8, 2009 | Reply

8. You learned the inclusion-exclusion principle in primary school? I don’t think anybody I know who didn’t do math contests in high school has the faintest idea what it is.

Anyway, I’ve always found the use of the free vector space construction in combinatorics extremely mysterious. For example, in poset theory one can use certain linear operators to show that certain graded posets are rank-unimodal and Sperner, and some of the consequences of this method don’t have known “direct” proofs. So does the usefulness of linear algebra in combinatorics ultimately stem from categorical and “topological” properties of the category of vector spaces? What makes this category unique for combinatorial purposes?

Comment by Qiaochu Yuan | June 7, 2009 | Reply

9. Sure. It’s part of the remnants of the “New Math” hanging around the curriculum.

Comment by John Armstrong | June 7, 2009 | Reply

10. [...] have a nonempty intersection, which leads us to think that maybe this has something to do with the inclusion-exclusion principle. We’re “overcounting” the intersection by just adding, so let’s subtract it [...]

Pingback by Characteristic Functions as Idempotents « The Unapologetic Mathematician | December 23, 2009 | Reply

11. [...] little more about characteristic functions, let’s see how they can be used to understand the inclusion-exclusion principle. Our first pass was through a very categorified lens, which was a neat tie-in to Euler [...]

Pingback by Inclusion-Exclusion Again « The Unapologetic Mathematician | December 28, 2009 | Reply