There is a variety of problems in which one searches for a vector $\mathbf{v}$ such that the sum of the squared lengths of the projections of points $\mathbf{x}_i$ onto that vector is maximized or minimized. Collecting the points as the rows of a matrix $\mathbf{X}$ (so that row $i$ is $\mathbf{x}_i^\intercal$), the problem reads:

\begin{aligned} \min_{\mathbf{v}} E(\mathbf{v}) &= \sum_{i=1}^{n} \langle \mathbf{x}_i, \mathbf{v} \rangle ^2 = \sum_{i=1}^{n} \mathbf{x}_i^\intercal\mathbf{v} \cdot \mathbf{x}_i^\intercal\mathbf{v} \\ &= \sum_{i=1}^{n} \mathbf{v}^\intercal\mathbf{x}_i \cdot \mathbf{x}_i^\intercal\mathbf{v} = \mathbf{v}^\intercal\mathbf{X}^\intercal\mathbf{X}\mathbf{v} \\ \\ \text{s.t.}\quad\mathbf{v}^\intercal\mathbf{v} &= 1 \end{aligned}

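To make the identity above concrete, here is a small numerical sketch (using NumPy, with hypothetical random data) checking that the sum of squared projections equals the quadratic form $\mathbf{v}^\intercal\mathbf{X}^\intercal\mathbf{X}\mathbf{v}$:

```python
import numpy as np

# Hypothetical data: n points stacked as the rows of X, plus a unit vector v.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))   # row i is the point x_i
v = np.array([1.0, 0.0, 0.0])       # a unit vector, v^T v = 1

# E(v) as a sum of squared projections <x_i, v>^2 ...
E_sum = np.sum((X @ v) ** 2)

# ... equals the quadratic form v^T X^T X v.
E_quad = v @ X.T @ X @ v

assert np.isclose(E_sum, E_quad)
```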
The constrained problem can be written as a Lagrange function by introducing a Lagrange multiplier $\lambda$ for the constraint:

\begin{aligned} \mathcal{L}(\mathbf{v}, \lambda) = \mathbf{v}^\intercal\mathbf{X}^\intercal\mathbf{X}\mathbf{v} + \lambda (1 - \mathbf{v}^\intercal\mathbf{v}) \end{aligned}

Optimizing the original problem is equivalent to finding a stationary point of the Lagrangian, i.e. solving the following system of equations:

$\nabla\!_{\mathbf{v},\lambda} \mathcal{L}(\mathbf{v}, \lambda) = 0 \Leftrightarrow \begin{cases} \mathbf{X}^\intercal\mathbf{X}\mathbf{v} = \lambda\mathbf{v} \\ \mathbf{v}^\intercal\mathbf{v} = 1 \end{cases}$

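Both conditions of this system can be checked numerically. Since $\mathbf{X}^\intercal\mathbf{X}$ is symmetric, `np.linalg.eigh` applies, and it returns unit-length eigenvectors, so each one satisfies the system. A sketch with hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))   # row i is the point x_i

# Eigendecomposition of the symmetric matrix X^T X (eigh returns
# eigenvalues in ascending order and unit-length eigenvectors as columns).
eigvals, eigvecs = np.linalg.eigh(X.T @ X)

# Each column v satisfies X^T X v = lambda v and v^T v = 1.
v, lam = eigvecs[:, 0], eigvals[0]
assert np.allclose(X.T @ X @ v, lam * v)
assert np.isclose(v @ v, 1.0)
```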
The first equation tells us that $\mathbf{v}$ must be an eigenvector of $\mathbf{X}^\intercal\mathbf{X}$ with eigenvalue $\lambda$. Let's call them $\mathbf{v}_e$ and $\lambda_e$. Plugging this solution back into the original objective (and remembering that an eigenvector is not necessarily unit length, hence the normalization) we obtain:

$E\left(\frac{\mathbf{v}_e}{|\mathbf{v}_e|}\right) = \frac{\mathbf{v}_e^\intercal\mathbf{X}^\intercal\mathbf{X}\mathbf{v}_e}{\mathbf{v}_e^\intercal\mathbf{v}_e} = \frac{\mathbf{v}_e^\intercal\lambda_e\mathbf{v}_e}{\mathbf{v}_e^\intercal\mathbf{v}_e} = \lambda_e$

Depending on whether we want a minimum or a maximum of $E$, we choose the eigenvector with the smallest or the largest eigenvalue, respectively.
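Putting it together: because `np.linalg.eigh` returns eigenvalues in ascending order, the first and last eigenvectors are the minimizer and maximizer of $E$, and the objective values they attain are exactly the corresponding eigenvalues. A sketch, again with hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))   # row i is the point x_i

eigvals, eigvecs = np.linalg.eigh(X.T @ X)  # eigenvalues ascending

v_min = eigvecs[:, 0]    # minimizes E(v): smallest eigenvalue
v_max = eigvecs[:, -1]   # maximizes E(v): largest eigenvalue

def E(v):
    # Sum of squared projection lengths of the points onto v.
    return np.sum((X @ v) ** 2)

assert np.isclose(E(v_min), eigvals[0])
assert np.isclose(E(v_max), eigvals[-1])
```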