Math

The Rayleigh Quotient and the Norm Constraint

This post will try to explain why in the optimization of the Rayleigh Quotient one constrains the norm of $$x$$ to be $$\|x\| = 1$$

Let’s start with the definition, given a symmetric matrix, $$A$$, the Rayleigh quotient is defined as a function $$R(x)$$ from $$\mathbb{R}^{d}_{0}$$ to $$\mathbb{R}$$,

\begin{equation}
R(x) = \frac{x^T A x}{x^Tx}
\end{equation}

The maximization gives the following equivalent expression

\begin{equation}
\arg \max_{x} \frac{x^T A x}{x^Tx} = \arg \max_{x} \; x^T A x \; \textrm{sb.to.} \; \|x\|=1
\end{equation}

To untangle the norm constraint we can begin by remembering that a symmetric matrix is diagonalizable so we can write,

\begin{equation}
A=U \Lambda U^* = \sum_{i=1}^{d} \lambda_i U_i U_i^T,
\end{equation}

where $$\lambda_i$$ are the eigenvalues of $$A$$ and $$U_i$$ the orthonormal vectors from the diagonalization. We can express $$x$$ as a sum of the eigenvectors of $$A$$, that is, $$x = \sum_{i=1}^{d} \alpha_i v_i$$. This allows us to rewrite the quotient,

\begin{equation}
\begin{split}
x^Tx
& = \Big( \sum_{i=1}^{d} \alpha_i v_i \Big)^{T} \Big( \sum_{i=1}^{d} \alpha_i v_i \Big) \\
& = \{ \textrm{orthogonality + unit length} \} = \sum_{i=1}^{d} \alpha_i^2
\end{split}
\end{equation}

\begin{equation}
\begin{split}
x^T A x
& = \Big( \sum_{i=1}^{d} \alpha_i v_i \Big)^{T} \Big( \sum_{i=1}^{d} \alpha_i A v_i \Big)
= \Big( \sum_{i=1}^{d} \alpha_i v_i \Big)^{T} \Big( \sum_{i=1}^{d} \alpha_i \lambda_i v_i \Big) \\
& = \{ \textrm{orthogonality + unit length} \} = \sum_{i=1}^{d} \alpha_i^2 \lambda_i
\end{split}
\end{equation}

If we want to maximize the quotient we can now write the Rayleigh quotient as

\begin{equation}
\arg \max_{x} R(x) = \frac{x^T A x}{x^Tx} = \frac{ \sum_{i=1}^{d} \alpha_i^2 \lambda_i}{\sum_{i=1}^{d} \alpha_i^2}.
\end{equation}

We can see from the expression that the length of $$x$$ does not matter. The maximization is in the direction of $$x$$ and not in the length of $$x$$, that is, if some $$x$$ maximizes the quotient then so does any multiple of it, $$k \cdot x, \; k \neq 0$$. This means that we can constrain $$x$$ such that, $$\sum_{i=1}^{d} \alpha_i^2=1$$, which means that we can reduce the problem to maximizing $$x^T A x$$.

The same thing can be shown for the generalized Rayleigh quotient,

\begin{equation}
\arg \max_{x} R(x) = \frac{x^T A x}{x^T B x}
\end{equation}

by proceeding in the same manner.

Leave a Reply

Your email address will not be published. Required fields are marked *