$\def\cP{\mathcal P}$ $\def\de{\text{d}}$ $\def\N{\mathbb N}$ $\def\P{\mathbb P}$ $\def\bi{\boldsymbol i}$ $\def\bG{\boldsymbol G}$ $\def\bsigma{\boldsymbol \sigma}$ $\def\bv{\boldsymbol v}$ $\def\bu{\boldsymbol u}$ $\def\bt{\boldsymbol t}$ $\def\bw{\boldsymbol w}$ $\def\sT{\mathsf T}$ $\def\bW{\boldsymbol W}$ $\def\bA{\boldsymbol A}$ $\def\R{\mathbb R}$ $\def\S{\mathbb S}$ $\def\GOE{\text{GOE}}$ $\def\|{\Vert}$ $\def\bx{\boldsymbol x}$ $\def\cN{\mathcal N}$ $\def\E{\mathbb E}$ $\def\de{\text{d}}$ $\def\vphi{\varphi}$ $\def\bQ{\boldsymbol Q}$ $\def\diag{\text{diag}}$ $\def\bzero{\boldsymbol 0}$ $\def\id{\mathbf I}$ $\def\ones{\mathbf 1}$ $\def\ext{\text{ext}}$ $\def\|{\Vert}$ $\def\bLambda{\boldsymbol \Lambda}$ $\def\const{\text{const}}$ $\def\Unif{\text{Unif}}$ $\def\bSigma{\boldsymbol \Sigma}$ $\def\C{\mathbb C}$ $\def\tr{\text{tr}}$

## 1. Introduction

In the previous post, we saw how to use the replica method to calculate the spectral norm of the spiked GOE matrix. Besides the spectral norm, another interesting quantity of random matrices is the spectral density: the limiting empirical distribution of eigenvalues. In this post, we show how to calculate the spectral density of random matrices using the replica method. Let us take the GOE matrix as an example.

The prerequisite knowledge for this post is Stieltjes transforms. For a basic introduction to Stieltjes transforms and its applications to the GOE matrix, I recommend reading Terrence Tao’s blog. The shape of the spectral density of the GOE matrix is a semicircle, and hence this famous result is called the semicircle law.

## 2. Spectral density and Stieltjes transforms

Let $\bA_n \in \R^{n \times n}$ be a symmetric matrix. Let $z \in \C_+ = \{ \xi \in \C: \Im \xi > 0 \}$ ($\Im \xi$ is the imaginary part of $\xi$). We denote the resolvent of $\bA_n$ by

$R_{\bA_n}(z) = (\bA_n - z \id_n)^{-1}.$

Denoting the $i$‘th largest eigenvalue of $\bA_n$ by $\lambda_i(\bA_n)$, and the empirical eigenvalue distribution (spectral density) of $\bA_n$ by $\mu_{\bA_n} = \frac{1}{n} \sum_{i=1}^n \delta_{\lambda_i(\bA_n)}.$

For a probability measure $\mu \in \cP(\R)$ on the real line, we denote its Stieltjes transform by $s_{\mu}(z) = \int \frac{1}{\lambda - z} \mu(\de \lambda).$ The next two lemmas characterize the properties of Stieltjes transforms.

[Inverse Stieltjes transforms] The scaled imaginary part of the Stieltjes transform $(1/ \pi) \Im s_{\mu}(\cdot + \bi b)$ for $b>0$ is a probability measure on the real line $\R$. The sequence of probability measures $(1/ \pi) \Im s_{\mu}(\cdot + \bi b)$ indexed by $b$ converges weakly to $\mu$ as $b \to 0+$.
[Stieltjes continuity] Let $\mu_n$ be a sequence of random probability measures on the real line, and let $\mu$ be a deterministic probability measure. Then $\mu_n$ converges in expectation to $\mu$ in the vague topology if and only if $\E s_{\mu_n}(z)$ converges to $s_\mu(z)$ for every $z$ in the upper half-plane.

The next theorem gives the limiting Stieltjes transform of the GOE matrix. Combining with the two lemmas above, we recover the semicircle law of the GOE matrix.

[Stieltjes transform of the GOE matrix] Let $\bW_n \sim \GOE(n)$. That is, $\bW_n \in \R^{n \times n}$ is a symmetric matrix with $(W_{n, ii})_{1 \le i \le n} \sim_{i.i.d.} \cN(0, 2/n)$ and $(W_{n, ij})_{1 \le i < j \le n} \sim_{i.i.d.} \cN(0, 1/n)$. Denote by $s_{\bW_n} = s_{\mu_{\bW_n}}$ the Stieltjes transform of empirical eigenvalue distribution of $\bW_n$. Then for $z \in \C_+$, we have $\lim_{n \to \infty}\E[s_{\bW_n}(z)] = \frac{-z + \sqrt{z^2 - 4}}{2}.$

Combining the above theorem with the Stieltjes continuity lemma, for $% $, we have $\lim_{n \to \infty} \E[\# \{ \lambda_i(\bW_n): \lambda_i(\bW_n) \in [a, b] \} / n] = \sigma_{\rm sc}([a, b]),$ where $\sigma_{\rm sc}(\de x) = \frac{1}{2\pi} \sqrt{4 - x^2} \cdot \ones\{ x \in [-2, 2] \} \de x.$

In the following sections, I will formally calculate $\lim_{n \to \infty}\E[s_{\bW_n}(z)]$ using the replica method.

## 3. The determinant trick

### 3.1. The connection of determinant and Stieltjes transform

Let $\log$ be the complex log function defined on $\C$, with branch cut along the negative real axis. Define function $D_{\bW_n}$ as $D_{\bW_n}(\xi) = \frac{1}{n} \sum_{i=1}^n \log(\lambda_i(\bW_n) - \xi).$ Then the derivative of $D_{\bW_n}$ gives the negative Stieltjes transform $\frac{\de}{\de \xi} D_{\bW_n}(\xi) = - \frac{1}{n} \sum_{i=1}^n \frac{1}{\lambda_i(\bW_n) - \xi} = - s_{\bW_n}(\xi).$ The function $D_{\bW_n}$ is almost the normalized log-determinant of $\bW_n$, up to a phase shift $D_{\bW_n}(\xi) = \frac{1}{n}\log \det(\bW_n - \xi \id_n) + \frac{2 \pi \bi k(\bW_n, \xi)}{n},$ where $k(\bW_n, \xi)$ is an integer. Moreover, $k(\bW_n, \xi)$ will remain the same under an infinitesimal change of $\xi$. Hence we have \begin{align} &~\lim_{n \to \infty} \E[s_{\bW_n}(\xi)] = - \lim_{n \to \infty} \frac{\de}{\de \xi} \E[D_{\bW_n}(\xi)] = - \lim_{n \to \infty} \frac{\de}{\de \xi} \frac{1}{n} \E[\log \det(\bW_n - \xi \id_n)]\nonumber \\ \stackrel{\cdot}{=}&~ - \frac{\de}{\de \xi} \lim_{n \to \infty} \frac{1}{n} \E[\log \det(\bW_n - \xi \id_n)]\tag{1}\label{eq:1} \end{align} In the last step, we heuristically exchanged the limit operator and the differential operator.

Determinants are easier to work with than Stieltjes transforms. The following identity expresses the power of determinant in terms of integration of exponentials $\det(\bA)^{-k/2} = \int_{\R^n} \frac{1}{(2 \pi)^{nk/2}} \exp\Big\{ - \frac{1}{2} \sum_{j = 1}^k \langle \bx_j, \bA \bx_j \rangle \Big\} \prod_{j \in [k]} \de \bx_j.$ This identity holds whenever $\bA \in \R^{n \times n}$ is positive semi-definite. Though, we will formally use this identity for a complex matrix $\bA$.

### 3.2. The replica approach

In order to calculate $\E[s_{\bW_n}(\xi)]$, by Eq. \eqref{eq:1}, we need to calculate $\lim_{n\to \infty}(1/n)\E[\log \det(\bW_n - \xi \id_n)]$, and then differentiate with respect to $\xi$. However, there is no straightforward way to calculate the expectation of $\log$. One possible way is to use the replica formula introduced in my last post $\E[\log Z] = \lim_{k \to 0} \frac{1}{k} \log \E[Z^k].$ The replica formula reduces the problem to calculating the moments $\E[\det(\bW_n - \xi \id_n)^k]$.

### 3.3. An easier approach

Instead of calculating $\E[\det(\bW_n - \xi \id_n)^k]$ for a sequence of $k$, we just calculate $\E[\det(\bW_n - \xi \id_n)^{-1/2}]$. Note $\frac{1}{n} \log \det(\bW_n - \xi \id_n ) = -\frac{2}{n} \log [\det(\bW_n - \xi \id_n)^{-1/2}].$ We expect that $n^{-1} \log \det(\bW_n - \xi \id_n )$ concentrates tightly around its mean, so that \begin{align} &~\lim_{n \to \infty} \frac{1}{n} \E[ \log \det(\bW_n - \xi \id_n)] \stackrel{\cdot}{=} \lim_{n \to \infty} \frac{1}{n}\log \det(\bW_n - \xi \id_n)] \nonumber\\ \stackrel{\cdot}{=}&~ \lim_{n \to \infty} - \frac{2}{n}\log \det(\bW_n - \xi \id_n)^{-1/2}] \stackrel{\cdot}{=} \lim_{n \to \infty} -\frac{2}{n} \log \E[\det(\bW_n - \xi \id_n)^{-1/2}]. \tag{2}\label{eq:2} \end{align} This is completely heuristic, but it finally gives the correct answer.

## 4. The replica calculations

In this section, we use the replica method to calculate the last expression of Eq. \eqref{eq:2}, and then derive the expression of the limiting Stieltjes transform. We will mostly use formal calculations that are not rigorous.

### 4.1. Step 1: Get rid of the expectation operator

The first step of the calculation is to get rid of the expectation operator. Suppose we need to calculate $\E[f(\bw)]$, where $\bw \sim \cN(0, \id_d)$. First we need to find a way to express $f$ as the following integral form $f(\bw) = \int_{\R^n} \exp\{ \langle \bw, \bt(\bx)\rangle \} h(\bx) \de \bx,$ such that using the formula of the moment generating function of Gaussian random variables $\E_{G \sim \cN(0, 1)}[e^{tG}] = e^{t^2/2}$, we get $\E[f(\bw)] = \int_{\R^n} \exp\{ \Vert \bt(\bx) \Vert_2^2/2 \} h(\bx) \de \bx.$

For the current example, note a formal identity gives (it is formal because $\bW_n - \xi \id_n$ is a complex matrix)

$\det(\bW_n - \xi \id_n)^{-1/2} = \int_{\R^{n}} \frac{1}{(2 \pi)^{n/2}} \exp\Big\{ - \bx^\sT (\bW_n - \xi \id_n) \bx / 2 \Big\} \de \bx.$

Let $\bG = (G_{ij})_{ij \in [n]} \in \R^{n \times n}$ with $G_{ij} \sim_{iid} \cN(0, 1)$. Then the distributions of $\bW_n$ and $(\bG + \bG^\sT)/\sqrt{2n}$ are identital. As a consequence, we get \begin{aligned} \E[\det(\bW_n - \xi \id_n)^{-1/2}] =&~ \E\Big[ \int_{\R^{n}} \frac{1}{(2 \pi)^{n/2}} \exp\Big\{ - \Big\langle (\bG + \bG^\sT) / \sqrt {2n} - \xi \id_n, \bx \bx^\sT \Big\rangle / 2 \Big\} \de \bx \Big]\\ =&~ \E\Big[ \int_{\R^{n}} \frac{1}{(2 \pi)^{n/2}} \exp\Big\{ - \Big\langle \bG , \bx \bx^\sT \Big\rangle / \sqrt {2n} + \xi \Vert \bx \Vert_2^2 /2 \Big\} \de \bx \Big]\\ =&~ \int_{\R^{n}} \frac{1}{(2 \pi)^{n/2}} \exp\Big\{ \xi \Vert \bx \Vert_2^2 /2 \Big\} \E\Big[\exp\Big\{ - \Big\langle \bG , \bx \bx^\sT \Big\rangle / \sqrt {2n} \Big\}\Big] \de \bx. \end{aligned} Using the formula of the moment generating function of Gaussian random variables, we get \begin{align} \E[\det(\bW_n - \xi \id_n)^{-1/2}] =&~ \int_{\R^{n}} \frac{1}{(2 \pi)^{n/2}} \exp\Big\{ \xi \Vert \bx \Vert_2^2 /2 + \Vert \bx \bx^\sT \Vert_F^2 / (4n) \Big\} \de \bx \nonumber \\ =&~ \int_{\R^{n}} \frac{1}{(2 \pi)^{n/2}} \exp\Big\{ \xi \Vert \bx \Vert_2^2 /2 + \Vert \bx \Vert_2^4 / (4n) \Big\} \de \bx. \tag{3}\label{eq:3} \\ \end{align} Now, we have got rid of the expectation operator, and expressed the quantity of interest in terms of an integral.

### 4.2. Step 2: Calculate the integral

Eq. \eqref{eq:3} is an intractable high dimensional integration. In statistical physics, there is a systematic approach to heuristically deal with such an integration.

#### 4.2.1. A systematic approach to deal with the integration

Suppose $I_n$ is an integration of form $I_n = \int_{\R^n} \exp\{ n \cdot f(h_1(\bx), \ldots, h_k(\bx)) \} \de \bx,$ where $h_j$’s are in the form (with a slight abuse of notation) $h_j(x_1, \ldots, x_n) = \frac{1}{n}\sum_{i = 1}^n h_j(x_i).$ First by the identity that \begin{align} \int_\R \delta(n h_j(\bx) - n s_j) \de s_j = 1, \end{align} we have $I_n = \int_{\R^k} \prod_{j \in [k]} \de s_j \cdot \exp\{ n \cdot f(s_1, \ldots, s_k) \} \int_{\R^n} \prod_{j \in [k]} \delta( n h_j(\bx) - n s_j) \de \bx.$ Then, by the delta identity formula $\delta(x) = \int_{\bi \R} \exp\{ \lambda x \} \de [\lambda/ (2 \pi)],$ we get \begin{aligned} I_n =&~ \int_{\R^k} \prod_{j \in [k]} \de s_j \cdot \exp\{ n \cdot f(s_1, \ldots, s_k) \} \int_{(\bi \R)^k} \prod_{j \in [k]} \de [\lambda_j / (2\pi)] \int_{\R^n} \exp\Big\{ \sum_{j = 1}^k \sum_{i = 1}^n \Big[ \lambda_j h_j(x_i) - \lambda_j s_j \Big] \Big\} \de \bx \\ =&~ \int_{\R^k} \prod_{j \in [k]} \de s_j \int_{(\bi \R)^k} \prod_{j = 1}^k \de [\lambda_j/(2\pi)] \exp \Big\{ n \cdot \Big[ f(s_1, \ldots, s_k) - \sum_{j=1}^k \lambda_j s_j \Big] \Big\} \cdot \Big( \int \exp\Big\{ \sum_{j = 1}^n \lambda_j h_j(x) \Big\} \de x \Big)^n \\ =&~ \int_{\R^k} \prod_{j \in [k]} \de s_j \int_{(\bi \R)^k} \prod_{j = 1}^k \de [\lambda_j/(2\pi)] \exp \Big\{ n \cdot \Big[ f(s_1, \ldots, s_k) - \sum_{j=1}^k \lambda_j s_j + \log J(\lambda_1, \ldots, \lambda_k) \Big] \Big\},\\ \end{aligned} where $J(\lambda_1, \ldots, \lambda_k) = \int \exp\Big\{ \sum_{j = 1}^n \lambda_j h_j(x) \Big\} \de x.$ Using the method of steepest descent (my understanding is that, it is like the Laplace method but it deals with integration of complex functions), we have $\lim_{n \to \infty} n^{-1} \log I_n \stackrel{\cdot}{=} \ext_{s_j, \lambda_j} \Big[ f(s_1, \ldots, s_k) - \sum_{j=1}^k \lambda_j s_j + \log J(\lambda_1, \ldots, \lambda_k) \Big],$ where $\ext$ denotes the extremum operator.

#### 4.2.2. Application to this example

In this example, we need to deal with an integration of form Eq. \eqref{eq:3}. We take $f(s) = \xi s + s^2/4$ and $h(\bx) = \sum_{i = 1}^n x_i^2 / n$. Using the method introduced above, we have \begin{aligned} \E[\det(\bW_n - \xi \id_n)^{-1/2}] =&~ \int_{\R} \de s \int_{\R^{n}} \frac{1}{(2 \pi)^{n/2}} \exp\Big\{ \xi \Vert \bx \Vert_2^2 /2 + \Vert \bx \Vert_2^4 / (4n) \Big\} \cdot \delta(\Vert \bx \Vert_2^2 - n s) \de \bx \\ =&~ \int_{\R} \de s \int_{\R^{n}} \frac{1}{(2 \pi)^{n/2}} \exp \{ n \xi s /2 + n s^2 / 4 \} \cdot \delta(\Vert \bx \Vert_2^2 - n s) \de \bx \\ =&~ \int_{\R} \de s \int_{\bi R} [\de \lambda / (2 \pi)] \exp \Big\{ n \Big[\xi s /2 + s^2 / 4 - \lambda s \Big] \Big\} \cdot \Big( \int_R \frac{1}{(2 \pi)^{1/2}} \exp\{ \lambda x^2 \} \de x \Big)^n \\ =&~ \int_{\R} \de s \int_{\bi R} [\de \lambda / (2 \pi)] \exp \Big\{ n \Big[\xi s /2 + s^2 / 4 - \lambda s + J(\lambda) \Big] \Big\} \\ \end{aligned} where $J(\lambda) = \log \int_R \frac{1}{(2 \pi)^{1/2}} \exp\{ \lambda x^2\} \de x = - \frac{1}{2} \log ( - 2 \lambda).$ Therefore, we have $\lim_{n \to \infty} \frac{1}{n} \log \E[\det(\bW_n - \xi \id_n)^{-1/2}] = \ext_{s, \lambda} \Big[ \frac{ \xi s}{2} + \frac{s^2}{4} - \lambda s - \frac{1}{2} \log( - 2\lambda) \Big].$ Define $P(s, \lambda) = \frac{ \xi s}{2} + \frac{s^2}{4} - \lambda s - \frac{1}{2} \log( - 2 \lambda).$ Letting $\partial_\lambda P(s, \lambda) = 0$, we have $\lambda = - 1 / (2s)$, which gives $P(s) \equiv \ext_{\lambda} P(s, \lambda) = \frac{ \xi s}{2} + \frac{s^2}{4} + \frac{1}{2} \log(s) + \frac{1}{2}.$ Then we differentiate $P(s)$, its extremum $s_\star$ satisfies \begin{align}\tag{5}\label{eq:5} \xi + s_\star + \frac{1}{s_\star} = 0. \end{align} As a result, we get $\lim_{n \to \infty} \frac{1}{n} \log \E[\det(\bW_n - \xi \id_n)^{-1/2}] \stackrel{\cdot}{=} \frac{ \xi s_\star}{2} + \frac{s_\star^2}{4} + \frac{1}{2} \log(s_\star) + \frac{1}{2},$ and by Eq. \eqref{eq:2} we get $\lim_{n \to \infty} \frac{1}{n}\E[\log \det(\bW_n - \xi \id_n)] \stackrel{\cdot}{=} - \xi s_\star - \frac{s_\star^2}{2} - \log(s_\star) - 1.$ where $s_\star$ gives the solution of Eq. \eqref{eq:5}. Finally, by Eq. \eqref{eq:1}, we have \begin{aligned} \lim_{n \to \infty} \E[s_{\bW_n}(\xi)] \stackrel{\cdot}{=}&~ - \frac{\de}{\de \xi} \lim_{n \to \infty} \frac{1}{n} \E[ \log \det(\bW_n - \xi \id_n)] \\ =&~ - \frac{\de }{\de \xi} \Big[- \xi s_\star(\xi) - \frac{s_\star^2(\xi)}{2} - \log s_\star(\xi) - 1\Big] = - \frac{\partial }{\partial \xi} \Big[- \xi s - \frac{s^2}{2} - \log s - 1\Big] \Big\vert_{s = s_\star(\xi)} \\ =&~ s_\star(\xi). \end{aligned} Let $\xi \in \C_+$, and finding the solution $s_\star$ of Eq. \eqref{eq:5} with positive imaginary part, we get $s_\star(\xi) = \frac{- z + \sqrt{z^2 - 4}}{2}.$ This is exactly the limiting Stieltjes transform of the GOE matrix. Of course, to prove rigorously this formula, we need to adopt other techniques.

## 5. Summary

We showed a systematic approach for formally calculating the spectral density of random matrices. Here is one exercise for the readers: calculating the limiting Stieltjes transform of the empirical covariance matrix $\hat \Sigma_n$ of isotropic Gaussian data, where $\hat \Sigma_n = n^{-1} \sum_{i = 1}^n \bx_i \bx_i^\sT$ with $\bx_i \sim_{iid} \cN(0, \id_d)$, in the asymptotic regime $n / d \to \gamma$ as $d \to \infty$. The spectral density of the empirical covariance matrix, which can be derived from the exercise above, gives the famous Marchenko-Pastur law.

In my next post, I will introduce some applications of the replica method to machine learning.