## Random Projections for Manifold LearningJune 11, 2010

Posted by Sarah in Uncategorized.
Tags: , , ,

Interesting paper by Chinmay Hegde, Michael Wakin, and Richard Baraniuk on manifold learning.
Given a set of data points which are suspected to lie on a low-dimensional manifold, how do you detect the dimension of the manifold? This method uses projection onto a random subspace of dimension $M = CK \log(N)$ if the high-dimensional space is in $\mathbb{R}^N$ and $K$ is the true dimension of the manifold. The pairwise metric structure of the points turns out to be preserved with high accuracy. What this means is that we can do signal processing directly on the sorts of random measurements produced by devices that exploit compressed sensing (like the single-pixel camera, for instance.)

This work relies on the Grassberger-Procaccia algorithm. Given the pairwise distances between data points, this computes the scale-dependent correlation dimension of the data, $\hat{D}(r_1, r_2) = \frac{\log C_n(r_1) - \log C_n(r_2)}{\log r_1 - \log r_2}$
where $C_n(r) = \frac{1}{n(n-1)} \sum_{i \neq j} I_{||x_i - x_j|| < r}$
We can approximate K by fixing $r_1$ and $r_2$ to be the biggest range over which the plot is linear and calculating $D_{corr}$ in that range.

Once K is estimated, there are various techniques to generate a K-dimensional representation of the data points. The key theorem here is that if the manifold has dimension K, volume V, and condition number $1/\tau$, then fixing $0 < \epsilon < 1$ and $0 < \rho < 1$ and letting $\Phi$ be a random orthoprojector from $\mathbb{R}^N \to \mathbb{R}^M$ and $M \ge O(\frac{K \log N V/ \tau log (\rho^{-1})}{\epsilon^2}$

then with probability greater than $1 - \rho$, $(1- \epsilon)\sqrt{M/N} \le \frac{d_i(\Phi_x, \Phi_y)}{d_i(x, y)} \le (1 + \epsilon)\sqrt{M/N}$

That is, this map almost preserves geodesic distances.

A priori, though, we have no way to estimate V and $\tau$ of the manifold — we don't know how big the curvature or volume are — so we need a learning procedure. Start with a small M, and estimate the dimension with Grassberger-Procaccia. Then use Isomap to identify a manifold of that dimension, using random projections. Then we calculate the residual variance (a measure of error) and increment M upwards until the residual variance is small enough for our own arbitrary criterion.