## PCMI (1)

July 1, 2010

Posted by Sarah in Uncategorized.

First, let me get it out of the way: Utah is beautiful. I’ve wanted to go to Utah ever since I read A Primer on Mapping Class Groups and realized that there are people doing cool math out west. It’s amazing up here. Liquid blue sky, sagebrush on the mountains, columbines and lupines and pine martens…
It’s actually so pretty that it tempts me to get too ambitious with the trail running.

Highlights:

Jared Tanner’s course on compressed sensing is, as far as anyone knows, the first set of lectures on this for an undergraduate audience. We showed that sparse recovery is possible for matrices in general position. Checking this property directly is impractical, since it takes exponential time, but random matrices turn out to be in general position with high probability. We went on to deal with Haar and Fourier bases, and we’re now studying basis pursuit and coherence.
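
To make the "possible but exponential" point concrete, here is a minimal sketch of sparse recovery by exhaustive search over supports. It is not from the lectures. I'm assuming "general position" has its usual meaning (any $m$ columns of the $m \times n$ matrix are linearly independent), in which case a $k$-sparse solution with $2k \le m$ is unique. The function name `brute_force_sparse_recovery`, the problem sizes, and the tolerance are all illustrative choices.

```python
import itertools
import math

import numpy as np


def brute_force_sparse_recovery(A, b, k, tol=1e-9):
    """Try every size-k support and return the first x that satisfies A @ x = b.

    Illustrative helper, not a library routine. The number of supports is
    C(n, k), which is why this approach does not scale.
    """
    n = A.shape[1]
    for support in itertools.combinations(range(n), k):
        cols = list(support)
        # Least-squares fit restricted to the candidate support.
        coeffs, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        if np.linalg.norm(A[:, cols] @ coeffs - b) <= tol:
            x = np.zeros(n)
            x[cols] = coeffs
            return x
    return None


rng = np.random.default_rng(0)
m, n, k = 6, 12, 2  # assumed toy sizes, with 2k <= m

# A Gaussian random matrix is in general position with probability 1.
A = rng.standard_normal((m, n))

x_true = np.zeros(n)
true_support = rng.choice(n, size=k, replace=False)
x_true[true_support] = [1.5, -2.0]
b = A @ x_true

x_hat = brute_force_sparse_recovery(A, b, k)
print("supports searched (at most):", math.comb(n, k))
print("recovered exactly:", np.allclose(x_hat, x_true))
```

Even at these sizes the search covers up to `math.comb(n, k)` supports, and that count blows up quickly as `n` and `k` grow, which is the motivation for convex alternatives like basis pursuit.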

Richard Baraniuk gave some graduate lectures on compressed sensing, from more of an application perspective. He gave a visual illustration that I really liked of why $\ell^1$ minimization is a better way to find sparsity than $\ell^2$ minimization: the $\ell^1$ ball is an octahedron (in three dimensions), and it’s likely to intersect a line of random angle close to a coordinate axis. The $\ell^2$ ball is a sphere, which is less “concentrated” around the axes, and so is likely to intersect a line of random angle farther from a coordinate axis, that is, at a less sparse point.
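
The same picture can be checked numerically. The sketch below is my own illustration, not something from the lectures: it solves an underdetermined system $Ax = b$ twice, once for the minimum $\ell^2$ norm solution (via the pseudoinverse) and once for the minimum $\ell^1$ norm solution (written as a linear program with the split $x = u - v$, $u, v \ge 0$). The sizes, the seed, and the helper `count_nonzeros` are assumptions chosen for the demo.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n, k = 25, 50, 3  # assumed toy sizes: 25 measurements, 50 unknowns, 3 nonzeros

A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = [2.0, -1.5, 1.0]
b = A @ x_true

# Minimum l2-norm solution of A x = b.
x_l2 = np.linalg.pinv(A) @ b

# Minimum l1-norm solution of A x = b as a linear program:
# write x = u - v with u, v >= 0 and minimize sum(u) + sum(v).
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
x_l1 = res.x[:n] - res.x[n:]


def count_nonzeros(x, tol=1e-6):
    """Number of entries whose magnitude exceeds tol (illustrative helper)."""
    return int(np.sum(np.abs(x) > tol))


print("nonzeros in true x:     ", count_nonzeros(x_true))
print("nonzeros in l2 solution:", count_nonzeros(x_l2))
print("nonzeros in l1 solution:", count_nonzeros(x_l1))
```

The $\ell^2$ solution spreads energy over essentially every coordinate, while the $\ell^1$ solution lands on a vertex of the (scaled) cross-polytope and so has few nonzero entries.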

I’m now starting a course with Anna Gilbert about sparse approximation — she’s going to take more of the computational complexity approach.

The other nice thing about PCMI is meeting people you’ve only heard of through their research. I went hiking (and then pizza-eating) with Arthur Szlam, whom I knew about because he got his PhD from Yale and does the same kind of computational harmonic analysis that I’d (ideally) like to do. Talking to him was great — he runs at the prodigious rate of about three big mathematical insights per beer.

## Random Projections for Manifold Learning

June 11, 2010

Posted by Sarah in Uncategorized.

Interesting paper by Chinmay Hegde, Michael Wakin, and Richard Baraniuk on manifold learning.
Given a set of data points which are suspected to lie on a low-dimensional manifold, how do you detect the dimension of the manifold? This method projects the data onto a random subspace of dimension $M = CK \log(N)$, where the data live in $\mathbb{R}^N$, $K$ is the true dimension of the manifold, and $C$ is a constant. The pairwise metric structure of the points turns out to be preserved with high accuracy. What this means is that we can do signal processing directly on the sorts of random measurements produced by devices that exploit compressed sensing (the single-pixel camera, for instance).

This work relies on the Grassberger-Procaccia algorithm. Given the pairwise distances between data points, this computes the scale-dependent correlation dimension of the data,

$\hat{D}(r_1, r_2) = \frac{\log C_n(r_1) - \log C_n(r_2)}{\log r_1 - \log r_2}$

where

$C_n(r) = \frac{1}{n(n-1)} \sum_{i \neq j} I_{\|x_i - x_j\| < r}$

is the fraction of pairs of points lying within distance $r$ of each other. We can approximate $K$ by choosing $r_1$ and $r_2$ to span the biggest range over which the plot of $\log C_n(r)$ against $\log r$ is linear and calculating $\hat{D}(r_1, r_2)$ over that range.
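
Here is a minimal sketch of that estimate, applied to points sampled from a circle sitting in $\mathbb{R}^3$, which should come out close to dimension 1. The function names and the particular scales `r1 = 0.1`, `r2 = 0.4` are my own assumptions for the demo. In practice the scales would be read off the linear part of the log-log plot rather than fixed in advance.

```python
import numpy as np


def correlation_sum(X, r):
    """C_n(r): fraction of ordered pairs i != j with ||x_i - x_j|| < r."""
    n = X.shape[0]
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    off_diagonal = ~np.eye(n, dtype=bool)  # drop the i == j terms
    return np.mean(dists[off_diagonal] < r)


def correlation_dimension(X, r1, r2):
    """Slope of log C_n(r) against log r between the scales r1 and r2."""
    numerator = np.log(correlation_sum(X, r1)) - np.log(correlation_sum(X, r2))
    return numerator / (np.log(r1) - np.log(r2))


rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=800)

# A unit circle (a 1-dimensional manifold) embedded in R^3.
circle = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])

D_hat = correlation_dimension(circle, r1=0.1, r2=0.4)
print("estimated correlation dimension:", D_hat)
```

The all-pairs distance array costs $O(n^2)$ memory, which is fine for a few hundred points but would need a smarter neighbor search for large data sets.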

Once $K$ is estimated, there are various techniques to generate a $K$-dimensional representation of the data points. The key theorem here is as follows. Suppose the manifold has dimension $K$, volume $V$, and condition number $1/\tau$. Fix $0 < \epsilon < 1$ and $0 < \rho < 1$, let $\Phi$ be a random orthoprojector from $\mathbb{R}^N$ to $\mathbb{R}^M$, and suppose

$M \ge O\left(\frac{K \log(N V \tau^{-1}) \log(\rho^{-1})}{\epsilon^2}\right)$

Then with probability greater than $1 - \rho$,

$(1- \epsilon)\sqrt{M/N} \le \frac{d_i(\Phi x, \Phi y)}{d_i(x, y)} \le (1 + \epsilon)\sqrt{M/N}$

for every pair of points $x, y$ on the manifold.

That is, up to the fixed scaling factor $\sqrt{M/N}$, this map almost preserves geodesic distances.
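
A quick numerical sanity check of the $\sqrt{M/N}$ scaling, as a sketch under my own assumptions rather than anything from the paper: I build a random orthoprojector from the QR factorization of a Gaussian matrix, sample points from a helix placed in $\mathbb{R}^N$, and compare pairwise distances before and after projection. To keep it short I compare straight-line (Euclidean) distances instead of geodesic ones, and the sizes `N`, `M`, `n_points` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, n_points = 1000, 100, 60  # assumed ambient dim, projected dim, sample size

# Points on a helix (a 1-dimensional manifold) in the first 3 coordinates of R^N.
t = np.linspace(0.0, 4.0 * np.pi, n_points)
X = np.zeros((n_points, N))
X[:, 0] = np.cos(t)
X[:, 1] = np.sin(t)
X[:, 2] = t / (4.0 * np.pi)

# Random orthoprojector: the rows of Phi are an orthonormal basis of a
# random M-dimensional subspace of R^N.
Q, _ = np.linalg.qr(rng.standard_normal((N, M)))
Phi = Q.T  # shape (M, N)

# Euclidean distances for every pair of points, before and after projection.
i, j = np.triu_indices(n_points, k=1)
diffs = X[i] - X[j]
original = np.linalg.norm(diffs, axis=1)
projected = np.linalg.norm(diffs @ Phi.T, axis=1)

scale = np.sqrt(M / N)
ratios = projected / original
print("sqrt(M/N):          ", scale)
print("min / max of ratios:", ratios.min(), ratios.max())
```

The ratios cluster around $\sqrt{M/N}$, and the spread around that value plays the role of $\epsilon$ in the theorem.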

A priori, though, we have no way to estimate $V$ and $\tau$ of the manifold (we don't know how big the curvature or volume are), so we need a learning procedure. Start with a small $M$, and estimate the dimension with Grassberger-Procaccia. Then run Isomap on the randomly projected data to find an embedding of that dimension. Then we calculate the residual variance (a measure of error) and increase $M$ until the residual variance is small enough for our own arbitrary criterion.