Church: a language for probabilistic computation October 23, 2010

Posted by Sarah in Uncategorized.
Tags: ,

As a novice with some interest in applied areas, it happens all too often that I fall in love with an idea about perception/cognition/data analysis, and find out only later “Oh wait, that wasn’t actually math! And I’d have to do so much background reading in a completely different field to understand it!” I’ve experienced “Oh wait, I’m not an electrical engineer!”, “Oh wait, I’m not a neuroscientist!”, and “Oh wait, I’m not a statistician!”

Well, today is the day I say “Oh wait, I’m not a computer scientist!”

The idea that caught my attention is probabilistic computing. We often want to build a machine that can make predictions and build models (a computer that can diagnose medical symptoms, or predict which creditors will default, or even, dare we whisper, an AI). This is essentially a Bayesian task: given some data, which probability functions best explain it? The trouble is, computers are bad at this. For the most part, they’re built to do the opposite task: given probability distributions and models, simulate some data. Generating probability functions and finding the best one can be prohibitively expensive, because the space of probability functions is so large. Also, while the computational complexity of evaluating f(g(x)) is just f + g, the computational complexity of composing two conditional probability distributions B|A and C|B is

ΣB P(C, B|A)

whose computational time will grow exponentially rather than linearly as we compose more distributions.

Church, a language developed at MIT is an attempt to solve this problem. (Apparently it’s a practical attempt, because the founders have already started a company, Navia Systems, using this structure to build probabilistic computers.) The idea is, instead of describing a probability distribution as a deterministic procedure that evaluates the probabilities of different events, represent them in terms of probabilistic procedures for generating samples from them. That is, a random variable is actually a random variable. This means that repeating a computation will not give the same result each time, because evaluating a random variable doesn’t give the same result each time. There’s a computational advantage here because it’s possible to compose random variables without summing over all possible values.

The nice thing about Church (which is based on Lisp, and named after Alonzo Church) is that it allows you to compose practically any query without significantly increasing runtime. “What is the probability of A and B or C and not D given that X and Y and not Z?” and so on.

The PhD dissertation that introduced Church is where I started, but it’s actually much clearer if you learn about it from the interactive tutorial. The tutorial is really a beautiful thing, well-written to the point of addictiveness. It is completely accessible to just about anyone.

1. Yaroslav Bulatov - October 23, 2010

Interesting idea. From skimming of the first 30 pages of thesis, it seems the main idea is that we should use MCMC sampling instead of regular probability calculus, am I right?

I do wonder if MCMC is more efficient than deterministic approach. For instance, recent results suggest that when partition function of binary MRF’s can’t be deterministically approximated in polynomial time, MCMC fails to mix in polynomial time (http://arxiv.org/abs/1005.5584 and refs in the intro)

2. Sarah - October 24, 2010

Yeah, I think you’re right: MCMC is how they estimate the posterior probabilities in a reasonable time. I don’t know how efficient they are (but the authors also suggest making changes in hardware, replacing logic gates with stochastic logic gates, so maybe that would help?) I’m afraid I’m brand new to this, though.