[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
On the Ergodicity requirement of MCMC (and when it gets broken..)
- To: Rao Kambhampati <rao@asu.edu>
- Subject: On the Ergodicity requirement of MCMC (and when it gets broken..)
- From: Subbarao Kambhampati <rao@asu.edu>
- Date: Wed, 14 Nov 2012 19:53:57 -0700
- Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=n8rR8GZ7sHm+jg+CyRcfCvy0+rfwoAFjHu20mpOEvmY=; b=g47XHCTNDMBHXAZd1204m1OHmWkdDaev//Is8qkFV1oIzi/7XcbQdo8VR5rE12/Qmm 9cas0+eMkW7MzQOAqkr8y8vK1tmL7sbHii8LzCSDyBj9kOb0Wbq55hmLrOHQO9unOqmU XNgzHweI1+84V44h4UIBMkp02vijMjY5x+HjRXCP8AIltxIajEdH9jrWD3dtMDMKNkQ5 1qE9siz543J9C+/YhS74Hi4hEBAvx78C0Hj8RIJLSlSEHMY7Z/1h+ZIkVKiKKsJsXuOI f5+dbdjlimcCFqAqv8dwgZdIXCcxF1boVffUufeLsweMXPT/L7KAfrAIrqAtyVhKOAZS k+Pg==
- Sender: subbarao2z2@gmail.com
I mentioned in the class that MCMC approaches work only if the underlying markov chain is ergodic (that given any pair of states, there is non-zero probability that you can reach one from the other).
Note that the reachability here is in the space of "states of the network" (i.e., complete assignments of values to variables). Thus a disconnected Bayes Net doesn't mean that the underlying MC is non-ergodic. In fact, if you have K variables that are all independent, there is no a priori reason that you can't go from any of the 2^K states to any other one with non-zero probability(*).
This begs the question "exactly when do networks become non-ergodic?". If you go back to the example above, and assume that one of the variables has the probability of being true equal to 1 (or 0). Then all of a sudden you are no longer guaranteed that the probability of going from any state where that variable is True to a state where it is false is going to be non-zero (in this case, it will be zero).
In other words, networks become non-ergodic exactly when their CPTs are deterministic (i.e., have 1s and 0s in them). [An aside for those you didn't quite realize that you can encode deterministic dependencies in Bayes Nets--but of course you can, just by setting CPTs to zeros and ones; that is actually how you reduced 3SAT to a BN in the NP-completeness proof ;)
Sampling based methods in general, and MCMC in particular, will fail to converge in such scenarios because the underlying markov chain will not be irreducible (has separate components), and thus won't have a unique steady state distribution.
The irony of it is that if the bayes network has only 1/0 CPTs, then it is really a logical theory (rather than a probabilistic one) and as such should be easier to reason with! And yet, the sampling based methods don't like any deterministic CPTs.
Of course, you can do the usual tricks and *soften* the deterministic dependencies. Make a 1 to be 0.999999 or a 0 to be 10^-5 (for the page-rank aficionados, this is equivalent to introducing fake low probability links in the page transition matrix). This is what is done in practice, but you do look quite silly gratuitously converting known logical dependencies to probabilistic ones. Further more, while the chain does become ergodic with this trick, its mixing time (i.e., time needed for the chain to reach its steady state probabilities) will be inversely proportional to the smallest transition probabilities you have in the chain. In other words, you need to let the chain run for a much longer time before you can be sure that you reached steady state (and start using the samples to estimate the needed posteriors).
Short summary--deterministic dependencies, while highly desirable in general--pose problems to MCMC methods as they make the markov chain non-ergodic.
Rao
ps: I will post this also on the forum and would be interested in your comments and questions.