Hamiltonian Monte Carlo Method

Let me summarize the basic idea of the Hamiltonian Monte Carlo method here. The idea also comes from the famous Japanese textbook on Bayesian modeling, the so-called "another Green Book".

Joint distribution of posterior distribution and proposal distribution

Let \(f(\theta|x)\) be the posterior distribution and \(f(p)\) be the proposal distribution of the momentum \(p\). Since \(\theta\) and \(p\) are assumed to be mutually independent, their joint distribution satisfies: \begin{align} f(\theta,p|x) = f(\theta|x)f(p). \label{eq:jdhm} \end{align} Suppose that \(f(p)\) follows the standard normal distribution: \begin{align} f(p) \propto \exp \left(-\frac{p^2}{2} \right). \nonumber \end{align}
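The factorization above can be sketched in a few lines. A one-dimensional standard-normal posterior is assumed here purely for illustration; `log_posterior` is a hypothetical stand-in for \(\log f(\theta|x)\):

```python
# log f(p) for a standard-normal momentum: -p^2/2 up to a constant.
def log_f_p(p):
    return -0.5 * p**2

# Hypothetical 1D log-posterior log f(theta|x); a standard-normal
# posterior is assumed here purely for illustration.
def log_posterior(theta):
    return -0.5 * theta**2

# Independence of theta and p means the joint log-density is a sum,
# which is the log of the factorization f(theta, p | x) = f(theta|x) f(p).
def log_joint(theta, p):
    return log_posterior(theta) + log_f_p(p)
```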

Derivation

Recall from the previous article that the Hamiltonian is expressible as \(\displaystyle H(\theta,p)=h(\theta)+\frac{1}{2}p^2\). Starting from Eq.(\ref{eq:jdhm}), we have: \begin{align} f(\theta,p|x) &=\exp \Bigl( \log f(\theta,p|x) \Bigr) \nonumber \\ &=\exp \Bigl( \log f(\theta|x) + \log f(p) \Bigr) \nonumber \\ &\propto \exp \Bigl( \log f(\theta|x) - \frac{p^2}{2} \Bigr) \nonumber \end{align} Also recall that we regard the height as the negative log-likelihood (as in the previous article), i.e., \(h(\theta)=-\log f(\theta|x)\); then \begin{align} f(\theta,p|x) &\propto \exp \Bigl( - h(\theta) - \frac{p^2}{2} \Bigr) \nonumber \\ &= \exp \Bigl( -H(\theta,p)\Bigr) \nonumber \end{align}
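The relation \(f(\theta,p|x) \propto \exp(-H(\theta,p))\) is easy to express directly. As before, a standard-normal posterior is assumed only as an example, giving \(h(\theta)=\theta^2/2\):

```python
import math

# h(theta) = -log f(theta|x); a standard-normal posterior is assumed
# here only as an example, giving h(theta) = theta^2 / 2.
def h(theta):
    return 0.5 * theta**2

# Hamiltonian H(theta, p) = h(theta) + p^2 / 2.
def hamiltonian(theta, p):
    return h(theta) + 0.5 * p**2

# Unnormalized joint density: f(theta, p | x) is proportional to exp(-H).
def joint_density_unnorm(theta, p):
    return math.exp(-hamiltonian(theta, p))
```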

Detailed Balance Condition

When we apply the detailed balance condition to the joint distribution \(f(\theta,p|x)\), we obtain the following equalities: the first from the reversibility of the Hamiltonian dynamics, and the second from the assumption that the Hamiltonian is conserved: \begin{align} f(\theta',p'|\theta,p) &= f(\theta,p|\theta',p'), \nonumber \\ f(\theta,p|x) &= f(\theta',p'|x).\nonumber \end{align} Combining them gives the detailed balance condition: \begin{align} f(\theta',p'|\theta,p)f(\theta,p|x) = f(\theta,p|\theta',p')f(\theta',p'|x) \nonumber \end{align} Consider the acceptance probability \(r\) for the update: \begin{align} r = \frac{f(\theta^{(a)},p^{(a)}|x)}{f(\theta^{(t)},p^{(t)}|x)}=\exp \Bigl( H(\theta^{(t)},p^{(t)})-H(\theta^{(a)},p^{(a)}) \Bigr) \approx 1 \nonumber \end{align} The only deviation from \(r=1\) comes from the numerical integration error, so the acceptance rate is always high (almost 1).
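The acceptance probability \(r\) can be sketched as below. The function `h` is again a hypothetical stand-in for \(-\log f(\theta|x)\) with a standard-normal posterior; any candidate with the same Hamiltonian value as the current state yields \(r=1\) exactly:

```python
import math

def h(theta):
    # Hypothetical h(theta) = -log f(theta|x) (standard-normal posterior).
    return 0.5 * theta**2

def H(theta, p):
    return h(theta) + 0.5 * p**2

def acceptance_ratio(theta_t, p_t, theta_a, p_a):
    # r = exp(H(theta^(t), p^(t)) - H(theta^(a), p^(a))); when the
    # integrator conserves H exactly, r = 1 and the candidate is
    # always accepted.
    return math.exp(H(theta_t, p_t) - H(theta_a, p_a))
```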

Hamiltonian Monte Carlo Method

Iterate the following procedure:
  • Draw a sample \(p^{(t)}\) from the standard normal distribution.
  • Obtain a candidate sample \((\theta^{(a)},p^{(a)})\) by the leapfrog method.
  • With probability \(\min(1,r)\), accept the candidate, \(\theta^{(t+1)}=\theta^{(a)}\); otherwise set \(\theta^{(t+1)}=\theta^{(t)}\).
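The three steps above can be sketched as a minimal HMC sampler. A standard-normal posterior is assumed (so \(\nabla h(\theta)=\theta\)), and the step size `eps` and trajectory length `L` are illustrative choices, not prescriptions from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: h(theta) = -log f(theta|x) = theta^2 / 2
# (standard-normal posterior), so grad_h(theta) = theta.
def grad_h(theta):
    return theta

def H(theta, p):
    return 0.5 * theta**2 + 0.5 * p**2

def leapfrog(theta, p, eps=0.1, L=20):
    p = p - 0.5 * eps * grad_h(theta)        # half step for momentum
    for _ in range(L - 1):
        theta = theta + eps * p              # full step for position
        p = p - eps * grad_h(theta)          # full step for momentum
    theta = theta + eps * p
    p = p - 0.5 * eps * grad_h(theta)        # final half step for momentum
    return theta, p

def hmc(n_samples=5000, theta0=0.0):
    samples = []
    theta = theta0
    for _ in range(n_samples):
        p = rng.standard_normal()            # 1. draw momentum p^(t)
        theta_a, p_a = leapfrog(theta, p)    # 2. leapfrog candidate
        r = np.exp(H(theta, p) - H(theta_a, p_a))
        if rng.uniform() < min(1.0, r):      # 3. accept with prob min(1, r)
            theta = theta_a
        samples.append(theta)
    return np.array(samples)
```

Because the leapfrog integrator nearly conserves \(H\), the ratio `r` stays close to 1 and rejections are rare, which is exactly the point made in the derivation above.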

Comments