Markov Chain Monte Carlo
We have learned in the meantime that the likelihood-constrained prior sampling machinery is at the heart of any nested sampling (NS) algorithm. By far most of the computation is spent acquiring these samples. In this chapter I will give a detailed tour through the basics of Markov chain Monte Carlo (MCMC), which is more or less the only available method capable of performing this task in high-dimensional spaces like those we encounter in atomistic systems.
We will cover the necessary theoretical foundations and also dive into the computational details required to run this procedure efficiently on parallel hardware. This chapter does not require a deep understanding of what a prior actually is. We will focus solely on the machinery needed to sample from a given distribution under an additional constraint, and hopefully the bulky term likelihood-constrained prior sampling will become slightly demystified!
Metropolis-Hastings Monte Carlo
The most fundamental MCMC algorithm is the Metropolis-Hastings (MH) algorithm, introduced by Metropolis et al. in 1953 and later generalized by Hastings in 1970. It generates a sequence of samples from a target distribution $\pi(x)$, requiring only that $\pi$ can be evaluated up to a normalization constant.
The Metropolis-Hastings algorithm:
- Start at a position $x_0$
- For each step $t$:
  - Propose a new state $x' = x_t + \Delta x$, where $\Delta x$ is drawn from a proposal distribution (e.g. a Gaussian with standard deviation $\sigma$)
  - Compute the acceptance probability: $\alpha = \min\!\left(1, \frac{\pi(x')}{\pi(x_t)}\right)$
  - Draw $u \sim \mathcal{U}(0, 1)$: if $u < \alpha$, set $x_{t+1} = x'$, else $x_{t+1} = x_t$
When the target distribution is the Boltzmann distribution $\pi(x) \propto e^{-E(x)/k_B T}$, the acceptance probability becomes $\alpha = \min\!\left(1, e^{-\Delta E / k_B T}\right)$ with $\Delta E = E(x') - E(x_t)$.
This means that moves to lower energy are always accepted, while moves to higher energy are accepted with a probability that decreases exponentially with the energy difference. The result is a random walk that explores the energy landscape, spending more time in low-energy regions.
In the visualization below, you can click anywhere on the energy landscape to start an MH random walk. Green dots indicate accepted positions, while red crosses mark rejected proposals. The dashed red lines connect each rejected proposal back to the position the walker was at when the proposal was made.
Try clicking at different locations and observe how the walker explores the landscape. Then experiment with the step size $\sigma$:
- Small $\sigma$: Nearly all proposals are accepted, but the walker takes tiny steps and explores very slowly.
- Large $\sigma$: The walker attempts large jumps, but most proposals land in high-energy regions and get rejected. The walker gets stuck.
In practice, the step size is tuned to achieve an acceptance rate of roughly 20–50%, which balances exploration speed against wasted proposals.
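This tuning heuristic is easy to verify empirically. The sketch below (for an assumed 1D standard-normal target; `acceptance_rate` is an illustrative helper, not a library function) sweeps the step size and measures the fraction of accepted proposals:

```python
import numpy as np

def acceptance_rate(sigma, n_steps=20000, seed=1):
    """Fraction of accepted random-walk MH proposals for a 1D standard normal."""
    rng = np.random.default_rng(seed)
    x, accepted = 0.0, 0
    for _ in range(n_steps):
        x_prop = x + rng.normal(0.0, sigma)
        # log pi(x') - log pi(x) = (x^2 - x'^2) / 2 for a standard normal
        if np.log(rng.uniform()) < 0.5 * (x * x - x_prop * x_prop):
            x, accepted = x_prop, accepted + 1
    return accepted / n_steps

# Acceptance drops monotonically as the step size grows
rates = {s: acceptance_rate(s) for s in (0.1, 1.0, 10.0)}
```

With tiny steps almost everything is accepted (but exploration is slow); with huge steps almost everything is rejected, exactly as described above.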
Constrained sampling for nested sampling
In the primer, we saw that at each NS iteration we need to draw a new walker from the region where $E(x) < E_{\mathrm{lim}}$. The simplest way to do this is an MH walk whose acceptance rule is replaced by a hard check: accept the proposal $x'$ if $E(x') < E_{\mathrm{lim}}$, and reject it otherwise.
This is exactly the NS acceptance criterion we encountered before — accept any move that stays within the allowed region, reject everything outside. No Boltzmann weighting is needed because the target distribution within the constraint is uniform.
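A minimal sketch of such a constrained step, using an illustrative quadratic energy so the allowed region is a disc (the function and parameter names are my own, not from a particular NS code):

```python
import numpy as np

def constrained_mh_step(x, energy, e_lim, sigma, rng):
    """One hard-constrained MH step: uniform sampling inside E(x) < e_lim.
    No Boltzmann factor -- any proposal inside the allowed region is accepted."""
    x_prop = x + rng.normal(0.0, sigma, size=x.shape)
    return x_prop if energy(x_prop) < e_lim else x

# Example: walk confined to the disc x^2 + y^2 < 1 (energy = squared radius)
energy = lambda x: np.sum(x**2)
rng = np.random.default_rng(0)
x = np.zeros(2)
trace = []
for _ in range(2000):
    x = constrained_mh_step(x, energy, e_lim=1.0, sigma=0.3, rng=rng)
    trace.append(x.copy())
trace = np.array(trace)
```

Every recorded position satisfies the constraint by construction, since a proposal outside the boundary leaves the walker where it is.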
The visualization below shows an MH walk with a hard constraint at the energy limit $E_{\mathrm{lim}}$.
Observe how the walker bounces off the constraint boundary, remaining trapped inside the allowed region. This is the core sampling engine that powers NS.
Galilean Monte Carlo
The standard MH walk proposes random displacements in all directions. In a constrained region with complex geometry, many of these proposals will land outside the boundary and get rejected. Galilean Monte Carlo (GMC) addresses this by replacing the random walk with a ballistic trajectory — the walker moves in a straight line with constant speed, and reflects off the constraint boundary like a billiard ball.
The Galilean Monte Carlo algorithm:
- Choose a random velocity direction $\hat{v}$ and set $v = s\,\hat{v}$ with speed $s$
- For each step:
  - Propose $x' = x + v\,\Delta t$
  - If $E(x') < E_{\mathrm{lim}}$: accept and move to $x'$
  - Else: reflect $v$ off the iso-energy surface using $v' = v - 2\,(v \cdot \hat{n})\,\hat{n}$, where $\hat{n} = \nabla E(x') / |\nabla E(x')|$
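A sketch of one such step, again on an illustrative quadratic energy. The handling of a failed reflection (reversing the velocity) is one common convention, assumed here rather than prescribed by the text:

```python
import numpy as np

def gmc_step(x, v, energy, grad_energy, e_lim, dt):
    """One Galilean MC step: ballistic move, reflect off the E = e_lim surface.
    Sketch only -- the reversal fallback for failed reflections is an assumption."""
    x_prop = x + v * dt
    if energy(x_prop) < e_lim:
        return x_prop, v                      # free flight inside the region
    n = grad_energy(x_prop)
    n = n / np.linalg.norm(n)                 # unit normal of the iso-energy surface
    v_ref = v - 2.0 * np.dot(v, n) * n        # specular reflection
    x_ref = x + v_ref * dt
    if energy(x_ref) < e_lim:
        return x_ref, v_ref
    return x, -v                              # failed reflection: reverse velocity

# Example: harmonic energy E = |x|^2, constraint E < 1
energy = lambda x: np.sum(x**2)
grad = lambda x: 2.0 * x
rng = np.random.default_rng(0)
v = rng.normal(size=2)
v *= 0.5 / np.linalg.norm(v)                  # speed s = 0.5
x = np.zeros(2)
path = []
for _ in range(500):
    x, v = gmc_step(x, v, energy, grad, e_lim=1.0, dt=0.2)
    path.append(x.copy())
path = np.array(path)
```

Because specular reflection preserves $|v|$, the walker keeps a constant speed throughout the trajectory, just like a billiard ball.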
The key insight is that between reflections, the walker travels in straight lines. This allows it to traverse long distances through the allowed region without wasting proposals. At the boundary, the velocity is reflected using the gradient of the energy as the surface normal — exactly like light reflecting off a mirror.
In the visualization below, click inside the constraint boundary to launch a Galilean walk. Orange dots mark reflection points.
Compared to MH, notice how the Galilean walker covers much more of the allowed region in the same number of steps. The ballistic motion carries the walker efficiently through long, narrow corridors that would be difficult for a random walk to traverse. However, GMC requires computing the gradient $\nabla E$ at the boundary, which plain MH does not.
Hamiltonian Monte Carlo
Hamiltonian Monte Carlo (HMC) takes a fundamentally different approach. Instead of proposing random displacements, HMC introduces an auxiliary momentum variable $p$ and evolves the walker under Hamiltonian dynamics.
The idea is to treat the energy $E(x)$ as a potential and define the Hamiltonian (with unit masses)
$$H(x, p) = E(x) + \tfrac{1}{2}\,|p|^2,$$
and Hamilton's equations give us a deterministic trajectory through phase space:
$$\dot{x} = \frac{\partial H}{\partial p} = p, \qquad \dot{p} = -\frac{\partial H}{\partial x} = -\nabla E(x).$$
These equations describe a ball rolling on the energy surface: it accelerates downhill, decelerates uphill, and follows the contours of the landscape. In flat regions it moves in straight lines; near minima it oscillates back and forth.
The HMC algorithm:
- At current position $x$, draw a random momentum $p \sim \mathcal{N}(0, \mathbb{1})$
- Simulate Hamiltonian dynamics for $L$ steps using the leapfrog integrator with time step $\Delta t$
- Accept the endpoint $(x', p')$ with probability $\min\!\left(1, e^{-\Delta H}\right)$, where $\Delta H = H(x', p') - H(x, p)$
- Repeat from step 1
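The algorithm above can be sketched as follows, with an illustrative 2D Gaussian target (function names are my own; this is a minimal textbook leapfrog, not a tuned implementation):

```python
import numpy as np

def leapfrog(x, p, grad_energy, dt, n_steps):
    """Leapfrog integration of H = E(x) + |p|^2 / 2 (unit masses)."""
    x, p = x.copy(), p.copy()
    p -= 0.5 * dt * grad_energy(x)             # initial half kick
    for _ in range(n_steps - 1):
        x += dt * p                            # drift
        p -= dt * grad_energy(x)               # full kick
    x += dt * p                                # final drift
    p -= 0.5 * dt * grad_energy(x)             # final half kick
    return x, p

def hmc_step(x, energy, grad_energy, dt, n_steps, rng):
    """One HMC update: draw momentum, integrate, Metropolis-accept on Delta H."""
    p = rng.normal(size=x.shape)
    h_old = energy(x) + 0.5 * np.sum(p**2)
    x_new, p_new = leapfrog(x, p, grad_energy, dt, n_steps)
    h_new = energy(x_new) + 0.5 * np.sum(p_new**2)
    # Accept with probability min(1, exp(-(H_new - H_old)))
    if np.log(rng.uniform()) < h_old - h_new:
        return x_new
    return x

# Example: sample a 2D standard normal (E = |x|^2 / 2, grad E = x)
energy = lambda x: 0.5 * np.sum(x**2)
grad = lambda x: x
rng = np.random.default_rng(0)
x = np.zeros(2)
samples = []
for _ in range(2000):
    x = hmc_step(x, energy, grad, dt=0.2, n_steps=10, rng=rng)
    samples.append(x.copy())
samples = np.array(samples)
```

Because the leapfrog integrator nearly conserves $H$ for this smooth target, almost every trajectory is accepted while each proposal still moves the walker a long way.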
The leapfrog integrator is a symplectic integrator that preserves the Hamiltonian approximately. If the time step is small enough, the integration error $\Delta H$ stays small and nearly all proposals are accepted; the accept/reject step then only corrects the residual error.
Click on the landscape below to start an HMC chain. Solid green lines show accepted leapfrog trajectories, dashed red lines show rejected ones. Yellow dots mark the sample positions.
Experiment with the parameters and notice the following:
- Small $\Delta t$: Trajectories closely follow the energy contours, acceptance is high, but the walker doesn't move far per proposal.
- Large $\Delta t$: The leapfrog integrator becomes inaccurate, the error $\Delta H$ grows, and proposals get rejected (red dashed lines).
- More leapfrog steps $L$: Longer trajectories that explore further before proposing. The sweet spot depends on the curvature of the energy landscape.
The key difference to MH is that HMC proposals follow the geometry of the energy landscape rather than jumping blindly. This makes HMC particularly effective in high-dimensional spaces, where random-walk MH becomes hopelessly inefficient.
Constrained HMC for nested sampling
Just like MH, HMC can be adapted for likelihood-constrained prior sampling by adding a hard energy check: any leapfrog trajectory that ends at $E(x') \geq E_{\mathrm{lim}}$ is rejected, and the walker stays where it is.
Click inside the constraint boundary below to launch a constrained HMC chain. Notice how the leapfrog trajectories that exit the allowed region (dashed red) are rejected, while those that stay inside (solid green) are accepted.
Total-energy HMC
A common problem with constrained HMC is that the Gaussian momentum draw can give the walker enough kinetic energy to fly far beyond the energy limit, leading to many rejected proposals. Total-energy HMC (TE-HMC) addresses this by capping the initial kinetic energy,
$$K = \tfrac{1}{2}\,|p|^2 \leq E_{\mathrm{lim}} - E(x),$$
so that the total Hamiltonian satisfies $H = E(x) + K \leq E_{\mathrm{lim}}$. Since the leapfrog integrator approximately conserves $H$, the trajectory can never climb far above the energy limit.
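One simple way to enforce the cap is to rescale an over-energetic momentum draw. This is a sketch of one possible capping scheme (rescaling rather than redrawing), with an illustrative quadratic energy:

```python
import numpy as np

def draw_capped_momentum(x, energy, e_lim, rng):
    """Draw a Gaussian momentum, then rescale it so the kinetic energy
    does not exceed the remaining budget e_lim - E(x).
    One possible capping scheme -- a sketch, not the only variant."""
    p = rng.normal(size=x.shape)
    k = 0.5 * np.sum(p**2)
    k_max = e_lim - energy(x)                  # remaining energy budget
    if k > k_max:
        p *= np.sqrt(k_max / k)                # rescale K down to the cap
    return p

# Example: deep inside the constraint the budget is large, near the
# boundary the momentum gets clipped
energy = lambda x: np.sum(x**2)
rng = np.random.default_rng(0)
p_deep = draw_capped_momentum(np.zeros(3), energy, e_lim=100.0, rng=rng)
p_edge = draw_capped_momentum(np.array([0.9, 0.0, 0.0]), energy,
                              e_lim=1.0, rng=rng)
```

Near the boundary the budget $E_{\mathrm{lim}} - E(x)$ shrinks toward zero, so the walker launches gently instead of overshooting the constraint.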
Compare TE-HMC (below) with the standard constrained HMC above — you should see significantly fewer rejected trajectories, especially when the walker sits deep inside the allowed region.
Summary
We have now surveyed the main MCMC methods used for likelihood-constrained prior sampling in nested sampling:
| Method | Proposal | Gradient? | Key advantage |
|---|---|---|---|
| MH | Random displacement | No | Simple, general-purpose |
| Galilean MC | Ballistic + reflection | Yes (at boundary) | Efficient in narrow regions |
| HMC | Leapfrog dynamics | Yes (every step) | Large moves, high acceptance |
| TE-HMC | Leapfrog, capped K | Yes (every step) | Stays within constraint |
Each method trades off simplicity against efficiency. In low dimensions, simple MH works well. In high-dimensional atomistic systems, gradient-based methods (GMC, HMC) become essential — the "curse of dimensionality" makes random-walk proposals exponentially unlikely to be accepted.
You might have noticed that the leapfrog trajectories in HMC look a lot like molecular dynamics (MD) simulations — and indeed they are! HMC is essentially MD with an accept/reject step to correct for integration errors. We will explore the connection between MCMC and MD in more detail in a dedicated chapter.