Markov Chain Monte Carlo
We have learned in the meantime that the likelihood-constrained prior sampling machinery is at the heart of any nested sampling (NS) algorithm. By far most of the computation is spent acquiring these samples. In this chapter I will give a detailed tour through the basics of Markov chain Monte Carlo (MCMC), which is more or less the only available method capable of performing this task in high-dimensional spaces like those we encounter in atomistic systems.
We will cover the necessary theoretical foundations and also dive into the computational details required to run this procedure efficiently on parallel hardware. This chapter does not require a deep understanding of what a prior actually is. We will focus solely on the machinery needed to sample from a given distribution under an additional constraint, and hopefully the bulky term likelihood-constrained prior sampling will become slightly demystified!
Metropolis-Hastings Monte Carlo
The most fundamental MCMC algorithm is the Metropolis-Hastings (MH) algorithm, introduced by Metropolis et al. in 1953 and later generalized by Hastings in 1970. It generates a sequence of samples from a target distribution $\pi(x)$, requiring only that $\pi$ can be evaluated up to a normalization constant.
The Metropolis-Hastings algorithm:
- Start at a position $x_0$
- For each step $t$:
  - Propose a new state $x' = x_t + \Delta x$, where $\Delta x$ is drawn from a proposal distribution (e.g. a Gaussian with standard deviation $\sigma$)
  - Compute the acceptance probability: $\alpha = \min\!\left(1, \frac{\pi(x')}{\pi(x_t)}\right)$
  - Draw $u \sim \mathcal{U}(0, 1)$: if $u < \alpha$, set $x_{t+1} = x'$, else $x_{t+1} = x_t$
When the target distribution is the Boltzmann distribution $\pi(x) \propto e^{-E(x)/k_B T}$, the acceptance probability becomes $\alpha = \min\!\left(1, e^{-\Delta E / k_B T}\right)$ with $\Delta E = E(x') - E(x_t)$.
This means that moves to lower energy are always accepted, while moves to higher energy are accepted with a probability that decreases exponentially with the energy difference. The result is a random walk that explores the energy landscape, spending more time in low-energy regions.
In the visualization below, you can click anywhere on the energy landscape to start an MH random walk. Green dots indicate accepted positions, while red crosses mark rejected proposals. The dashed red lines connect each rejected proposal back to the position the walker was at when the proposal was made.
Try clicking at different locations and observe how the walker explores the landscape. Then experiment with the step size $\sigma$:
- Small $\sigma$: Nearly all proposals are accepted, but the walker takes tiny steps and explores very slowly.
- Large $\sigma$: The walker attempts large jumps, but most proposals land in high-energy regions and get rejected. The walker gets stuck.
In practice, the step size is tuned to achieve an acceptance rate of roughly 20–50%, which balances exploration speed against wasted proposals.
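This tuning heuristic is easy to verify empirically. The sketch below (for an assumed 1D standard-normal target; `acceptance_rate` is an illustrative helper, not a library function) sweeps the step size and measures the fraction of accepted proposals:

```python
import numpy as np

def acceptance_rate(sigma, n_steps=20000, seed=1):
    """Fraction of accepted random-walk MH proposals for a 1D standard normal."""
    rng = np.random.default_rng(seed)
    x, accepted = 0.0, 0
    for _ in range(n_steps):
        x_prop = x + rng.normal(0.0, sigma)
        # log pi(x') - log pi(x) = (x^2 - x'^2) / 2 for a standard normal
        if np.log(rng.uniform()) < 0.5 * (x * x - x_prop * x_prop):
            x, accepted = x_prop, accepted + 1
    return accepted / n_steps

# Acceptance drops monotonically as the step size grows
rates = {s: acceptance_rate(s) for s in (0.1, 1.0, 10.0)}
```

With tiny steps almost everything is accepted (but exploration is slow); with huge steps almost everything is rejected, exactly as described above.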
Constrained sampling for nested sampling
In the primer, we saw that at each NS iteration we need to draw a new walker from the region where $E(x) < E_{\mathrm{lim}}$. The simplest way to do this is an MH walk whose acceptance rule is replaced by a hard check: accept the proposal $x'$ if $E(x') < E_{\mathrm{lim}}$, and reject it otherwise.
This is exactly the NS acceptance criterion we encountered before — accept any move that stays within the allowed region, reject everything outside. No Boltzmann weighting is needed because the target distribution within the constraint is uniform.
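A minimal sketch of such a constrained step, using an illustrative quadratic energy so the allowed region is a disc (the function and parameter names are my own, not from a particular NS code):

```python
import numpy as np

def constrained_mh_step(x, energy, e_lim, sigma, rng):
    """One hard-constrained MH step: uniform sampling inside E(x) < e_lim.
    No Boltzmann factor -- any proposal inside the allowed region is accepted."""
    x_prop = x + rng.normal(0.0, sigma, size=x.shape)
    return x_prop if energy(x_prop) < e_lim else x

# Example: walk confined to the disc x^2 + y^2 < 1 (energy = squared radius)
energy = lambda x: np.sum(x**2)
rng = np.random.default_rng(0)
x = np.zeros(2)
trace = []
for _ in range(2000):
    x = constrained_mh_step(x, energy, e_lim=1.0, sigma=0.3, rng=rng)
    trace.append(x.copy())
trace = np.array(trace)
```

Every recorded position satisfies the constraint by construction, since a proposal outside the boundary leaves the walker where it is.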
The visualization below shows an MH walk with a hard constraint at the energy limit $E_{\mathrm{lim}}$.
Observe how the walker bounces off the constraint boundary, remaining trapped inside the allowed region. This is the core sampling engine that powers NS.
Galilean Monte Carlo
The standard MH walk proposes random displacements in all directions. In a constrained region with complex geometry, many of these proposals will land outside the boundary and get rejected. Galilean Monte Carlo (GMC) addresses this by replacing the random walk with a ballistic trajectory — the walker moves in a straight line with constant speed, and reflects off the constraint boundary like a billiard ball.
The Galilean Monte Carlo algorithm:
- Choose a random velocity direction $\hat{v}$ and set $v = s\,\hat{v}$ with speed $s$
- For each step:
  - Propose $x' = x + v\,\Delta t$
  - If $E(x') < E_{\mathrm{lim}}$: accept and move to $x'$
  - Else: reflect $v$ off the iso-energy surface using $v' = v - 2\,(v \cdot \hat{n})\,\hat{n}$, where $\hat{n} = \nabla E(x') / |\nabla E(x')|$
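A sketch of one such step, again on an illustrative quadratic energy. The handling of a failed reflection (reversing the velocity) is one common convention, assumed here rather than prescribed by the text:

```python
import numpy as np

def gmc_step(x, v, energy, grad_energy, e_lim, dt):
    """One Galilean MC step: ballistic move, reflect off the E = e_lim surface.
    Sketch only -- the reversal fallback for failed reflections is an assumption."""
    x_prop = x + v * dt
    if energy(x_prop) < e_lim:
        return x_prop, v                      # free flight inside the region
    n = grad_energy(x_prop)
    n = n / np.linalg.norm(n)                 # unit normal of the iso-energy surface
    v_ref = v - 2.0 * np.dot(v, n) * n        # specular reflection
    x_ref = x + v_ref * dt
    if energy(x_ref) < e_lim:
        return x_ref, v_ref
    return x, -v                              # failed reflection: reverse velocity

# Example: harmonic energy E = |x|^2, constraint E < 1
energy = lambda x: np.sum(x**2)
grad = lambda x: 2.0 * x
rng = np.random.default_rng(0)
v = rng.normal(size=2)
v *= 0.5 / np.linalg.norm(v)                  # speed s = 0.5
x = np.zeros(2)
path = []
for _ in range(500):
    x, v = gmc_step(x, v, energy, grad, e_lim=1.0, dt=0.2)
    path.append(x.copy())
path = np.array(path)
```

Because specular reflection preserves $|v|$, the walker keeps a constant speed throughout the trajectory, just like a billiard ball.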
The key insight is that between reflections, the walker travels in straight lines. This allows it to traverse long distances through the allowed region without wasting proposals. At the boundary, the velocity is reflected using the gradient of the energy as the surface normal — exactly like light reflecting off a mirror.
In the visualization below, click inside the constraint boundary to launch a Galilean walk. Orange dots mark reflection points.
Compared to MH, notice how the Galilean walker covers much more of the allowed region in the same number of steps. The ballistic motion carries the walker efficiently through long, narrow corridors that would be difficult for a random walk to traverse. However, GMC requires computing the gradient $\nabla E$ at the boundary, which plain MH does not.
Hamiltonian Monte Carlo
Hamiltonian Monte Carlo (HMC) takes a fundamentally different approach. Instead of proposing random displacements, HMC introduces an auxiliary momentum variable $p$ and evolves the walker under Hamiltonian dynamics.
The idea is to treat the energy $E(x)$ as a potential and define the Hamiltonian (with unit masses)
$$H(x, p) = E(x) + \tfrac{1}{2}\,|p|^2,$$
and Hamilton's equations give us a deterministic trajectory through phase space:
$$\dot{x} = \frac{\partial H}{\partial p} = p, \qquad \dot{p} = -\frac{\partial H}{\partial x} = -\nabla E(x).$$
These equations describe a ball rolling on the energy surface: it accelerates downhill, decelerates uphill, and follows the contours of the landscape. In flat regions it moves in straight lines; near minima it oscillates back and forth.
The HMC algorithm:
- At current position $x$, draw a random momentum $p \sim \mathcal{N}(0, \mathbb{1})$
- Simulate Hamiltonian dynamics for $L$ steps using the leapfrog integrator with time step $\Delta t$
- Accept the endpoint $(x', p')$ with probability $\min\!\left(1, e^{-\Delta H}\right)$, where $\Delta H = H(x', p') - H(x, p)$
- Repeat from step 1
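The algorithm above can be sketched as follows, with an illustrative 2D Gaussian target (function names are my own; this is a minimal textbook leapfrog, not a tuned implementation):

```python
import numpy as np

def leapfrog(x, p, grad_energy, dt, n_steps):
    """Leapfrog integration of H = E(x) + |p|^2 / 2 (unit masses)."""
    x, p = x.copy(), p.copy()
    p -= 0.5 * dt * grad_energy(x)             # initial half kick
    for _ in range(n_steps - 1):
        x += dt * p                            # drift
        p -= dt * grad_energy(x)               # full kick
    x += dt * p                                # final drift
    p -= 0.5 * dt * grad_energy(x)             # final half kick
    return x, p

def hmc_step(x, energy, grad_energy, dt, n_steps, rng):
    """One HMC update: draw momentum, integrate, Metropolis-accept on Delta H."""
    p = rng.normal(size=x.shape)
    h_old = energy(x) + 0.5 * np.sum(p**2)
    x_new, p_new = leapfrog(x, p, grad_energy, dt, n_steps)
    h_new = energy(x_new) + 0.5 * np.sum(p_new**2)
    # Accept with probability min(1, exp(-(H_new - H_old)))
    if np.log(rng.uniform()) < h_old - h_new:
        return x_new
    return x

# Example: sample a 2D standard normal (E = |x|^2 / 2, grad E = x)
energy = lambda x: 0.5 * np.sum(x**2)
grad = lambda x: x
rng = np.random.default_rng(0)
x = np.zeros(2)
samples = []
for _ in range(2000):
    x = hmc_step(x, energy, grad, dt=0.2, n_steps=10, rng=rng)
    samples.append(x.copy())
samples = np.array(samples)
```

Because the leapfrog integrator nearly conserves $H$ for this smooth target, almost every trajectory is accepted while each proposal still moves the walker a long way.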
The leapfrog integrator is a symplectic integrator that preserves the Hamiltonian approximately. If the time step is small enough, the integration error $\Delta H$ stays small and nearly all proposals are accepted; the accept/reject step then only corrects the residual error.
Click on the landscape below to start an HMC chain. Solid green lines show accepted leapfrog trajectories, dashed red lines show rejected ones. Yellow dots mark the sample positions.
Experiment with the parameters and notice the following:
- Small $\Delta t$: Trajectories closely follow the energy contours, acceptance is high, but the walker doesn't move far per proposal.
- Large $\Delta t$: The leapfrog integrator becomes inaccurate, the error $\Delta H$ grows, and proposals get rejected (red dashed lines).
- More leapfrog steps $L$: Longer trajectories that explore further before proposing. The sweet spot depends on the curvature of the energy landscape.
The key difference to MH is that HMC proposals follow the geometry of the energy landscape rather than jumping blindly. This makes HMC particularly effective in high-dimensional spaces, where random-walk MH becomes hopelessly inefficient.
Constrained HMC for nested sampling
Just like MH, HMC can be adapted for likelihood-constrained prior sampling by adding a hard energy check: any leapfrog trajectory that ends at $E(x') \geq E_{\mathrm{lim}}$ is rejected, and the walker stays where it is.
Click inside the constraint boundary below to launch a constrained HMC chain. Notice how the leapfrog trajectories that exit the allowed region (dashed red) are rejected, while those that stay inside (solid green) are accepted.
Total-energy HMC
A common problem with constrained HMC is that the Gaussian momentum draw can give the walker enough kinetic energy to fly far beyond the energy limit, leading to many rejected proposals. Total-energy HMC (TE-HMC) addresses this by capping the initial kinetic energy,
$$K = \tfrac{1}{2}\,|p|^2 \leq E_{\mathrm{lim}} - E(x),$$
so that the total Hamiltonian satisfies $H = E(x) + K \leq E_{\mathrm{lim}}$. Since the leapfrog integrator approximately conserves $H$, the trajectory can never climb far above the energy limit.
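One simple way to enforce the cap is to rescale an over-energetic momentum draw. This is a sketch of one possible capping scheme (rescaling rather than redrawing), with an illustrative quadratic energy:

```python
import numpy as np

def draw_capped_momentum(x, energy, e_lim, rng):
    """Draw a Gaussian momentum, then rescale it so the kinetic energy
    does not exceed the remaining budget e_lim - E(x).
    One possible capping scheme -- a sketch, not the only variant."""
    p = rng.normal(size=x.shape)
    k = 0.5 * np.sum(p**2)
    k_max = e_lim - energy(x)                  # remaining energy budget
    if k > k_max:
        p *= np.sqrt(k_max / k)                # rescale K down to the cap
    return p

# Example: deep inside the constraint the budget is large, near the
# boundary the momentum gets clipped
energy = lambda x: np.sum(x**2)
rng = np.random.default_rng(0)
p_deep = draw_capped_momentum(np.zeros(3), energy, e_lim=100.0, rng=rng)
p_edge = draw_capped_momentum(np.array([0.9, 0.0, 0.0]), energy,
                              e_lim=1.0, rng=rng)
```

Near the boundary the budget $E_{\mathrm{lim}} - E(x)$ shrinks toward zero, so the walker launches gently instead of overshooting the constraint.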
Compare TE-HMC (below) with the standard constrained HMC above — you should see significantly fewer rejected trajectories, especially when the walker sits deep inside the allowed region.
Summary
We have now surveyed the main MCMC methods used for likelihood-constrained prior sampling in nested sampling:
| Method | Proposal | Gradient? | Key advantage |
|---|---|---|---|
| MH | Random displacement | No | Simple, general-purpose |
| Galilean MC | Ballistic + reflection | Yes (at boundary) | Efficient in narrow regions |
| HMC | Leapfrog dynamics | Yes (every step) | Large moves, high acceptance |
| TE-HMC | Leapfrog, capped K | Yes (every step) | Stays within constraint |
Each method trades off simplicity against efficiency. In low dimensions, simple MH works well. In high-dimensional atomistic systems, gradient-based methods (GMC, HMC) become essential — the "curse of dimensionality" makes random-walk proposals exponentially unlikely to be accepted.
You might have noticed that the leapfrog trajectories in HMC look a lot like molecular dynamics (MD) simulations — and indeed they are! HMC is essentially MD with an accept/reject step to correct for integration errors. We will explore the connection between MCMC and MD in more detail in a dedicated chapter.