/ Miscellaneous

Manifold Markov chain Monte Carlo methods in Python

Manifold Markov chain Monte Carlo methods in Python

Mici

Mici is a Python package providing implementations of Markov chain Monte Carlo (MCMC) methods for approximate inference in probabilistic models, with a particular focus on MCMC methods based on simulating Hamiltonian dynamics on a manifold.

Features

Key features include

  • implementations of MCMC methods for sampling from distributions on embedded
    manifolds implicitly-defined by a constraint equation and distributions on
    Riemannian manifolds with a user-specified metric,
  • a modular design allowing use of a wide range of inference algorithms by
    mixing and matching different components, making it easy for users to
    extend the package and use within their own code,
  • computationally efficient inference via transparent caching of the results
    of expensive operations and intermediate results calculated in derivative
    computations allowing later reuse without recalculation,
  • memory efficient inference for large models by memory-mapping chains to
    disk, allowing long runs on large models without hitting memory issues.

Installation

To install and use Mici the minimal requirements are a Python 3.6+ environment
with NumPy and SciPy
installed. The latest Mici release on PyPI (and its dependencies) can be
installed in the current Python environment by running

pip install mici

To instead install the latest development version from the master branch on Github run

pip install git+https://github.com/matt-graham/mici

If available in the installed Python environment the following additional
packages provide extra functionality and features

  • Autograd: if available Autograd will
    be used to automatically compute the required derivatives of the model
    functions (providing they are specified using functions from the
    autograd.numpy and autograd.scipy interfaces). To sample chains in
    parallel using autograd functions you also need to install
    multiprocess. This will
    cause multiprocess.Pool to be used in preference to the in-built
    mutiprocessing.Pool for parallelisation as multiprocess supports
    serialisation (via dill) of a much
    wider range of types, including of Autograd generated functions. Both
    Autograd and multiprocess can be installed alongside Mici by running pip install mici[autodiff].
  • RandomGen: if RandomGen is
    available the randomgen.Xorshift1024 random number generator will be used
    when running multiple chains in parallel, with the jump method of the
    object used to reproducibly generate independent substreams. RandomGen can
    be installed alongside Mici by running pip install mici[randomgen].
  • ArviZ: if ArviZ is
    available outputs of a sampling run can be converted to an
    arviz.InferenceData container object using
    mici.utils.convert_to_arviz_inference_data, allowing straightforward use
    of the extensive Arviz visualisation and diagnostic functionality.

Why Mici?

Mici is named for Augusta 'Mici'
Teller
, who along with
Arianna Rosenbluth
developed the code for the MANIAC I
computer used in the seminal paper Equations of State Calculations by Fast
Computing Machines
which introduced the
first example of a Markov chain Monte Carlo method.

Related projects

Other Python packages for performing MCMC inference include
PyMC3,
PyStan (the Python interface to
Stan), Pyro /
NumPyro, TensorFlow
Probability
,
emcee and
Sampyl.

Unlike PyMC3, PyStan, (Num)Pyro and TensorFlow Probability which are complete
probabilistic programming frameworks including functionality for definining a
probabilistic model / program, but like emcee and Sampyl, Mici is solely
focussed on providing implementations of inference algorithms, with the user
expected to be able to define at a minimum a function specifying the negative
log (unnormalised) density of the distribution of interest.

Further while PyStan, (Num)Pyro and TensorFlow Probability all push the
sampling loop into external compiled non-Python code, in Mici the sampling loop
is run directly within Python. This has the consequence that for small models
in which the negative log density of the target distribution and other model
functions are cheap to evaluate, the interpreter overhead in iterating over the
chains in Python can dominate the computational cost, making sampling much
slower than packages which outsource the sampling loop to a efficient compiled
implementation.

Overview of package

API documentation for the package is available
here. The three main user-facing
modules within the mici package are the systems, integrators and
samplers modules and you will generally need to create an instance of one
class from each module.

mici.systems -
Hamiltonian systems encapsulating model functions and their derivatives

  • EuclideanMetricSystem - systems with a metric on the position space with
    a constant matrix representation,
  • GaussianEuclideanMetricSystem - systems in which the target distribution
    is defined by a density with respect to the standard Gaussian measure on
    the position space allowing analytically solving for flow corresponding to
    the quadratic components of Hamiltonian
    (Shahbaba et al., 2014),
  • RiemannianMetricSystem - systems with a metric on the position space
    with a position-dependent matrix representation
    (Girolami and Calderhead, 2011),
  • SoftAbsRiemannianMetricSystem - system with SoftAbs
    eigenvalue-regularised Hessian of negative log target density as metric
    matrix representation (Betancourt, 2013),
  • DenseConstrainedEuclideanMetricSystem - Euclidean-metric system subject
    to holonomic constraints
    (Hartmann and Schütte, 2005;
    Brubaker, Salzmann and Urtasun, 2012;
    Lelièvre, Rousset and Stoltz, 2018)
    with a dense constraint function Jacobian matrix,

mici.integrators -
symplectic integrators for Hamiltonian dynamics

mici.samplers - MCMC
samplers for peforming inference

  • StaticMetropolisHMC - Static integration time Hamiltonian Monte Carlo
    with Metropolis accept step (Duane et al., 1987),
  • RandomMetropolisHMC - Random integration time Hamiltonian Monte Carlo
    with Metropolis accept step (Mackenzie, 1989),
  • DynamicMultinomialHMC - Dynamic integration time Hamiltonian Monte Carlo
    with multinomial sampling from trajectory
    (Hoffman and Gelman, 2014;
    Betancourt, 2017).

Example: sampling on a torus

torus-samples

A simple complete example of using the package to compute approximate samples
from a distribution on a two-dimensional torus embedded in a three-dimensional
space is given below. The computed samples are visualised in the animation
above. Here we use autograd to automatically construct functions to calculate
the required derivatives (gradient of negative log density of target
distribution and Jacobian of constraint function), sample four chains in
parallel using multiprocess and use matplotlib to plot the samples.

from mici import systems, integrators, samplers
import autograd.numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.animation as animation

# Define fixed model parameters
R = 1.0  # toroidal radius ∈ (0, ∞)
r = 0.5  # poloidal radius ∈ (0, R)
α = 0.9  # density fluctuation amplitude ∈ [0, 1)

# Define constraint function such that the set {q : constr(q) == 0} is a torus
def constr(q):
    x, y, z = q.T
    return np.stack([((x**2 + y**2)**0.5 - R)**2 + z**2 - r**2], -1)

# Define negative log density for the target distribution on torus
# (with respect to 2D 'area' measure for torus)
def neg_log_dens(q):
    x, y, z = q.T
    θ = np.arctan2(y, x)
    ϕ = np.arctan2(z, x / np.cos(θ) - R)
    return np.log1p(r * np.cos(ϕ) / R) - np.log1p(np.sin(4*θ) * np.cos(ϕ) * α)

# Specify constrained Hamiltonian system with default identity metric
system = systems.DenseConstrainedEuclideanMetricSystem(neg_log_dens, constr)

# System is constrained therefore use constrained leapfrog integrator
integrator = integrators.ConstrainedLeapfrogIntegrator(system, step_size=0.2)

# Seed a random number generator
rng = np.random.RandomState(seed=1234)

# Use dynamic integration-time HMC implementation as MCMC sampler
sampler = samplers.DynamicMultinomialHMC(system, integrator, rng)

# Sample initial positions on torus using parameterisation (θ, ϕ) ∈ [0, 2π)²
# x, y, z = (R + r * cos(ϕ)) * cos(θ), (R + r * cos(ϕ)) * sin(θ), r * sin(ϕ)
n_chain = 4
θ_init, ϕ_init = rng.uniform(0, 2 * np.pi, size=(2, n_chain))
q_init = np.stack([
    (R + r * np.cos(ϕ_init)) * np.cos(θ_init),
    (R + r * np.cos(ϕ_init)) * np.sin(θ_init),
    r * np.sin(ϕ_init)], -1)

# Define function to extract variables to trace during sampling
def trace_func(state):
    x, y, z = state.pos
    return {'x': x, 'y': y, 'z': z}

# Sample four chains of 2500 samples in parallel
final_states, traces, stats = sampler.sample_chains(
    n_sample=2500, init_states=q_init, n_process=4, trace_funcs=[trace_func])

# Print average accept probability and number of integrator steps per chain
for c in range(n_chain):
    print(f"Chain {c}:")
    print(f"  Average accept prob. = {stats['accept_prob'][c].mean():.2f}")
    print(f"  Average number steps = {stats['n_step'][c].mean():.1f}")

# Visualise concatentated chain samples as animated 3D scatter plot   
fig = plt.figure(figsize=(4, 4))
ax = Axes3D(fig, [0., 0., 1., 1.], proj_type='ortho')
points_3d, = ax.plot(*(np.concatenate(traces[k]) for k in 'xyz'), '.', ms=0.5)
ax.axis('off')
for set_lim in [ax.set_xlim, ax.set_ylim, ax.set_zlim]:
    set_lim((-1, 1))

def update(i):
    angle = 45 * (np.sin(2 * np.pi * i / 60) + 1)
    ax.view_init(elev=angle, azim=angle)
    return (points_3d,)

anim = animation.FuncAnimation(fig, update, frames=60, interval=100, blit=True)

GitHub