# Getting Started With pymcmcstat

Author(s): Paul Miles | Date Created: August 31, 2018

# Introduction

pymcmcstat is a Python package for running Markov Chain Monte Carlo (MCMC) simulations. It supports several Metropolis-based sampling techniques:

• Metropolis-Hastings (MH): Primary sampling method.
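• Delayed Rejection (DR): When a candidate is rejected, tries again with a candidate drawn from a narrower proposal distribution before rejecting the step. Supports n-stage delayed rejection.
• Delayed Rejection Adaptive Metropolis (DRAM): Combines DR with Adaptive Metropolis (AM), which adapts the proposal covariance using the chain history.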
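
To make the primary sampling step concrete, here is a minimal random-walk Metropolis sketch in plain NumPy. It is not pymcmcstat's implementation: the function name `metropolis`, the standard-normal target, and the fixed step size are all assumptions chosen for illustration, and neither delayed rejection nor adaptation is shown.

```python
import numpy as np

def metropolis(log_post, theta0, n_samples, step, rng):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    theta = np.asarray(theta0, dtype=float)
    current = log_post(theta)
    chain = np.empty((n_samples, theta.size))
    accepted = 0
    for i in range(n_samples):
        # Propose a candidate by perturbing the current point.
        proposal = theta + step * rng.standard_normal(theta.size)
        candidate = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
            accepted += 1
        chain[i] = theta  # a rejected step repeats the current point
    return chain, accepted / n_samples

def log_post(theta):
    # Hypothetical target: standard normal log-density up to a constant.
    return -0.5 * float(np.sum(theta ** 2))

rng = np.random.default_rng(0)
chain, rate = metropolis(log_post, [3.0], 5000, 1.0, rng)
print(f"acceptance rate: {rate:.2f}")
```

Because the proposal is symmetric, the acceptance test only needs the difference of log-posterior values; DR and AM modify how the proposal is generated after a rejection and over time, respectively.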

The pymcmcstat package is a Python implementation of the MATLAB toolbox mcmcstat. The user interface is designed to be as similar to the MATLAB version as possible, but this implementation uses data structures that are better suited to Python.

Please see the pymcmcstat homepage for more details about the development of this Python package.

# Installation

The source code is hosted on the GitHub project page. The package is available on PyPI, and the latest release can be installed with:

pip install pymcmcstat


The master branch on GitHub typically matches the latest release on PyPI. To install the master branch directly from GitHub:

pip install git+https://github.com/prmiles/pymcmcstat.git


You can also clone the repository and run python setup.py install.
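
After installing, one way to confirm that the package is visible to your interpreter is to query its installed version. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of pymcmcstat.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Prints a version string if pymcmcstat is installed, otherwise None.
print(installed_version("pymcmcstat"))
```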

# General Examples

pymcmcstat has many built-in features that let you tailor it to your particular problem. The examples below outline these features.

## Monod

Key Features:

• Basic MCMC settings.
• Data structure initialization.
• Constructing the initial parameter covariance matrix using scipy.optimize.leastsq.
• Chain/Pairwise-correlation panels.
• Credible interval generation and plotting.
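
As a rough illustration of credible interval generation, the sketch below evaluates a Monod-type model at every row of a chain and takes pointwise percentiles. It uses plain NumPy rather than pymcmcstat's own interval routines; the function names, the synthetic stand-in chain, and the parameter values are assumptions made for the example.

```python
import numpy as np

def monod(theta, x):
    # Monod model: saturating response theta[0] * x / (theta[1] + x).
    return theta[0] * x / (theta[1] + x)

def credible_interval(chain, x, model, level=0.95):
    """Pointwise credible band from the model evaluated at each chain row."""
    responses = np.array([model(theta, x) for theta in chain])
    tail = 100.0 * (1.0 - level) / 2.0
    lower, median, upper = np.percentile(
        responses, [tail, 50.0, 100.0 - tail], axis=0
    )
    return lower, median, upper

rng = np.random.default_rng(1)
# Stand-in for a sampled chain; in practice this comes from the MCMC run.
chain = np.column_stack(
    [rng.normal(0.15, 0.01, 2000), rng.normal(50.0, 5.0, 2000)]
)
x = np.linspace(0.0, 400.0, 50)
lower, median, upper = credible_interval(chain, x, monod)
```

A credible interval reflects parameter uncertainty only; a prediction interval would additionally account for observation error, which this sketch does not attempt.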

## Beetle

Key Features:

• Sending objects within MCMC data structure.
• Managing objects within sum-of-squares evaluation.
• Chain/Pairwise-correlation panels.
• Credible interval generation and plotting.

## Banana

Key Features:

• Sending class objects in MCMC data structure.
• Defining parameter covariance matrix.
• Pairwise correlation and generation of ellipse contours.

## Algae

Key Features:

• Using multiple data sets.
• Solving a system of ODEs as the model response.
• Chain/Density/Pairwise-correlation panels.
• Generating prediction/credible intervals for multiple quantities of interest.

## Viscoelasticity

Key Features:

• Loading data from *.mat file.
• Calling a C++ model using the ctypes package.
• Specifying model parameters to be included in the sampling chain.
• Plotting prediction/credible intervals with respect to time or deformation.

## Landau Energy

Key Features:

• Evaluating multidimensional functions (3-D polarization space).
• Loading data from *.mat file.
• Specifying model parameters to be included in the sampling chain.
• Specifying number of observations.
• Enhanced visualization using mcmcplot.
• Plotting prediction/credible intervals.

Key Features:

• Embedding user defined objects in the data structure.
• Enhanced visualization using mcmcplot.
• Specifying model parameters to be included in the sampling chain.

## Running Parallel Chains

Key Features:

• Running multiple chains simultaneously.
• Using Gelman-Rubin chain diagnostics.
• Enhanced visualization using mcmcplot.
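
For orientation, the Gelman-Rubin diagnostic compares between-chain and within-chain variance for a single parameter. The following is a textbook-style sketch in NumPy, not pymcmcstat's routine, which may differ in details; the function name `gelman_rubin` and the synthetic chains are assumptions.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor for chains of shape (m, n)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(2)
mixed = rng.standard_normal((4, 2000))       # four chains, same target
shifted = mixed + np.arange(4)[:, None]      # chains stuck at different means
r_mixed = gelman_rubin(mixed)
r_shifted = gelman_rubin(shifted)
print(r_mixed, r_shifted)
```

Values close to 1 suggest the chains agree with each other; values well above 1 suggest they have not yet converged to a common distribution.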

The following tutorials address specific features of the package.

## Using Chain Log Files

Key Features:

• Saving chain logs in binary and text formats.
• Assessing log history to ascertain status of simulation.

## Restarting Simulations From Chain Log Files

Key Features:

• Saving chain logs in text format and metadata to .json.
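
The general idea of restarting from saved output can be sketched with NumPy and the standard library: write the chain to a text file and the metadata to JSON, then reload the last sample as the new starting point. The file names, metadata keys, and helper functions below are hypothetical and do not reflect the log layout pymcmcstat actually writes.

```python
import json
import tempfile
from pathlib import Path

import numpy as np

def save_chain(folder, chain, meta):
    # Hypothetical layout: one text file for samples, one JSON file for metadata.
    folder = Path(folder)
    np.savetxt(folder / "chain.txt", chain)
    (folder / "meta.json").write_text(json.dumps(meta))

def load_restart_point(folder):
    # Use the last saved sample as the initial value for a new run.
    folder = Path(folder)
    chain = np.loadtxt(folder / "chain.txt", ndmin=2)
    meta = json.loads((folder / "meta.json").read_text())
    return chain[-1], meta

chain = np.array([[0.10, 48.0], [0.12, 51.0], [0.15, 50.0]])
with tempfile.TemporaryDirectory() as tmp:
    save_chain(tmp, chain, {"nsimu": 3, "names": ["mu", "K"]})
    theta0, meta = load_restart_point(tmp)
```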

## Setting the RNG Seed

Key Features:

• Set seed for random number generator within pymcmcstat.
• Produce repeatable simulation results.
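
The effect of fixing a seed can be seen with NumPy alone. This sketch does not show how pymcmcstat itself receives the seed; `random_walk` is a hypothetical stand-in for any stochastic simulation.

```python
import numpy as np

def random_walk(seed, n_steps=5):
    # Hypothetical stand-in for a stochastic simulation.
    rng = np.random.default_rng(seed)
    return np.cumsum(rng.standard_normal(n_steps))

first = random_walk(seed=1234)
second = random_walk(seed=1234)   # same seed, same sequence
other = random_walk(seed=4321)    # different seed, different sequence
```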

## Calling Models Written in C++

Key Features:

• Call arbitrarily complex models written in other languages (e.g., C++) using the ctypes package.
• Generating credible/prediction intervals using C++ based model.

## Specifying Sample Variables

Key Features:

• Specify which model parameters should be included in sampling chain.

## Estimating Error Variance for Multiple Data Sets

Key Features:

• Setting up multiple data sets in the MCMC data structure.
• Defining a sum-of-squares function to accommodate multiple data sets.
• Estimating a separate observation error variance for each data set.
• Plotting prediction/credible intervals for each data set.
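
A sum-of-squares function for multiple data sets typically returns one error value per data set so that a separate observation error variance can be associated with each. The sketch below shows that shape in plain NumPy; the argument layout (lists of arrays) and the linear model are assumptions for illustration, not the signature pymcmcstat expects.

```python
import numpy as np

def sum_of_squares(theta, xdata, ydata):
    """Return one sum-of-squares error per data set.

    xdata and ydata are lists of arrays, one entry per data set.
    """
    slope, intercept = theta
    ss = np.zeros(len(xdata))
    for i, (x, y) in enumerate(zip(xdata, ydata)):
        residual = y - (slope * x + intercept)  # hypothetical linear model
        ss[i] = np.sum(residual ** 2)
    return ss

xdata = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 1.0])]
ydata = [np.array([1.0, 3.0, 5.0]), np.array([2.0, 3.0])]
ss = sum_of_squares((2.0, 1.0), xdata, ydata)
print(ss)
```

The data sets may have different lengths, which is why each residual is reduced to a scalar before being stored.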

## Using Normal Prior Distributions

Key Features:

• Enforcing normally distributed prior functions.
• Defining non-linear parameter constraints via custom prior functions.
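
One common way to express a normal prior is as a sum of squared standardized deviations, which equals minus twice the log-prior up to a constant. The sketch below assumes that convention and a `(theta, mu, sigma)` argument order; both are assumptions here, as is the example constraint, so check the package documentation for the form pymcmcstat expects.

```python
import numpy as np

def gaussian_prior(theta, mu, sigma):
    # Sum of squared standardized deviations from the prior mean.
    return float(np.sum(((np.asarray(theta) - mu) / sigma) ** 2))

def constrained_prior(theta, mu, sigma):
    # Hypothetical nonlinear constraint: require theta[0] * theta[1] <= 1.
    # Returning infinity gives the point zero prior probability.
    if theta[0] * theta[1] > 1.0:
        return np.inf
    return gaussian_prior(theta, mu, sigma)

mu = np.array([0.0, 2.0])
sigma = np.array([1.0, 0.5])
inside = constrained_prior([0.5, 1.0], mu, sigma)
outside = constrained_prior([1.0, 2.0], mu, sigma)
```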