Week 08
MCMC and Convergence diagnostics | Gibbs sampling | Change point detection
Goal: estimate the expectation
[Definition] Monte-Carlo estimator, a random variable
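The estimator's formula is not reproduced in these notes; the standard form (assumed here) for a target expectation I = E_{p(x)}[f(x)] is:

```latex
% Monte-Carlo estimator of I = E_{p(x)}[f(x)] from S i.i.d. samples
\hat{I}_S = \frac{1}{S} \sum_{s=1}^{S} f\big(x^{(s)}\big),
\qquad x^{(s)} \overset{\text{i.i.d.}}{\sim} p(x)
```

It is unbiased, and its variance shrinks as 1/S.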
HOW TO OBTAIN I.I.D. SAMPLES FROM ANY DISTRIBUTION?
Markov Chain Monte Carlo [MCMC]
Design a Markov chain whose invariant (stationary) distribution is the target distribution
In other words, we construct a specific random walk that explores the space of the target distribution, visiting high-density regions more frequently
Goal: quantify the error of MCMC methods to evaluate approximation accuracy
Let's run multiple chains with different initial conditions, then compare them after K iterations.
If they have reached the stationary distribution, their samples should agree, i.e. look like draws from the same distribution.
We define B as the between-chain variance, W as the within-chain variance, and N as the chain length:
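One common form of these quantities and of the resulting statistic, for M chains of length N (an assumption; the course may use a split-R̂ variant), is:

```latex
W = \frac{1}{M}\sum_{m=1}^{M} s_m^{2},
\qquad
B = \frac{N}{M-1}\sum_{m=1}^{M}\big(\bar{\theta}_m - \bar{\theta}\big)^{2},
\qquad
\hat{R} = \sqrt{\frac{\tfrac{N-1}{N}\,W + \tfrac{1}{N}\,B}{W}}
```

where s_m² and θ̄_m are the sample variance and mean of chain m, and θ̄ is the mean over all chains.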
If B = W, Rhat = 1 | If B > W, Rhat > 1 | If Rhat < 1.1, the chains have mixed
In MCMC, samples are highly correlated → take this correlation into account
Estimate the variance based on samples, then quantify the Monte Carlo error
We use the effective sample size (ESS) instead of the raw number of samples S
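A standard definition (assumed here) relates the ESS to the lag-t autocorrelations ρ_t of the chain; the Monte Carlo standard error then uses ESS in place of S:

```latex
\text{ESS} = \frac{S}{1 + 2\sum_{t=1}^{\infty} \rho_t},
\qquad
\widehat{\text{SE}}\big(\hat{I}\big) \approx \sqrt{\frac{\widehat{\operatorname{Var}}[f(x)]}{\text{ESS}}}
```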
With the Metropolis-Hastings algorithm:
We have to choose and tune a proposal distribution
The acceptance rate (fraction of proposed candidates that are accepted) can sometimes be low
GIBBS SAMPLING
= a special case of Metropolis-Hastings in which every proposed candidate is accepted
Idea: at each iteration, update each coordinate z_i of z by sampling from its posterior conditional p(z_i | z_-i), i.e. conditioned on the current values of all other coordinates (see the sketch below)
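A minimal sketch of this loop (sample_conditional is a hypothetical placeholder for the model-specific conditionals derived below):

```python
# Generic Gibbs sampler sketch: z is a length-D state vector and
# sample_conditional(i, z) draws z_i from p(z_i | z_{-i}).
def gibbs(z_init, sample_conditional, num_iters):
    z = list(z_init)
    samples = []
    for _ in range(num_iters):
        for i in range(len(z)):      # update one coordinate at a time
            z[i] = sample_conditional(i, z)
        samples.append(list(z))
    return samples
```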
[METHOD] How to identify posterior conditionals?
Write the log joint density
Identify all quantities that depend on z_i
Identify the posterior conditionals based on the functional form of z_i
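In symbols, this recipe amounts to dropping every term of the log joint that does not involve z_i and matching what remains to a known family:

```latex
\log p(z_i \mid z_{-i}) = \log p(z_1, \dots, z_D) + \text{const}
```

where the constant collects all terms that do not depend on z_i.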
Goal: find the changepoint c, i.e. the index at which the data-generating mechanism changes along the observations
Our model:
→ 3 unknown parameters: c, λ1, λ2
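The model equations are not reproduced here; a standard changepoint model that matches the Gamma(α, β) priors below and the Gibbs derivations that follow is the Poisson changepoint model (this concrete choice is an assumption):

```latex
x_i \mid c, \lambda_1, \lambda_2 \sim
\begin{cases}
\text{Poisson}(\lambda_1) & \text{if } i \le c\\
\text{Poisson}(\lambda_2) & \text{if } i > c
\end{cases}
\qquad
\lambda_1, \lambda_2 \sim \text{Gamma}(\alpha, \beta),
\qquad
c \sim \text{Uniform}\{1, \dots, N\}
```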
ABOUT THE PRIORS
Here: α = β = 1
ABOUT THE LIKELIHOODS
ABOUT THE POSTERIOR
We cannot compute the joint posterior over the three parameters in closed form → GIBBS SAMPLING
SETTING UP A GIBBS SAMPLER
Let's compute the 3 posterior conditional distributions:
Step 1: write the log joint density
Step 2: identify all quantities that depend on each parameter
Deriving the posterior conditional distribution for λ1
Step 3: recognize the functional form of known distributions by comparing coefficients
We follow the same process for the posterior conditional distributions of λ2 and c
→ We cannot match the posterior conditional of c to a known functional form, but since c is an index, i.e. an integer from 1 to N, we can evaluate its unnormalized probability for every value and normalize.
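Under the assumed Poisson–Gamma model above, matching coefficients gives the following conditionals (a sketch, not necessarily the exact form used in the course):

```latex
\lambda_1 \mid x, c \sim \text{Gamma}\Big(\alpha + \sum_{i \le c} x_i,\; \beta + c\Big),
\qquad
\lambda_2 \mid x, c \sim \text{Gamma}\Big(\alpha + \sum_{i > c} x_i,\; \beta + (N - c)\Big)

p(c \mid x, \lambda_1, \lambda_2) \propto
\exp\Big(\sum_{i \le c}\big(x_i \log\lambda_1 - \lambda_1\big)
+ \sum_{i > c}\big(x_i \log\lambda_2 - \lambda_2\big)\Big),
\qquad c \in \{1, \dots, N\}
```

The conditional for c is normalized by summing the right-hand side over all N values of c.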
RUNNING THE GIBBS SAMPLER
Trace plots and value histograms of the three parameters for 4 chains of K = 2000 iterations each, with the first 50% discarded as warm-up
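A minimal Python sketch of such a sampler, under the same assumed Poisson–Gamma model (parameter names and the data file are hypothetical):

```python
import numpy as np

def gibbs_changepoint(x, alpha=1.0, beta=1.0, num_iters=2000, rng=None):
    """Gibbs sampler for the assumed Poisson changepoint model."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    N = len(x)
    idx = np.arange(1, N + 1)                       # candidate changepoints 1..N
    csum = np.concatenate(([0.0], np.cumsum(x)))    # csum[c] = x_1 + ... + x_c
    c = int(rng.integers(1, N + 1))                 # arbitrary valid initialisation
    samples = np.empty((num_iters, 3))
    for t in range(num_iters):
        # lambda1 | rest ~ Gamma(alpha + sum_{i<=c} x_i, rate = beta + c)
        lam1 = rng.gamma(alpha + csum[c], 1.0 / (beta + c))
        # lambda2 | rest ~ Gamma(alpha + sum_{i>c} x_i, rate = beta + (N - c))
        lam2 = rng.gamma(alpha + (csum[N] - csum[c]), 1.0 / (beta + (N - c)))
        # c | rest: evaluate the log unnormalised probability of every index
        logp = (csum[idx] * np.log(lam1) - idx * lam1
                + (csum[N] - csum[idx]) * np.log(lam2) - (N - idx) * lam2)
        logp -= logp.max()                          # stabilise before exponentiating
        p = np.exp(logp)
        c = int(rng.choice(idx, p=p / p.sum()))
        samples[t] = (c, lam1, lam2)
    return samples

# Usage: 4 chains of 2000 iterations, first 50% discarded as warm-up.
# x = np.loadtxt("counts.txt")                      # hypothetical data
# chains = [gibbs_changepoint(x) for _ in range(4)]
# kept = [s[1000:] for s in chains]
```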