A short introduction to Martingales

One way to think about dependent random variables.

Introduction

Martingales are elegant and powerful tools for studying sequences of dependent random variables. The concept originated in gambling, where a gambler can adjust the bet according to previous results.

Martingales are one of the more abstract parts of probability. Which is a pity, since they are also one of the most useful bits of probability. Martingales are one of the simplest models of dependent random variables, and that is why they come up in examples over and over again. If you can identify a stochastic process as a martingale, then you know some rather general things about that process that may lead you to specific answers to your questions.

If you have a stochastic process whose expected value is time-dependent, then you can potentially convert your process into a martingale. All you need to do is subtract the proper conditional expected value. Then you can apply all sorts of martingale theorems (convergence, optional stopping, etc.), which can lead to explicit numerical answers. However, approaching martingales with the attitude that you will calculate things will leave you disappointed. There are very few explicit examples.
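For instance, a simple random walk with upward drift is not a martingale, but subtracting the deterministic drift makes it one. A minimal simulation sketch of this idea (the bias `p_up`, the seeds, and all sample sizes are illustrative choices, not anything canonical):

```python
import random

def biased_walk(n_steps, p_up=0.6, rng=None):
    """Simulate a walk S_n that steps +1 with probability p_up, else -1.
    Its expected value drifts over time: E[S_n] = n * (2*p_up - 1)."""
    rng = rng or random.Random(0)
    s, path = 0, [0]
    for _ in range(n_steps):
        s += 1 if rng.random() < p_up else -1
        path.append(s)
    return path

def compensate(path, p_up=0.6):
    """Subtract the deterministic drift: M_n = S_n - n*(2*p_up - 1).
    The compensated process M_n is a martingale."""
    drift = 2 * p_up - 1
    return [s - n * drift for n, s in enumerate(path)]

# Average M_n over many paths: it should stay near M_0 = 0 for every n,
# whereas the raw walk's average climbs to about 50 * 0.2 = 10.
rng = random.Random(42)
paths = [compensate(biased_walk(50, rng=rng)) for _ in range(5000)]
means = [sum(p[n] for p in paths) / len(paths) for n in range(51)]
print(max(abs(m) for m in means))  # stays small, consistent with E[M_n] = 0
```

Here the drift is a deterministic function of time, so the "conditional expected value" to remove is just $n(2p-1)$; in general it can depend on the whole history.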

You should approach your problem from the point of view that “process $X_t$ is a martingale, so any stopping strategy will not affect its expected value”. This realization is useful for a practitioner, as it stops you from wasting time optimising the expected value.

The maths, unfortunately

We denote our sample space as $S$ (for example $S = \{+1, -1\}$). Let $F_n$ be a filtration of the event space.

Let $X_0, X_1, X_2, \dots$ be a sequence of random variables. We will imagine that we are acquiring information about $S$ in stages. The random variable $X_n$ is what we know at stage $n$. If $Z$ is any random variable, let

$$E\left[Z \mid F_n\right]$$

denote the conditional expectation of $Z$ given all the information that is available to us at the $n$th stage. If you saw all the information you could obtain by stage $n$, and you made a Bayesian update to your probability distribution on $S$ in light of this information, then $E\left[Z \mid F_n\right]$ would represent the expected value of $Z$ with respect to this revised probability.

If we don’t specify otherwise, we assume that the information available at stage $n$ consists precisely of the values $X_0, X_1, \dots, X_n$, so that

$$E\left[Z \mid F_n\right] = E\left[Z \mid X_0, X_1, \dots, X_n\right]$$

However, in some applications one could imagine that other things are known at stage $n$ as well. For example, maybe $X_n$ represents the price of an asset on the $n$th day and $Y_n$ represents the price of asset $Y$ on the $n$th day, and you have access to both the sequence $X_0, X_1, \dots, X_n$ and the sequence $Y_0, Y_1, \dots, Y_n$. Then $E\left[Z \mid F_n\right]$ would be our revised expectation of $Z$ after we have incorporated what we know about both sequences.

We say that the sequence $X_n$ is a martingale if:

  1. $E\left[\mid X_n \mid\right] < \infty$ for all $n$.
  2. $E\left[X_{n+1} \mid F_n\right] = X_n$ for all $n$.

Informally, $X_0, X_1, \ldots$ is a martingale if the following is true: taking into account all the information I have at stage $n$, the conditional expected value of $X_{n+1}$ is just $X_n$. Basically, if your process is a martingale, you can’t know anything about the future other than everything you know in the present moment (and the entire history of the process).

To motivate this definition, imagine that XnX_n represents the price of a stock on day nn. In this context, the martingale condition states informally that “The expected value of the stock tomorrow, given all I know today, is the value of the stock today.” After all, if the stock price today were 50 and I expected it to be 60 tomorrow, then I would have an easy way to make money in expectation (buy today, sell tomorrow). But if the public had the same information I had, then other investors would also try to cash in on this by buying the stock today at 50, and people holding the stock would be reluctant to sell for 50. Indeed, we’d expect the price to be quickly bid up to about 60 today.

Example

Let $S = \{+1, -1\}$ and let $X_0, X_1, X_2, \dots$ be a sequence of i.i.d. random variables taking values in $S$: each $X_n$ is $+1$ with probability 0.5 and $-1$ with probability 0.5. We define:

$$M_n = \sum_{i=0}^n X_i$$

$M_n$ is a martingale since:

$$E\left[M_{n+1} \mid F_n\right] = E\left[M_n + X_{n+1} \mid F_n\right] = E\left[M_n \mid F_n\right] + E\left[X_{n+1} \mid F_n\right]$$

Since $M_n$ is known at stage $n$, we have $E\left[M_n \mid F_n\right] = M_n$. Since we know nothing more about $X_{n+1}$ at stage $n$ than we originally knew, we have $E\left[X_{n+1} \mid F_n\right] = 0$. So

$$E\left[M_{n+1} \mid F_n\right] = M_n$$

So the sequence $M_n$ is a martingale.
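A quick way to see this empirically: simulate many walks, bucket them by their value at stage $n$, and check that the average value at stage $n+1$ in each bucket matches. A minimal sketch (since this walk happens to be Markov, conditioning on $M_{10}$ alone captures all the relevant information; the step index, seed, and sample sizes are arbitrary choices):

```python
import random
from collections import defaultdict

def walk(n, rng):
    """Simple symmetric random walk: partial sums of i.i.d. +/-1 steps."""
    m, path = 0, [0]
    for _ in range(n):
        m += rng.choice((-1, 1))
        path.append(m)
    return path

# Empirically check E[M_{n+1} | M_n = m] = m at n = 10: bucket paths by
# their value at step 10, then average the value at step 11 in each bucket.
rng = random.Random(1)
buckets = defaultdict(list)
for _ in range(20000):
    p = walk(11, rng)
    buckets[p[10]].append(p[11])

for m, nxt in sorted(buckets.items()):
    if len(nxt) > 500:  # only report well-populated buckets
        print(m, sum(nxt) / len(nxt))  # the average should be close to m
```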

Stopping times, and the Optional Stopping theorem

A stopping time is a (non-negative integer-valued) random variable $T$ such that, for all $n$, the event $T = n$ depends only on the information available to us at time $n$. A stopping time can be interpreted as the time at which you sell an asset, given that the sequence of prices is $X_0, X_1, X_2, \dots$

Saying that $T$ is a stopping time means that the decision to sell at time $n$ depends only on the information we have up to time $n$, and not on future prices. Specifying a stopping time can be interpreted as specifying a strategy for deciding when to sell the asset.

For example, let $X_1, X_2, \dots$ be i.i.d. random variables equal to $-1$ with probability .5 and $+1$ with probability .5, and let $M_0 = 0$ and $M_n = \sum_{i=1}^n X_i$ for $n \geq 1$. These four statements:

  1. The smallest $T$ for which $\mid M_T \mid = 50$
  2. The smallest $T$ for which $M_T \in \{-30, 100\}$
  3. The smallest $T$ for which $M_T = 17$.
  4. The $T$ at which the $M_n$ sequence achieves the value 17 for the 9th time.

all define stopping times.
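Stopping times like the second one are easy to express in code: the decision made at step $n$ only inspects the path up to step $n$, never the future. A sketch (the `hitting_time` helper, the seed, and the barrier values are illustrative choices):

```python
import random

def hitting_time(path, levels):
    """First index n with path[n] in `levels`. This is a stopping time:
    the decision at step n inspects only path[0..n], never the future."""
    for n, m in enumerate(path):
        if m in levels:
            return n
    return None  # the path never hit the levels

# Simulate a symmetric random walk until it first exits (-30, 100).
rng = random.Random(7)
m, path = 0, [0]
while m not in (-30, 100):
    m += rng.choice((-1, 1))
    path.append(m)

T = hitting_time(path, {-30, 100})
print(T, path[T])  # path[T] is either -30 or 100
```

A rule like “sell one step before the walk hits 100” is *not* a stopping time: deciding it at time $n$ would require knowing the value at time $n+1$.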

Optional Stopping Theorem

The optional stopping theorem says that the expected value of a martingale at a stopping time is equal to its initial expected value. It tells us that you can’t make money (in expectation) by buying and selling an asset whose price is a martingale. Precisely, if you buy the asset at some time and adopt any strategy at all for deciding when to sell it, then the expected price at the time you sell is the price you originally paid. In other words, if the market price is a martingale, you cannot make money in expectation by “timing the market.”

Definition (Boundedness). We say a random sequence $X_0, X_1, X_2, \dots$ is bounded if there exists some $C > 0$ such that, with probability one, $\mid X_n \mid \leq C$ for all $n \geq 0$. A stopping time $T$ is bounded if there exists some $C > 0$ such that $T \leq C$ with probability one.

Optional Stopping Theorem (first version): Suppose that $X_0$ is a known constant, that $X_0, X_1, X_2, \dots$ is a bounded martingale, and that $T$ is a stopping time that is finite with probability one. Then $E\left[X_T\right] = X_0$.

Optional Stopping Theorem (second version): Suppose that $X_0$ is a known constant, that $X_0, X_1, X_2, \dots$ is a martingale, and that $T$ is a bounded stopping time. Then $E\left[X_T\right] = X_0$.

These boundedness assumptions are actually very important. Without them the theorem would not be true.

For a counterexample, recall that if $X_0 = 0$ and $X_n$ goes up or down by 1 at each time step (each with probability .5), then $X_0, X_1, X_2, \dots$ is a martingale. If we let $T$ be the first $n$ for which $X_n = 100$, then $T$ is finite with probability one. (That is, with probability one, $X_n$ reaches 100 eventually.) But then $X_T$ is always 100, which means that $E\left[X_T\right] \neq X_0$.
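By contrast, stopping the walk at the first exit from a finite interval keeps the stopped process bounded, so the theorem does apply. A simulation sketch (the barriers $-10$ and $20$, the seed, and the trial count are arbitrary choices) showing both $E[M_T] = 0$ and the gambler’s-ruin probability it implies:

```python
import random

def run_until_exit(a, b, rng):
    """Symmetric random walk from 0, stopped at the first exit from (-a, b).
    The stopped walk is bounded, so optional stopping gives E[M_T] = 0."""
    m = 0
    while -a < m < b:
        m += rng.choice((-1, 1))
    return m

rng = random.Random(3)
a, b, trials = 10, 20, 20000
results = [run_until_exit(a, b, rng) for _ in range(trials)]

print(sum(results) / trials)  # close to 0, matching E[M_T] = M_0 = 0
# E[M_T] = 0 forces P(hit b first) * b - (1 - P(hit b first)) * a = 0,
# i.e. P(hit b first) = a / (a + b).
p_b = sum(r == b for r in results) / trials
print(p_b)  # close to a/(a+b) = 1/3
```

This is the usual way the theorem yields numbers: not by computing expectations directly, but by letting $E[M_T] = M_0$ pin down an unknown probability.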

Not all martingales are Markov chains

Markov chains have a finite memory; martingales can have an infinite one.

Pick a random value for $X_0$. Let the sequence of random variables $\{\epsilon_n, n > 0\}$ be i.i.d. with mean $E[\epsilon_n] = 0$ and independent of $X_0$. The process governed by

$$X_{n+1} = X_n + \epsilon_{n+1} X_0$$

is a martingale since

$$E[X_{n+1} \mid X_0, \ldots, X_n] = E[X_n \mid X_0, \ldots, X_n] + E[\epsilon_{n+1} X_0 \mid X_0, \ldots, X_n] = X_n + E[\epsilon_{n+1} \mid X_0, \ldots, X_n] \, E[X_0 \mid X_0, \ldots, X_n] = X_n + E[\epsilon_{n+1}] X_0 = X_n + 0 \cdot X_0 = X_n.$$

Here, it is key that $\epsilon_{n+1}$ is independent of $\{X_i, 0 \leq i \leq n\}$.

The process is not Markov, since $\Pr[X_{n+1} = x \mid X_n] \neq \Pr[X_{n+1} = x \mid X_0, \ldots, X_n]$. To determine $X_{n+1}$, not only the value of $X_n$ but the entire path leading to it (in particular, the value $X_0$ at which the path started) is needed.
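Both facts can be checked by simulation. A sketch, assuming for concreteness that $X_0$ is drawn uniformly from $\{1, 3\}$ and that $\epsilon_n = \pm 1$ with equal probability (both arbitrary choices consistent with the assumptions above):

```python
import random

def path(n, rng):
    """X_{n+1} = X_n + eps_{n+1} * X_0, with eps_n i.i.d. +/-1 (mean 0)
    and X_0 drawn uniformly from {1, 3} (an arbitrary choice)."""
    x0 = rng.choice((1, 3))
    x, xs = x0, [x0]
    for _ in range(n):
        x += rng.choice((-1, 1)) * x0
        xs.append(x)
    return xs

rng = random.Random(5)
paths = [path(20, rng) for _ in range(20000)]

# Martingale check: E[X_n] should stay at E[X_0] = 2 for every n.
for n in (0, 5, 20):
    print(n, sum(p[n] for p in paths) / len(paths))  # each close to 2

# Not Markov: condition on X_10 = 9 (a value reachable from both starting
# values) and look at the step that follows. |X_11 - X_10| always equals
# X_0, so the law of the next step depends on where the path started,
# not just on the current value X_10.
sizes = {abs(p[11] - p[10]) for p in paths if p[10] == 9}
print(sizes)  # both 1 and 3 occur
```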