\documentclass{article}
\usepackage{ifthen}
\begin{document}
\newcounter{OldSection}
\newcounter{ParCount}
\newcommand{\para}{
\vspace{.4cm}
\ifthenelse { \value{OldSection} < \value{section} }
{ \setcounter{OldSection}{ \value{section} }
\setcounter{ParCount}{ 0 } }
{}
\stepcounter{ParCount}
\noindent
\bf \arabic{section}.\arabic{ParCount}. \rm \hspace{.2cm}
}
\Large \begin{center}
Stochastic Calculus Notes, Lecture 3 \\
\normalsize
Last modified \today
\end{center} \normalsize
\section{Martingales and stopping times}
\para Introduction:
Martingales and stopping times are important technical tools used in the
study of stochastic processes such as Markov chains and diffusions.
A {\em martingale} is a stochastic process that is always unpredictable in
the sense that $E[F_{t+t^{\prime}} \mid {\cal F}_t] = F_t$ (see below)
if $t^{\prime}> 0$.
A {\em stopping time} is a random ``time'', $\tau(\omega)$, such that
we know at time $t$ whether to stop, i.e.\ the event
$\left\{\tau \leq t\right\}$ is in ${\cal F}_t$.
These tools work well together because a martingale stopped at a
stopping time retains the martingale property: if $t \leq \tau \leq t^{\prime}$,
then $E\left[F_{\tau} \mid {\cal F}_t\right] = F_t$.
A central fact about the Ito calculus is that Ito integrals with respect
to Brownian motion are martingales.
\para Stochastic processes:
Here is a more abstract definition of a discrete time
{\em stochastic process}.
We have a probability space, $\Omega$.
The information available at time $t$ is represented by the algebra of
events ${\cal F}_t$.
We assume that for each $t$, ${\cal F}_t \subset {\cal F}_{t+1}$;
since we are supposed to gain information going from $t$ to $t+1$,
every known event in ${\cal F}_t$ is also known at time $t+1$.
A stochastic process is a family of random variables, $X_t(\omega)$,
with $X_t \in {\cal F}_t$ ($X_t$ measurable with respect to
${\cal F}_t$).
Sometimes it happens that the random variables $X_t$
contain all the information in the ${\cal F}_t$ in the sense that
${\cal F}_t$ is generated by $X_1$, $\ldots$, $X_t$.
This is the {\em minimal algebra} in which the $X_t$ form a stochastic process.
In other cases ${\cal F}_t$ contains more information.
Economists use these possibilities when they distinguish between the
``weak efficient market hypothesis'' (the ${\cal F}_t$ are minimal),
and the ``strong hypothesis'' (${\cal F}_t$ contains all the public information
in the world, literally).
In the case of minimal ${\cal F}_t$, it may be possible to identify the
outcome, $\omega$, with the path $X = X_1,\ldots,X_T$.
This is less common when the ${\cal F}_t$ are not minimal because the
extra information may have to do with processes other than $X_t$.
For the definition of stochastic process, the probabilities are not important,
just the algebras of sets and ``random variables'' $X_t$.
An expanding family of $\sigma$-algebras
${\cal F}_t \subseteq {\cal F}_{t+1}$ is a {\em filtration}.
\para Notation:
The value of a stochastic process at time $t$ may be written $X_t$ or $X(t)$.
The subscript notation reminds us that the $X_t$ are a family of functions
of the random outcome (random variable) $\omega$.
In practical contexts, particularly in discussing multidimensional processes
($X(t) \in R^n$), we prefer $X(t)$ so that $X_k(t)$ can represent the
$k^{\mbox{th}}$ component of $X(t)$.
When the process is a martingale, we often call it $F_t$.
This will allow us to let $X(t)$ be a Markov chain and $F_t(X)$
a martingale function of $X$.
\para Example 1, Markov chains:
In this example, the ${\cal F}_t$ are minimal and $\Omega$ is the
path space of sequences of length $T$ from the state space, $\cal S$.
The new information revealed at time $t$ is the state of the chain
at time $t$.
The variables $X_t$ may be called ``coordinate functions'' because
$X_t$ is coordinate $t$ (or entry $t$) in the sequence $X$. In
principle, we could express this with the notation $X_t(X)$, but
that would drive people crazy. Although
we distinguish between Markov chains (discrete time) and Markov
processes (continuous time), the term ``stochastic process'' can
refer to either continuous or discrete time.
\para Example 2, dyadic sets:
This is a set of definitions for discussing averages over a range of
length scales.
The ``time'' variable, $t$, represents the amount of averaging that has
been done.
The new information revealed at time $t$ is finer scale information
about a function (an audio signal or digital image).
The state space is the positive integers from $1$ to $2^T$.
We start with a function $X(\omega)$ and ask that $X_t(\omega)$ be constant
on {\em dyadic blocks} of length $2^{T-t}$.
The dyadic blocks at level $t$ are
\begin{equation}
B_{t,k} = \left\{1+(k-1)2^{T-t}, 2+(k-1)2^{T-t}, \ldots, k2^{T-t}\right\} \; .
\label{DB} \end{equation}
The reader should check that moving from level $t$ to level $t+1$
splits each block into right and left halves:
\begin{equation}
B_{t,k} = B_{t+1,2k-1} \cup B_{t+1,2k} \;.
\label{twoB} \end{equation}
The algebras ${\cal F}_t$ are generated by the block partitions
$$
{\cal P}_t = \left\{B_{t,k}\mbox{ with }k=1,\ldots,2^{t}\right\} \;.
$$
(There are $2^t$ blocks at level $t$, each of length $2^{T-t}$.)
Because ${\cal F}_t \subset {\cal F}_{t+1}$, ${\cal P}_{t+1}$
is a refinement of ${\cal P}_t$.
The union (\ref{twoB}) shows how.
We will return to this example after discussing martingales.
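As a concrete check (an illustrative sketch of our own, not part of the original notes; the helper name block is ours), the definition (\ref{DB}) and the splitting identity (\ref{twoB}) can be verified by brute force on a small example:

```python
# Dyadic blocks from the notes: B_{t,k} = {1+(k-1)2^(T-t), ..., k 2^(T-t)}.
def block(T, t, k):
    size = 2 ** (T - t)
    return set(range(1 + (k - 1) * size, k * size + 1))

T = 4
for t in range(T):
    # at level t the blocks B_{t,1}, ..., B_{t,2^t} partition {1, ..., 2^T}
    for k in range(1, 2 ** t + 1):
        # each block splits into its left and right halves at level t+1
        assert block(T, t, k) == block(T, t + 1, 2 * k - 1) | block(T, t + 1, 2 * k)
```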
\para Martingales:
A real valued stochastic process, $F_t$, is a
martingale\footnote{For finite $\Omega$ this is the whole story. For
countable $\Omega$ we also assume that the sums defining $E[X_t]$
converge absolutely, i.e.\ that $E[\left|X_t\right|] < \infty$.
This implies that the conditional expectations $E[X_{t+1} \mid {\cal F}_t]$
are well defined.}
if
$$
E[F_{t+1} \mid {\cal F}_t] = F_t \; .
$$
If we take the overall expectation of both sides we see that
the expectation value does not depend on $t$, $E[F_{t+1}]=E[F_t]$.
The martingale property says more: whatever information you might
have at time $t$, the expected value of future
values is still the present value.
There is a gambling interpretation: $F_t$ is the amount of money you
have at time $t$. No matter what has happened, your expected winnings
between $t$ and $t+1$, the ``martingale difference''
$Y_{t+1} = F_{t+1}-F_t$, have zero expected value. You can also think
of martingale differences as a generalization of independent random
variables. If the random variables $Y_t$ were actually independent
with mean zero, then the sums $F_t = \sum_{k=1}^t Y_k$ would form a
martingale (using the ${\cal F}_t$ generated by $Y_1$, $\ldots$, $Y_t$).
The reader should check this.
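The reader's check can also be done exhaustively on a small finite $\Omega$. The following sketch (ours, with fair coin flips) enumerates all paths and verifies the martingale condition by classical conditioning:

```python
from itertools import product

# Exhaustive check that sums of independent mean-zero variables form a
# martingale.  Omega is the set of all fair coin flip sequences
# Y_t = +/-1 of length T, each with probability 2^(-T);
# F_t = Y_1 + ... + Y_t.
T = 4
paths = list(product([-1, 1], repeat=T))

for t in range(1, T):
    # conditioning on F_t means fixing the prefix (Y_1, ..., Y_t)
    for prefix in product([-1, 1], repeat=t):
        cond = [p for p in paths if p[:t] == prefix]
        F_t = sum(prefix)
        E_next = sum(sum(p[:t + 1]) for p in cond) / len(cond)
        assert E_next == F_t  # E[F_{t+1} | F_t] = F_t
```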
\para Examples:
The simplest way to get a martingale is to start with a random variable,
$F(\omega)$, and define $F_t = E[F\mid {\cal F}_t]$.
If we apply this to a Markov chain with the minimal filtration ${\cal F}_t$,
and $F$ is a final time reward $F = V(X(T))$, then $F_t = f(X(t),t)$ as
in the previous lecture.
If we apply this to $\Omega = \left\{1,2,\ldots,2^T\right\}$,
with uniform probability $P(k) = 2^{-T}$ for $k\in\Omega$, and the
dyadic filtration, we get the dyadic martingale with $F_t(j)$ constant on the
dyadic blocks (\ref{DB}) and equal to the average of $F$ over the block
containing $j$.
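A small numerical sketch of this construction (our own illustration; the function $F$ below is an arbitrary choice) computes the block averages directly and confirms that $E[F_t] = E[F]$ for every level:

```python
# The dyadic martingale: F_t averages F over the dyadic block containing
# each point, so F_t is constant on blocks of length 2^(T-t).
T = 3
N = 2 ** T
F = [float(j * j) for j in range(1, N + 1)]  # any function on {1,...,2^T}

def F_t(t):
    size = 2 ** (T - t)
    out = []
    for k in range(2 ** t):
        block = F[k * size:(k + 1) * size]
        out.extend([sum(block) / size] * size)  # block average, repeated
    return out

# under the uniform probability P(k) = 2^(-T), E[F_t] = E[F] for all t
for t in range(T + 1):
    assert abs(sum(F_t(t)) / N - sum(F) / N) < 1e-9
```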
\para A lemma on conditional expectation:
In working with martingales we often make use of a basic lemma about
conditional expectation. Suppose $U(\omega)$ and $Y(\omega)$ are real
valued random variables and that $U\in \cal F$.
Then
\begin{equation}
E[UY\mid{\cal F}] = UE[Y\mid{\cal F}] \; .
\label{UY} \end{equation}
We see this using classical conditional expectation over the sets in
the partition defining $\cal F$.
Let $B$ be one of these sets.
Let $y_B=E[Y \mid \omega \in B]$ be the value of $E[Y\mid {\cal F}]$
for $\omega \in B$.
We know that $U(\omega)$ is constant in $B$ because $U\in \cal F$.
Call this value $u_B$.
Then $E[UY\mid B] = u_BE[Y\mid B] = u_B y_B$.
But this is the value of $UE[Y\mid {\cal F}]$ for $\omega\in B$.
Since each $\omega$ is in some $B$, this proves (\ref{UY}) for all $\omega$.
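The proof translates directly into a finite-$\Omega$ computation. Here is a sketch (ours; the particular partition and values of $U$, $Y$ are arbitrary illustrations) that follows the argument block by block:

```python
# Check of the lemma E[UY|F] = U E[Y|F] on a finite Omega.
# F is generated by a partition; U in F means U is constant on each block.
omega = [0, 1, 2, 3]                     # uniform probability
partition = [[0, 1], [2, 3]]             # blocks generating F
U = {0: 2.0, 1: 2.0, 2: -1.0, 3: -1.0}   # constant on each block
Y = {0: 5.0, 1: 1.0, 2: 4.0, 3: 0.0}     # arbitrary random variable

for B in partition:
    y_B = sum(Y[w] for w in B) / len(B)   # value of E[Y|F] on B
    u_B = U[B[0]]                         # the constant value of U on B
    E_UY = sum(U[w] * Y[w] for w in B) / len(B)
    assert E_UY == u_B * y_B              # classical conditional expectation
```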
\para Doob's principle:
This lemma lets us make new martingales from old ones.
Let $F_t$ be a martingale and $Y_t = F_t - F_{t-1}$ the
{\em martingale differences} (called {\em innovations} by statisticians
and {\em returns} in finance).
We use the convention that $F_{-1} = 0$ so that $F_0 = Y_0$.
The martingale condition is that $E[Y_{t+1} \mid {\cal F}_t] = 0$.
Clearly $F_t = \sum_{t^{\prime} = 0}^t Y_{t^{\prime}}$.
Suppose that at time $t$ we are allowed to place a bet of any
size\footnote{We may have to require that the bet have finite
expected value.}
on the as yet unknown martingale difference, $Y_{t+1}$.
Let $U_t \in {\cal F}_t$ be the size of the bet.
The return from betting on $Y_t$ will be $U_{t-1}Y_t$,
and the total accumulated return up to time $t$ is
\begin{equation}
G_t = U_0 Y_1 + U_1 Y_2 + \cdots + U_{t-1}Y_t \; .
\label{DM} \end{equation}
Because of the lemma (\ref{UY}), the betting returns have
$E[U_{t}Y_{t+1}\mid {\cal F}_t]=0$, so $E[G_{t+1}\mid {\cal F}_t] = G_t$
and $G_t$ also is a martingale.
The fact that $G_t$ in (\ref{DM}) is a martingale sometimes is called
{\em Doob's principle} or {\em Doob's theorem} after the probabilist
who formulated it.
A special case below for stopping times is {\em Doob's stopping time theorem}
or the {\em optional stopping theorem}.
They all say that strategizing on a martingale never produces anything
but a martingale.
Nonanticipating strategies on martingales do not give positive expected
returns.
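An exhaustive check of Doob's principle (our own sketch, not from the notes) makes the point vivid: even the classic ``double after a loss'' betting system, which is nonanticipating, leaves the accumulated return (\ref{DM}) with expected value exactly zero on a fair game:

```python
from itertools import product

# Doob's principle: a nonanticipating betting strategy on a martingale
# yields accumulated returns G_t that are again a martingale (mean zero).
T = 5
paths = list(product([-1, 1], repeat=T))  # fair coin flips Y_t = +/-1

def G(path):
    bet, total = 1.0, 0.0
    for y in path:
        total += bet * y                   # adds U_{t-1} * Y_t
        bet = 2 * bet if y == -1 else 1.0  # double the stake after a loss
    return total

# average over all 2^T equally likely paths: E[G_T] = 0
assert sum(G(p) for p in paths) / len(paths) == 0.0
```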
\para Weak and strong efficient market hypotheses:
It is possible that the random variables $F_t$ form a martingale with
respect to their minimal filtration, ${\cal F}_t$, but not with respect
to an enriched filtration ${\cal G}_t \supset {\cal F}_t$.
The simplest example would be the algebras ${\cal G}_t = {\cal F}_{t+1}$,
which already know the value of $F_{t+1}$ at time $t$.
Note that the $F_t$ also are a stochastic process with respect to the
${\cal G}_t$.
The {\em weak efficient market hypothesis} is that $e^{-\mu t}S_t$
is a martingale ($S_t$ being the stock price and $\mu$ its expected
growth rate) with respect to its minimal filtration.
{\em Technical analysis} means using trading strategies that are
nonanticipating with respect to the minimal filtration.
Therefore, the weak efficient market hypothesis says that technical
trading does not produce better returns than buy and hold.
Any extra information you might get by examining the price history
of $S$ up to time $t$ is already known by enough people that it is
already reflected in the price $S_t$.
The {\em strong} efficient market hypothesis states that $e^{-\mu t}S_t$
is a martingale with respect to the filtration, ${\cal G}_t$, representing
all the public information in the world.
This includes the previous price history of $S$ and much more (prices of
related stocks, corporate reports, market trends, etc.).
\para Investing with Doob:
Economists sometimes use Doob's principle and the efficient market hypotheses
to make a point about active trading in the stock market.
Suppose that $F_t$, the price of a stock at time $t$, is a
martingale\footnote{This is a reasonable approximation for much
short term trading.}.
Suppose that at time $t$ we use all the information in ${\cal F}_t$ to
choose an amount, $U_t$, to invest.
The fact that the resulting accumulated return, $G_t$, has zero expected value
is said to show that active investing is no better than a
``buy and hold'' strategy that just produces the value $F_t$.
The well known book {\bf A Random Walk Down Wall Street} is mostly an
exposition of this point of view.
This argument breaks down when applied to non martingale processes, such
as stock prices over longer times.
Active trading strategies such as (\ref{DM}) may reduce the
risk by more than enough to compensate risk averse investors for small
amounts of lost expected value.
Merton's optimal dynamic investment analysis is a simple example of
an active trading strategy that is better for some people than
passive buy and hold.
\para Stopping times:
We have $\Omega$ and the expanding family ${\cal F}_t$. A stopping time
is a function $\tau(\omega)$ that is one of the times $1$, $\ldots$, $T$,
so that the event $\{\tau \leq t\}$ is in ${\cal F}_t$. Stopping times
might be thought of as possible strategies. Whatever your criterion for
stopping is, you have enough information at time $t$ to know whether you
should stop at time $t$. Many stopping times are expressed as the first
time something happens, such as the first time $X_t > a$. We cannot ask
to stop, for example, at the last $t$ with $X_t>a$ because we might not
know at time $t$ whether $X_{t^{\prime}}>a$ for some $t^{\prime}>t$.
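The defining property can be checked mechanically on finite paths. This sketch (ours) verifies that for a first passage time the event $\{\tau \leq t\}$ is determined by the first $t$ steps alone, i.e.\ lies in ${\cal F}_t$:

```python
from itertools import product

# First passage times are stopping times: tau = first t with X_t >= 1.
T = 5
paths = list(product([-1, 1], repeat=T))  # +/-1 random walk increments

def first_hit(path):
    x = 0
    for t, y in enumerate(path, start=1):
        x += y
        if x >= 1:
            return t
    return T + 1  # the walk never reached 1

for t in range(1, T + 1):
    # paths sharing the first t steps must agree on whether tau <= t
    for prefix in product([-1, 1], repeat=t):
        group = [p for p in paths if p[:t] == prefix]
        assert len({first_hit(p) <= t for p in group}) == 1
```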
\para Doob's stopping time theorem for one stopping time:
Because stopping times are nonanticipating strategies, they also cannot
make money from a martingale. One version of this statement is that
$E[X_{\tau}] = E[X_1]$. The proof of this makes use of the events
$B_t$, that $\tau = t$. The stopping time hypothesis is that
$B_t \in {\cal F}_t$. Since $\tau$ has some value $1\leq \tau \leq T$,
the $B_t$ form a partition of $\Omega$. Also, if $\omega \in B_t$,
$\tau(\omega) = t$, so $X_{\tau} = X_t$. Therefore,
\begin{eqnarray*}
E[X_1] & = & E[X_T] \\
& = & \sum_{t=1}^T E[X_T \mid B_t ] P(B_t) \\
& = & \sum_{t=1}^T E[X_{\tau} \mid B_t] P(\tau = t) \\
& = & E[X_{\tau}] \; .
\end{eqnarray*}
In this derivation we made use of the classical statement of the martingale
property: if $B\in {\cal F}_t$ then $E[X_T\mid B] = E[X_t \mid B]$. In our
case $B=B_t$, and on $B_t$ we have $X_t = X_{\tau}$.
This simple idea, using the martingale property applied to the partition
$B_t$, is crucial for much of the theory of martingales. The idea itself
was first used by Kolmogorov in the context of random walk and Brownian motion.
Doob realized that Kolmogorov's argument was even simpler and more beautiful when
applied to martingales.
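The partition argument can be confirmed exhaustively for a bounded stopping time. A sketch (ours; the stopping rule is an arbitrary illustration) with a fair $\pm 1$ walk stopped at the first visit to $+2$ or at the final time:

```python
from itertools import product

# Doob's stopping time theorem: for a bounded stopping time tau on a
# martingale, E[X_tau] = E[X_1].  Here tau = min(first t with X_t = 2, T).
T = 6
paths = list(product([-1, 1], repeat=T))  # fair +/-1 increments

def X_tau(path):
    x = 0
    for t, y in enumerate(path, start=1):
        x += y
        if x == 2 or t == T:  # stop at +2 or at the final time
            return x

# E[X_tau] over all 2^T equally likely paths equals E[X_1] = 0
assert sum(X_tau(p) for p in paths) == 0
```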
\para Stopping time paradox:
The technical hypotheses above, finite state space and bounded stopping times,
may be too strong, but they cannot be completely ignored, as this famous
example shows. Let $X_t$ be a symmetric random walk starting at zero.
This forms a martingale, so $E[X_{\tau}] = 0$ for any stopping time, $\tau$.
On the other hand, suppose we take $\tau = \min(t\mid X_t =1)$. Then
$X_{\tau}=1$ always, so $E[X_{\tau}]=1$. The catch is that there is no
$T$ with $\tau(\omega) \leq T$ for all $\omega$. Even though
$\tau < \infty$ ``almost surely'' (more to come on that expression),
$E[\tau] = \infty$ (explanation later). Even that would be OK if the
possible values of $X_t$ were bounded. Suppose you choose $T$
and set $\tau^{\prime} = \min(\tau,T)$. That is, you wait until $X_t=1$
or $t=T$, whichever comes first, to stop. For large $T$, it is very
likely that you stopped because $X_t=1$. Still, those paths that never reached
1 probably drifted just far enough in the negative direction so that their
contribution to the overall expected value cancels the 1 to yield
$E[X_{\tau^{\prime}}] = 0$.
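This cancellation can be seen exactly by enumerating all paths of moderate length (our own illustrative sketch): most paths do stop at $+1$, yet the few that never reach it drift far enough negative that the expectation is exactly zero.

```python
from itertools import product

# The stopping time paradox with tau' = min(tau, T), where
# tau = first t with X_t = 1, for a fair +/-1 walk started at zero.
T = 10
paths = list(product([-1, 1], repeat=T))

def stopped_value(path):
    x = 0
    for y in path:
        x += y
        if x == 1:
            return 1          # stopped at the first visit to +1
    return x                  # never reached +1 before time T

vals = [stopped_value(p) for p in paths]
assert sum(vals) == 0                    # E[X_{tau'}] = 0 exactly
assert vals.count(1) / len(vals) > 0.7   # yet most paths stop at +1
```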
\para More stopping time theorems:
Suppose we have an increasing family of stopping times,
$1 \leq \tau_1 \leq \tau_2 \cdots$. In a natural way the random variables
$Y_1 = X_{\tau_1}$, $Y_2 = X_{\tau_2}$, etc.\ also form a martingale.
This is a final, more elaborate way of saying that strategizing on a martingale
is a no win game.
\end{document}