\documentclass{article}
\usepackage{ifthen}
\usepackage{graphicx}
\begin{document}
\newcounter{OldSection}
\newcounter{ParCount}
\newcommand{\para}{
\vspace{.4cm}
\ifthenelse { \value{OldSection} < \value{section} }
{ \setcounter{OldSection}{ \value{section} }
\setcounter{ParCount}{ 0 } }
{}
\stepcounter{ParCount}
\noindent
\bf \arabic{section}.\arabic{ParCount}. \rm \hspace{.2cm}
}
\Large \begin{center}
Stochastic Calculus Notes, Lecture 5 \\
\normalsize
Last modified \today
\end{center} \normalsize
\section{Integrals involving Brownian motion}
\para Introduction:
There are two kinds of integrals involving Brownian motion,
{\em time integrals} and {\em Ito integrals}.
The time integral, which is discussed here, is just the ordinary Riemann
integral of a continuous but random function of $t$ with respect to $t$.
Such integrals define stochastic processes that satisfy interesting
backward equations.
On the one hand, this allows us to compute the expected value of the
integral by solving a partial differential equation.
On the other hand, we may find the solution of the partial differential
equation by computing the expected value by Monte Carlo, for example.
The {\em Feynman Kac} formula is one of the examples in this section.
\para The integral of Brownian motion:
Consider the random variable, where $X(t)$ continues to be standard
Brownian motion,
\begin{equation}
Y = \int_0^T X(t) dt \; .
\label{Yint} \end{equation}
We expect $Y$ to be Gaussian because the integral is a linear functional
of the (Gaussian) Brownian motion path $X$.
Because $X(t)$ is a continuous function of $t$, this is a standard
Riemann integral.
The Riemann sum approximations converge.
As usual, for $n>0$ we define $\Delta t = T/n$ and $t_k = k\Delta t$.
The Riemann sum approximation is
\begin{equation}
Y_n = \Delta t \sum_{k=0}^{n-1} X(t_k) \; ,
\label{Yn} \end{equation}
and $Y_n \rightarrow Y$ as $n \rightarrow \infty$ because $X(t)$ is a
continuous function of $t$.
The $n$ summands in (\ref{Yn}), $X(t_k)$, form an $n$
dimensional multivariate normal, so each of the $Y_n$ is normal.
It would be surprising if $Y$, as the limit of Gaussians, were not Gaussian.
\para The variance of $Y$:
We will start the hard way, computing the variance from (\ref{Yn})
and letting $\Delta t \rightarrow 0$.
The trick is to use two summation variables
$Y_n = \Delta t \sum_{k=0}^{n-1} X(t_k)$ and
$Y_n = \Delta t \sum_{j=0}^{n-1} X(t_j)$.
It is immediate from (\ref{Yn}) that $E[Y_n] = 0$ and
$\mbox{var}(Y_n) = E[Y_n^2]$:
\begin{eqnarray*}
E[Y_n^2 ] & = & E[Y_n \cdot Y_n] \\
& = & E\left[ \left( \Delta t \sum_{k=0}^{n-1} X(t_k)\right) \cdot
\left( \Delta t \sum_{j=0}^{n-1} X(t_j)\right) \right] \\
& = & \Delta t^2 \sum_{jk} E[X(t_k)X(t_j)] \; .
\end{eqnarray*}
If we now let $\Delta t \rightarrow 0$, the left side converges to
$E[Y^2]$ and the right side converges to a double integral:
\begin{equation}
E[Y^2] = \int_{s=0}^T \int_{t=0}^T E[X(t) X(s)] dsdt \; .
\label{varInt} \end{equation}
We can find the needed $E[X(t) X(s)]$ if $s > t$ by writing
$X(s) = X(t) + \Delta X$ with $\Delta X$ independent of $X(t)$, so
\begin{eqnarray*}
E[X(t)X(s)] & = & E[X(t) (X(t) + \Delta X)] \\
& = & E[X(t) X(t)] \\
& = & t \; .
\end{eqnarray*}
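The double-sum formula above can be compared with a direct Monte Carlo estimate as a numerical sanity check. The sketch below (the values of $T$, $n$, and the sample count are arbitrary choices) uses $E[X(t_k)X(t_j)] = \min(t_k,t_j)$, which follows from the calculation here together with the symmetry between $s$ and $t$:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 50
dt = T / n
t = dt * np.arange(n)                        # t_0, ..., t_{n-1}

# exact: var(Y_n) = dt^2 * sum_{j,k} E[X(t_j) X(t_k)] = dt^2 * sum_{j,k} min(t_j, t_k)
exact = dt**2 * np.minimum.outer(t, t).sum()

# Monte Carlo: Brownian paths built from independent N(0, dt) increments
m = 100_000
incs = rng.normal(0.0, np.sqrt(dt), size=(m, n))
X = np.cumsum(incs, axis=1) - incs           # X(t_k), with X(t_0) = 0
Y = dt * X.sum(axis=1)                       # the Riemann sum Y_n for each path
mc = Y.var()

print(exact, mc)                             # both near T^3/3 for large n
```

Both numbers approach $T^3/3$ as $n \to \infty$.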
A variation of this argument gives $E[X(t)X(s)] = s$ if $s < t$, so that
in general $E[X(t)X(s)] = \min(s,t)$.
Substituting this into (\ref{varInt}) gives
$$
E[Y^2] = \int_0^T \int_0^T \min(s,t) \, ds \, dt = \frac{T^3}{3} \; .
$$
\para A martingale:
The process $F(t) = \frac{1}{3}X(t)^3 - \int_0^t X(s) ds$ is a martingale,
as we now verify.
Fix $t_1$ and, for $t>t_1$, write $X(t) = X(t_1) + \Delta X(t)$.
Then
\begin{eqnarray*}
E\left[\int_0^{t_2} X(t) dt \mid {\cal F}_{t_1} \right]
& = &
E\left[\left( \int_0^{t_1} X(t)dt +
\int_{t_1}^{t_2}X(t) dt \right) \Bigm| {\cal F}_{t_1} \right] \\
& = &
E\left[\int_0^{t_1} X(t) dt \Bigm| {\cal F}_{t_1} \right] \ +
E\left[ \int_{t_1}^{t_2}
\left( X(t_1) + \Delta X(t) \right) dt \Bigm| {\cal F}_{t_1}\right] \\
& = &
\int_0^{t_1} X(t) dt + ( t_2 - t_1 ) X(t_1) \; .
\end{eqnarray*}
In the last line we use the facts that $X(t) \in {\cal F}_{t_1}$ when
$t < t_1$, and $X(t_1) \in {\cal F}_{t_1}$,
and that $E[\Delta X(t) \mid {\cal F}_{t_1} ] = 0$ when $t > t_1$,
which is part of the independent increments property.
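The identity just derived can be checked by simulation: freeze one sample path on $[0,t_1]$, then average the time integral over many independent continuations on $[t_1,t_2]$. This is only an illustrative sketch; the values of $t_1$, $t_2$, the step count, and the sample count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
t1, t2, n = 1.0, 2.0, 200                       # n time steps per unit time
dt = 1.0 / n

# one fixed path on [0, t1]
inc1 = rng.normal(0.0, np.sqrt(dt), size=int(t1 * n))
X1 = np.concatenate([[0.0], np.cumsum(inc1)])   # X at 0, dt, ..., t1
int_0_t1 = dt * X1[:-1].sum()                   # left Riemann sum of int_0^{t1} X dt
predicted = int_0_t1 + (t2 - t1) * X1[-1]       # the conditional expectation formula

# many independent continuations on [t1, t2]
m = 50_000
inc2 = rng.normal(0.0, np.sqrt(dt), size=(m, int((t2 - t1) * n)))
X2 = X1[-1] + np.cumsum(inc2, axis=1)           # X at t1 + dt, ..., t2
int_t1_t2 = dt * (X1[-1] + X2[:, :-1].sum(axis=1))
estimate = int_0_t1 + int_t1_t2.mean()

print(predicted, estimate)
```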
For the $X(t)^3$ part, we have,
\begin{eqnarray*}
\lefteqn{ E\left[ \left( X(t_1) + \Delta X(t_2) \right)^3
\mid {\cal F}_{t_1} \right] } \\
& &
= E\left[ X(t_1)^3
+ 3 X(t_1)^2 \Delta X(t_2)
+ 3 X(t_1) \Delta X(t_2)^2
+ \Delta X(t_2)^3 \mid {\cal F}_{t_1} \right]\\
& &
=X(t_1)^3 + 3X(t_1)^2 \cdot 0 + 3X(t_1) E[\Delta X(t_2)^2 \mid {\cal F}_{t_1} ]
+ 0 \\
& &
=X(t_1)^3 + 3(t_2 - t_1) X(t_1) \; .
\end{eqnarray*}
In the last line we used the independent increments property to get
$E[\Delta X(t_2) \mid {\cal F}_{t_1}] = 0$, and the formula for
the variance of the increment to get
$E[\Delta X(t_2)^2 \mid {\cal F}_{t_1}] = t_2 - t_1$.
Combining the two calculations above shows that
$E[F(t_2) \mid {\cal F}_{t_1}] = F(t_1)$, which is
the martingale property.
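The conditional expectation of the cube can be checked the same way, by sampling the increment $\Delta X \sim N(0, t_2-t_1)$ with the value of $X(t_1)$ held fixed (the particular values of $x$, $t_1$, $t_2$ below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
x = 0.7                                         # a hypothetical observed value of X(t_1)
t1, t2 = 0.5, 1.25
m = 1_000_000
dX = rng.normal(0.0, np.sqrt(t2 - t1), size=m)  # Delta X ~ N(0, t_2 - t_1)
mc = np.mean((x + dX)**3)                       # E[(X(t_1) + Delta X)^3 | F_{t_1}]
predicted = x**3 + 3 * (t2 - t1) * x            # the formula derived above
print(mc, predicted)
```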
\para Backward equations for expected values of integrals:
Many integrals involving Brownian motion arise in applications
and may be ``solved'' using backward equations.
One example is $F=\int_0^T V(X(t)) dt$, which represents the total
accumulated $V(X)$ over a Brownian motion path.
If $V(x)$ is a continuous function of $x$, the integral is a standard
Riemann integral, because $V(X(t))$ is a continuous function of $t$.
We can calculate $E[F]$, using the more general function
\begin{equation}
f(x,t) = E_{x,t}\left[\int_t^T V(X(s)) ds \right] \; .
\label{fRep} \end{equation}
As before, we can describe the function $f(x,t)$ in terms of the
random variable
$$
F(t) = E\left[\int_t^T V(X(s)) ds \mid {\cal F}_t \right] \; .
$$
Since $F(t)$ is measurable in ${\cal F}_t$ and depends only on future
values ($X(s)$ with $s>t$), $F(t)$ is measurable in ${\cal G}_t$.
Since ${\cal G}_t$ is generated by $X(t)$ alone, this means that
$F(t)$ is a function of $X(t)$, which we write as $F(t) = f(X(t),t)$.
Of course, this definition is just a restatement of the definition (\ref{fRep}).
Once we know $f(x,t)$, we can plug in $t=0$ to get
$E[F] = F(0) = f(x_0,0)$ if $X(0) = x_0$ is known.
Otherwise, $E[F] = E[f(X(0),0)]$.
The backward equation for $f$ is
\begin{equation}
\partial_t f + \frac{1}{2} \partial_x^2 f + V(x) = 0 \; ,
\label{fV} \end{equation}
with final conditions $f(x,T) = 0$.
The derivation is similar to the
one we used before for the backward equation for $E_{x,t}[V(X_T)]$.
We use Taylor series and the tower property to calculate how $f$
changes over a small time increment, $\Delta t$.
We start with
$$
\int_t^T V(X(s)) ds = \int_t^{t+\Delta t} V(X(s)) ds
+ \int_{t+\Delta t}^T V(X(s)) ds \; ,
$$
take the $x,t$ expectation, and use (\ref{fRep}) to get
\begin{equation}
f(x,t) = E_{x,t}\left[\int_t^{t+\Delta t} V(X(s)) ds \Bigm| {\cal F}_t\right]
+ E_{x,t}\left[\int_{t+\Delta t}^T V(X(s)) ds \Bigm| {\cal F}_t\right] \; .
\label{fSep} \end{equation}
The first integral on the right has the value $V(x)\Delta t + o(\Delta t)$.
We write $o(\Delta t)$ for a quantity that is smaller than $\Delta t$
in the sense that $o(\Delta t)/\Delta t \rightarrow 0$ as
$\Delta t \rightarrow 0$ (we will shortly divide by $\Delta t$,
take the limit $\Delta t \rightarrow 0$, and neglect all $o(\Delta t)$
terms.). For the second term, we have
$$
E\left[ \int_{t+\Delta t}^T V(X(s)) ds \mid {\cal F}_{t + \Delta t} \right]
= F(t+\Delta t) = f(X(t+\Delta t), t+\Delta t) \; .
$$
Writing $X(t+\Delta t) = X(t) + \Delta X$, we use the tower property with
${\cal F}_t \subset {\cal F}_{t+\Delta t}$ to get
$$
E\left[ \int_{t+\Delta t}^T V(X(s)) ds \mid {\cal F}_t \right]
= E\left[ f(X_t + \Delta X, t+\Delta t) \mid {\cal F}_t \right] \; .
$$
As before, we Taylor expand and then take the conditional expectation, getting first
$$
f(x+\Delta X, t+\Delta t) = f(x,t) + \Delta t \partial_t f(x,t)
+ \Delta X \partial_x f(x,t)
+ \frac{1}{2}\Delta X^2 \partial_x^2 f(x,t) + o(\Delta t) \; ,
$$
then
$$
E_{x,t}\left[ f(x+\Delta X, t + \Delta t) \right] =
f(x,t) + \Delta t \partial_t f(x,t) +
\frac{1}{2}\Delta t \partial_x^2 f(x,t) + o(\Delta t) \; .
$$
Putting all this back into (\ref{fSep}) gives
$$
f(x,t) = \Delta t V(x) + f(x,t) + \Delta t \partial_t f(x,t) +
\frac{1}{2}\Delta t \partial_x^2 f(x,t) + o(\Delta t) \; .
$$
Now just cancel $f(x,t)$ from both sides and let $\Delta t \rightarrow 0$
to get the promised equation (\ref{fV}).
\para Application of PDE:
Most commonly, we cannot evaluate either the expected value (\ref{fRep})
or the solution of the partial differential equation (PDE) (\ref{fV}).
How does the PDE represent progress toward evaluating $f$? One way is
by suggesting a completely different computational procedure. If we
work only from the definition (\ref{fRep}), we would use Monte Carlo
for numerical evaluation. Monte Carlo is notoriously slow and inaccurate.
There are several techniques for finding the solution of a PDE that
avoid Monte Carlo, including finite difference methods, finite element
methods, spectral methods, and trees. When such deterministic methods
are practical, they generally are more reliable, more accurate, and faster.
In financial applications, we are often able to find PDEs for quantities
that have no simple Monte Carlo probabilistic definition. Many such
examples are related to optimization problems: maximizing an expected
return or minimizing uncertainty with dynamic trading strategies in a
randomly evolving market. The Black Scholes evaluation of the value of
an American style option is a well known example.
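The comparison described here can be sketched numerically. The code below solves (\ref{fV}) by explicit finite differences marching backward from $t=T$ and also estimates (\ref{fRep}) by Monte Carlo; the potential $V(x)=x^2$, the domain, the grid sizes, and the sample counts are all arbitrary illustrative choices (for this $V$, $f(0,0) = T^2/2$ exactly, which both methods should reproduce):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1.0

def V(x):
    return x**2                            # arbitrary illustrative potential

# --- explicit finite differences, marching backward from t = T ---
nx = 241
x = np.linspace(-6.0, 6.0, nx)             # wide enough that the boundary
dx = x[1] - x[0]                           # barely matters at x = 0
dt = 0.4 * dx**2                           # explicit stability restriction
nt = int(np.ceil(T / dt)); dt = T / nt
f = np.zeros(nx)                           # final condition f(x, T) = 0
for _ in range(nt):
    fxx = np.zeros(nx)
    fxx[1:-1] = (f[2:] - 2*f[1:-1] + f[:-2]) / dx**2
    f = f + dt * (0.5 * fxx + V(x))        # f(t - dt) = f(t) + dt ((1/2) f_xx + V)
    f[0] = f[-1] = 0.0                     # crude absorbing boundary
fd_value = f[nx // 2]                      # approximates f(0, 0)

# --- Monte Carlo with a Riemann sum along each path ---
m, n = 50_000, 100
h = T / n
W = np.cumsum(rng.normal(0.0, np.sqrt(h), size=(m, n)), axis=1)
paths = np.hstack([np.zeros((m, 1)), W[:, :-1]])   # X at 0, h, ..., T - h
mc_value = (h * V(paths).sum(axis=1)).mean()

print(fd_value, mc_value)                  # both near T**2 / 2
```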
\para The Feynman Kac formula:
Consider
\begin{equation}
F = E\left[ \exp\left( \int_0^T V(X(t)) dt\right)\right] \; .
\label{FFK} \end{equation}
As before, we evaluate $F$ using the related and more refined quantity
\begin{equation}
f(x,t) = E_{x,t}\left[ e^{\int_t^T V(X(s))ds} \right] \; ,
\label{FKf} \end{equation}
which satisfies the backward equation
\begin{equation}
\partial_t f+ \frac{1}{2} \partial_x^2 f + V(x) f = 0 \; .
\label{FKe} \end{equation}
When someone refers to the {\em Feynman Kac formula}, they usually are
referring to the fact that (\ref{FKf}) is a formula for the solution of
the PDE (\ref{FKe}). In our work, the situation mostly will be reversed.
We use the PDE (\ref{FKe}) to get information about the quantity defined
by (\ref{FKf}) or even just about the process $X(t)$.
We can verify that (\ref{FKf}) satisfies (\ref{FKe}) more or less as in the
preceding paragraph. We note that
\begin{eqnarray*}
\lefteqn{
\exp\left\{\int_t^{t+\Delta t} V(X(s)) ds+
\int_{t+\Delta t}^T V(X(s))ds \right\}
} \\
& & =
\exp\left\{\int_t^{t+\Delta t} V(X(s)) ds\right\} \cdot
\exp \left\{ \int_{t+\Delta t}^T V(X(s))ds \right\} \\
& & =
\left( 1 + \Delta t V(X(t)) + o(\Delta t) \right) \cdot
\exp \left\{ \int_{t+\Delta t}^T V(X(s))ds \right\}
\end{eqnarray*}
The conditional expectation of the right side with respect to
${\cal F}_{t+\Delta t}$ is
$$
\left( 1+\Delta t V(X_t) + o(\Delta t) \right) \cdot
f(X(t) + \Delta X, t+\Delta t) \; .
$$
When we now take expectation with respect to ${\cal F}_t$, which amounts
to averaging over $\Delta X$, using Taylor expansion of $f$ about $f(x,t)$
as before, we get (\ref{FKe}).
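The expectation (\ref{FFK}) can be evaluated by Monte Carlo, with the exponent approximated by a Riemann sum along each path. As a check, the sketch below takes $V(x) = -x^2/2$, for which the Cameron--Martin formula gives the closed form $F = (\cosh T)^{-1/2}$; the values of $T$, $n$, and the sample count are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
T, n, m = 1.0, 200, 50_000
h = T / n
inc = rng.normal(0.0, np.sqrt(h), size=(m, n))
X = np.cumsum(inc, axis=1) - inc            # X at t_0 = 0, ..., t_{n-1}
exponent = h * (-0.5 * X**2).sum(axis=1)    # Riemann sum for int_0^T V(X(t)) dt
F_mc = np.exp(exponent).mean()
F_exact = 1.0 / np.sqrt(np.cosh(T))         # Cameron-Martin closed form for this V
print(F_mc, F_exact)
```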
\para The Feynman integral:
A precursor to the Feynman Kac formula is the {\em Feynman
integral}\footnote{The American Physicist Richard Feynman was born and
raised in Far Rockaway (a neighborhood of Queens, New York).
He is the author of several wonderful popular books, including
{\em Surely You're Joking, Mr.\ Feynman} and {\em The Feynman Lectures on
Physics}.}
solution to the Schr\"odinger equation.
The Feynman integral is not an integral in the sense of measure theory.
(Neither is the Ito integral, for that matter.)
The colorful probabilist Marc Kac (pronounced ``Katz'') discovered that an
actual integral over Wiener measure (\ref{FKf}) gives the solution of
(\ref{FKe}). Feynman's reasoning will help us derive the Girsanov formula,
so we pause to sketch it.
The finite difference approximation
\begin{equation}
\int_0^T V(X(t))dt \approx \Delta t \sum_{k=0}^{n-1} V(X(t_k)) \; ,
\label{Vdt} \end{equation}
(always $\Delta t = T/n$, $t_k = k\Delta t$) leads to an approximation to
$F$ of the form
\begin{equation}
F_n = E\left[ \exp \left( \Delta t \sum_{k=0}^{n-1} V(X(t_k))\right) \right] \; .
\label{Fn} \end{equation}
The functional $F_n$ depends only on finitely many values $X_k = X(t_k)$,
so we may evaluate (\ref{Fn}) using the known joint density function for
$\vec{X} = (X_1,\ldots,X_n)$.
The density is (see ``Path probabilities'' from Lecture 5):
$$
U^{(n)}(\vec{x}) = \frac{1}{(2\pi\Delta t)^{n/2}}
\exp\left( -\sum_{k=0}^{n-1} (x_{k+1}-x_k)^2/2\Delta t \right) \; .
$$
It is suggestive to rewrite this as
\begin{equation}
U^{(n)}(\vec{x}) = \frac{1}{(2\pi\Delta t)^{n/2}}
\exp\left[ -\frac{\Delta t}{2}\sum_{k=0}^{n-1}
\left( \frac{x_{k+1}-x_k}{\Delta t}\right)^2 \right] \; .
\label{xDen} \end{equation}
Using this to evaluate $F_n$ gives
\begin{equation}
F_n = \frac{1}{(2\pi\Delta t)^{n/2}} \int_{R^n}
\exp\left[\Delta t \sum_{k=0}^{n-1} V(x_k)
-\frac{\Delta t}{2}\sum_{k=0}^{n-1}
\left( \frac{x_{k+1}-x_k}{\Delta t}\right)^2 \right] d\vec{x} \; .
\label{FnUdx} \end{equation}
It is easy to show that $F_n \to F$ as $n \to \infty$ as long as $V(x)$ is,
say, continuous and bounded (see below).
Feynman proposed a view of $F = \lim_{n \to \infty}F_n$ in (\ref{FnUdx})
that is not mathematically rigorous but explains ``what's going on''.
If $x_k \approx x(t_k)$, then we should have
$$
\Delta t \sum_{k=0}^{n-1} V(x_k) \to \int_{t=0}^T V(x(t))dt \; .
$$
Also,
$$
\left( \frac{x_{k+1}-x_k}{\Delta t}\right) \approx
\frac{dx}{dt} = \dot{x}(t_k) \; ,
$$
so we should also have
$$
\frac{\Delta t}{2}\sum_{k=0}^{n-1}
\left( \frac{x_{k+1}-x_k}{\Delta t}\right)^2 \to
\int_0^T \dot{x}(t)^2 dt \; .
$$
As $n \to \infty$, the integral over $R^n$ should converge to the integral
over all ``paths'' $x(t)$. We denote this by $\cal P$ without worrying
about exactly which paths are allowed (continuous, differentiable, ...?).
The integration element $d\vec{x}$ has the possible formal limit
$$
d\vec{x} = \prod_{k=0}^{n-1} dx_k = \prod_{k=0}^{n-1} dx(t_k) \to
\prod_{t=0}^T dx(t) \; .
$$
Altogether, this gives the formal expression for the limit of (\ref{FnUdx}):
\begin{equation}
F = \mbox{\em const} \int_{\cal P} \exp\left( \int_0^T V(x(t))dt -
{\textstyle \frac{1}{2}} \int_0^T \dot{x}(t)^2 dt \right)
\prod_{t=0}^T dx(t) \; .
\label{FPI} \end{equation}
\para Feynman and Wiener integration:
Mathematicians were quick to complain about (\ref{FPI}).
For one thing, the constant
$\mbox{\em const} = \lim_{n \to\infty} (2\pi\Delta t)^{-n/2}$
should be infinite.
More seriously, there is no abstract integral measure corresponding
to $\int_{\cal P} \prod_{t=0}^T dx(t)$ (it is possible to prove this).
Kac proposed to write (\ref{FPI}) as
$$
F = \int_{\cal P} \exp\left( \int_0^T V(x(t))dt\right)
\left[\mbox{\em const} \cdot \exp \left( -
{\textstyle \frac{1}{2}} \int_0^T \dot{x}(t)^2 dt \right)
\prod_{t=0}^T dx(t) \right]
$$
and then interpret the latter part as Wiener measure ($dP$):
\begin{equation}
\mbox{\em const} \cdot
\exp \left( -{\textstyle \frac{1}{2}} \int_0^T \dot{x}(t)^2 dt \right)
\prod_{t=0}^T dx(t) = dP(X) \; .
\label{FiWm} \end{equation}
In fact, we have already argued informally (and it can be formalized)
that
$$
U^{(n)}(\vec{x})\prod_{k=0}^{n-1} dx_k \to dP(X)
\;\; \mbox{as} \;\; n \to \infty \; .
$$
These intuitive but mathematically nonsensical formulas are a great help
in understanding Brownian motion.
For one thing, (\ref{FiWm}) makes clear that Wiener measure is Gaussian.
Its density has the form $\mbox{\em const}\cdot \exp(-Q(x))$, where $Q(x)$
is a positive quadratic function of $x$.
Here $Q(x) = \frac{1}{2}\int_0^T \dot{x}(t)^2 dt$ (and the constant is, alas, infinite).
Moreover, in many cases it is possible to approximate integrals of the
form $\int \exp(\phi(\vec{x}))d\vec{x}$ by $e^{\phi_*}$, where
$\phi_* = \max_{\vec{x}} \phi(\vec{x})$ if $\phi$ is sharply
peaked around its maximum.
This is particularly common in ``rare event'' or ``large deviation'' problems.
In our case, this would lead us to solve the {\em calculus of variations}
problem
$$
\max_x \left( \int_0^T V(x(t)) dt -
\textstyle{ \frac{1}{2}} \int_0^T \dot{x}(t)^2 dt \right)
\; .
$$
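A one-dimensional illustration of this ``sharp peak'' (Laplace) approximation: for the arbitrary quadratic $\phi(x) = -a(x-1)^2 + \phi_*$ with $a$ large, $\log \int e^{\phi(x)}\,dx = \phi_* + \frac{1}{2}\log(\pi/a)$, so the logarithm of the integral agrees with $\phi_*$ up to a relatively small correction:

```python
import numpy as np

a = 400.0                                   # peakedness parameter (arbitrary)
phi_star = 50.0                             # the maximum value of phi (arbitrary)

def phi(x):
    return -a * (x - 1.0)**2 + phi_star     # sharply peaked at x = 1

x = np.linspace(-2.0, 4.0, 200_001)
dx = x[1] - x[0]
integral = np.exp(phi(x)).sum() * dx        # simple Riemann-sum quadrature
print(np.log(integral), phi_star)           # differ by (1/2) log(pi/a)
```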
\para Application of Feynman Kac:
The problem of evaluating
$$
f = E\left[ \exp \left( \int_0^T V(X(t)) dt \right) \right]
$$
arises in many situations. In finance, $f$ could represent the
present value of a payment in the future subject to unknown fluctuating
interest rates. The PDE (\ref{FKe}) provides a possible way to evaluate
$f = f(x_0,0)$ when $X(0) = x_0$, either analytically or numerically.
\section{Mathematical formalism}
\para Introduction:
We examine the solution formulas for the backward and forward equation from
two points of view. The first is an analogy with linear algebra,
with {\em function spaces} playing the role of vector spaces and {\em operators}
playing the role of matrices. The second is a more physical picture,
interpreting $G(x,y,t)$ as the {\em Green's function} describing the
forward diffusion of a point mass of probability or the backward diffusion
of a localized unit of payout.
\para Solution operator:
As time moves forward, the probability density for $X_t$ changes, or
{\em evolves}. As time moves backward, the value function $f(x,t)$ also
evolves\footnote{Unlike biological evolution, this evolution process
makes the solution less complicated, not more.}.
The backward evolution process is given by (for $s>0$; this is a consequence
of the tower property)
\begin{equation}
f(x,t-s) = \int G(x,y,s) f(y,t) dy \; .
\label{BEvFun} \end{equation}
We write this abstractly as $f(t-s) = G(s) f(t)$.
This formula is analogous to the comparable Markov chain formula
$\displaystyle f(t-s) = P^s f(t)$. In the Markov chain case, $s$ and $t$ are
integers and $f(t)$ represents a vector in $R^n$ whose components are $f_k(t)$.
Here, $f(t)$ is a function of $x$ whose values are $f(x,t)$.
We can think of $P^s$ as an $n \times n$ matrix or as the {\em linear operator}
that transforms the vector $f$ to the vector $g=P^sf$.
Similarly, $G(s)$ is a linear operator, transforming a function $f$ into
$g$, with
$$
g(x) = \int_{-\infty}^{\infty} G(x,y,s) f(y) dy \; .
$$
The operation is {\em linear}, which means that
$G(af^{(1)} + bf^{(2)}) = aGf^{(1)} + bGf^{(2)}$.
The family of operators $G(s)$ for $s>0$ produces the solution to the
backward equation, so we call $G(s)$ the {\em solution operator} for time $s$.
\para Duhamel's principle:
The {\em inhomogeneous} backward equation
\begin{equation}
\partial_t f + \frac{1}{2} \partial_x^2 f + V(x,t) = 0 \; ,
\label{BEin} \end{equation}
with {\em homogeneous}\footnote{We often say ``homogeneous'' to mean zero
and ``inhomogeneous'' to mean not zero. That may be because if $V(x,t)$ is
zero then it is constant, i.e.\ the same everywhere, which is the usual
meaning of homogeneous.}
final condition $f(x,T) = 0$ may be solved by
$$
f(x,t) =
E_{x,t}\left[ \int_t^T V(X(t^{\prime}),t^{\prime}) dt^{\prime}\right] \; .
$$
Exchanging the order of integration, we may write
\begin{equation}
f(x,t) = \int_{t^{\prime}=t}^T g(x,t,t^{\prime}) dt^{\prime} \; ,
\label{AddUp} \end{equation}
where
$$
g(x,t,t^{\prime}) = E_{x,t}\left[V(X(t^{\prime}),t^{\prime})\right] \; .
$$
This $g$ is the expected value (at $(x,t)$) of a payout
($V(\cdot,t^{\prime})$ at time $t^{\prime}>t$).
As such, $g$ is the solution of a homogeneous final value problem with
inhomogeneous final values:
\begin{equation}
\left. \begin{array}{ll} \displaystyle
\partial_t g + \textstyle{\frac{1}{2}} \partial_x^2 g = 0 \;\; \mbox{for}
\;\; t < t^{\prime} \; ,\\
\\
g(x,t^{\prime},t^{\prime}) = V(x,t^{\prime}) \; . \end{array} \right\}
\label{BEhom} \end{equation}
{\em Duhamel's principle}, which we just demonstrated, is as follows.
To solve the inhomogeneous final value problem (\ref{BEin}), we solve a
homogeneous final value problem (\ref{BEhom}) for each $t^{\prime}$ between
$t$ and $T$, and then add up the results as in (\ref{AddUp}).
\para Infinitesimal generator:
There are matrices of many different types that play various roles in theory
and computation.
And so it is with operators.
In addition to the solution operator, there is the {\em infinitesimal
generator} (or simply {\em generator}).
For Brownian motion in one dimension, the generator is
\begin{equation}
L = \textstyle{\frac{1}{2}} \partial_x^2 \; .
\label{L} \end{equation}
The backward equation may be written
\begin{equation}
\partial_t f + Lf = 0 \; .
\label{OpBE} \end{equation}
For other diffusion processes, the generator is the operator $L$ that puts
the backward equation for the process in the form (\ref{OpBE}).
Just as a matrix has a transpose, an operator has an {\em adjoint}, written
$L^*$.
The forward equation takes the form
$$
\partial_t u = L^* u \; .
$$
The operator (\ref{L}) for Brownian motion is {\em self adjoint}, which means
that $L^*=L$; this is why the same operator $\frac{1}{2}\partial_x^2$
appears in both equations. We will return to these points later.
\para Composing (multiplying) operators:
If $A$ and $B$ are matrices, then there are two ways to form the matrix $AB$.
One way is to multiply the matrices.
The other is to {\em compose} the linear transformations:
$f \to Bf \to ABf$. In this way, $AB$ is the composite linear transformation
formed by first applying $B$ then applying $A$.
We also can compose operators, even if we sometimes lack a good explicit
representation for the composite $AB$.
As with matrices, composition of operators is associative: $A(Bf) = (AB)f$.
\para Composing solution operators:
The solution operator $G(s_1)$ moves the value function backward in time by
the amount $s_1$, which is written $f(t-s_1) = G(s_1)f(t)$.
The operator $G(s_2)$ moves it back an additional $s_2$, i.e.
$f(t-(s_1 + s_2)) = G(s_2)f(t-s_1)= G(s_2)G(s_1)f(t)$.
The result is to move $f$ back by $s_1 + s_2$ in total, which is the same
as applying $G(s_1 + s_2)$.
This shows that for every (allowed) $f$, $G(s_2)G(s_1)f = G(s_2+s_1)f$,
which means that
\begin{equation}
G(s_2)G(s_1) = G(s_2+s_1) \; .
\label{semi} \end{equation}
This is called the {\em semigroup property}.
It is a basic property of the solution operator for any problem.
The matrix analogue for Markov chains is $P^{s_2+s_1} = P^{s_2}P^{s_1}$,
which is a basic fact about powers of matrices having nothing to do with
Markov chains. The property (\ref{semi}) would be called the {\em group}
property if we were to allow negative $s_2$ or $s_1$, which we do not.
Negative $s$ is allowed in the matrix version if $P$ is nonsingular.
There is no particular physical reason for the transition matrix of a
Markov chain to be nonsingular.
\para Operator kernels:
If matrix $A$ has elements $A_{jk}$, we can compute $g = Af$ by doing the
sum $g_j = \sum_k A_{jk}f_k$.
Similarly, operator $A$ may or may not have a
{\em kernel}\footnote{The term kernel also describes vectors $f$ with
$Af=0$; it is unfortunate that the same word is used for these different
objects.}, which is a function $A(x,y)$ so that $g=Af$ is represented
by
$$
g(x) = \int A(x,y) f(y) dy \; .
$$
If operators $A$ and $B$ both have kernels, then the composite operator has
the kernel
\begin{equation}
(AB)(x,y) = \int A(x,z)B(z,y) dz \; .
\label{Comp} \end{equation}
To derive this formula, set $g=Bf$ and $h=Ag$.
Then $h(x) = \int A(x,z) g(z) dz$ and $g(z) = \int B(z,y) f(y) dy$
implies that
$$
h(x) = \int \left( \int A(x,z) B(z,y) dz \right) f(y) dy \; .
$$
This shows that (\ref{Comp}) is the kernel of $AB$.
The formula is anologous to the formula for matrix multiplication.
\para The semigroup property:
When we defined (\ref{BEvFun}) the solution operators $G(s)$, we did so
by specifying the kernels
$$
G(x,y,s) = \frac{1}{\sqrt{2\pi s}}e^{-(x-y)^2/2s} \; .
$$
According to (\ref{Comp}), the semigroup property should be an integral
identity involving $G$.
The identity is
$$
G(x,y,s_2+s_1) = \int G(x,z,s_2)G(z,y,s_1) dz \; .
$$
More concretely:
\begin{eqnarray*}
\lefteqn{\frac{1}{\sqrt{2\pi(s_2+s_1)}} e^{-(x-y)^2/2(s_2+s_1)} } \\
&& =
\frac{1}{\sqrt{2\pi(s_2)}}\frac{1}{\sqrt{2\pi(s_1)}}
\int e^{-(x-z)^2/2s_2} e^{-(z-y)^2/2s_1} dz \; .
\end{eqnarray*}
The reader is encouraged to verify this by direct integration.
It also can be verified by recognizing it as the statement that adding
independent mean zero Gaussian random variables with variance
$s_2$ and $s_1$ respectively gives a Gaussian with variance $s_2+s_1$.
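The convolution identity can also be checked by simple quadrature (the evaluation points and the variances below are arbitrary choices):

```python
import numpy as np

def G(x, y, s):
    # heat kernel G(x, y, s) = exp(-(x-y)^2 / 2s) / sqrt(2 pi s)
    return np.exp(-(x - y)**2 / (2*s)) / np.sqrt(2*np.pi*s)

x0, y0, s1, s2 = 0.3, -0.8, 0.4, 0.7        # arbitrary test values
z = np.linspace(-20.0, 20.0, 400_001)       # wide grid; the tails are negligible
dz = z[1] - z[0]
lhs = G(x0, y0, s1 + s2)
rhs = (G(x0, z, s2) * G(z, y0, s1)).sum() * dz
print(lhs, rhs)                             # the two sides agree
```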
\para Fundamental solution:
The operators $G(t)$ form a
{\em fundamental solution}\footnote{We have adjusted this definition from
its original form in books on ordinary differential equations to accommodate
the backward evolution of the backward equation. This amounts to reversing
the sign of $L$.} for the problem $f_t + Lf = 0$ if
\begin{equation}
\partial_t G = LG \;\; , \;\;\mbox{for} \;\; t>0 \; ,
\label{OpEv} \end{equation}
\begin{equation}
G(0) = I \; .
\label{GI} \end{equation}
The property (\ref{OpEv}) really means that
$\partial_t \bigl( G(t) f \bigr) = L \bigl( Gf \bigr)$ for any $f$.
If $G(t)$ has a kernel $G(x,y,t)$, this in turn means (as the reader
should check) that
\begin{equation}
\partial_t G(x,y,t) = L_x G(x,y,t) \; ,
\label{GEqn} \end{equation}
where $L_x$ means that the derivatives on $L$ are with respect to the $x$
variables in $G$.
In our case with $G$ being the {\em heat kernel}, this is
$$
\partial_t \frac{1}{\sqrt{2\pi t}} e^{-(x-y)^2/2t}
= \textstyle{\frac{1}{2}} \partial_x^2 \frac{1}{\sqrt{2\pi t}} e^{-(x-y)^2/2t} \; ,
$$
which we have checked and rechecked.
Even without matrices, we still have the identity operator: $If = f$ for all $f$.
The property (\ref{GI}) really means that $G(t)f \to f$ as $t \to 0$.
It is easy to verify this for our heat kernel provided that $f$ is
continuous.
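The heat-kernel identity can be spot-checked by finite differences at an arbitrarily chosen point, a quick sketch:

```python
import numpy as np

def G(x, y, t):
    # heat kernel G(x, y, t) = exp(-(x-y)^2 / 2t) / sqrt(2 pi t)
    return np.exp(-(x - y)**2 / (2*t)) / np.sqrt(2*np.pi*t)

x0, y0, t0, h = 0.6, -0.2, 0.9, 1e-4        # arbitrary point, small step h
dG_dt = (G(x0, y0, t0 + h) - G(x0, y0, t0 - h)) / (2*h)
d2G_dx2 = (G(x0 + h, y0, t0) - 2*G(x0, y0, t0) + G(x0 - h, y0, t0)) / h**2
print(dG_dt, 0.5 * d2G_dx2)                 # the two sides of the PDE agree
```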
\para Duhamel with fundamental solution operator:
The $g$ appearing in (\ref{AddUp}) may be expressed as
$g(t,t^{\prime}) = G(t^{\prime} - t)V(t^{\prime})$, where $V(t^{\prime})$ is
the function with values $V(x,t^{\prime})$.
This puts (\ref{AddUp}) in the form
\begin{equation}
f(t) = \int_t^T G(t^{\prime}-t) V(t^{\prime}) dt^{\prime} \; .
\label{DuOp} \end{equation}
We illustrate the properties of the fundamental solution operator by verifying
(\ref{DuOp}) directly.
We want to show that (\ref{DuOp}) implies that $\partial_t f + Lf + V(t) = 0$
and $f(T) = 0$.
The latter is clear.
For the former we compute $\partial_t f(t)$ by differentiating the
right side of (\ref{DuOp}):
$$
\partial_t \int_t^T G(t^{\prime} - t) V(t^{\prime}) dt^{\prime} =
-G(t - t) V(t) -
\int_t^T G^{\prime}(t^{\prime} - t) V(t^{\prime}) dt^{\prime} \; ,
$$
We write $G^{\prime}(t)$ to represent $\partial_t G(t)$.
This allows us to write
$\partial_t G(t^{\prime}-t) = -G^{\prime}(t^{\prime} - t)= -LG(t^{\prime}-t)$.
Continuing, and using $G(0) = I$ from (\ref{GI}), this gives
$$
\partial_t f(t) = -V(t) - \int_t^T LG(t^{\prime}-t)V(t^{\prime}) dt^{\prime} \; .
$$
If we take $L$ outside the integral on the right, we recognize what is
left in the integral as $f(t)$.
Altogether, we have $\partial_t f = -V(t) - Lf(t)$, i.e.\
$\partial_t f + Lf + V(t) = 0$, which has the same form as (\ref{fV}).
\para Green's function:
Consider the solution formula for the homogeneous final value problem
$\partial_t f + Lf = 0$, $f(T) = V$:
\begin{equation}
f(x,t) = \int G(x,y,T-t) V(y) dy \; .
\label{last} \end{equation}
Consider a special ``jackpot'' payout $V(y) = \delta(y - x_0)$.
If you like, you can think of $V(y) = \frac{1}{2\epsilon}$ when
$\left| y-x_0\right|<\epsilon$ and then let $\epsilon \to 0$.
We then get $f(x,t) = G(x,x_0,T-t)$.
The function that satisfies $\partial_t G + L_xG = 0$, with final condition
$G(x,x_0,0) = \delta(x - x_0)$, is called the
{\em Green's function}\footnote{This is in honor of a $19^{th}$ century
Englishman named Green.}.
The Green's function represents the result of a {\em point mass} payout.
A general payout can be expressed as a sum (integral) of point mass payouts
at $x_0$ with weight $V(x_0)$:
$$
V(y) = \int V(x_0) \delta (y - x_0) dx_0 \; .
$$
Since the backward equation is linear, the general value function will be
the weighted sum (integral) of the point mass value functions, which
is the formula (\ref{last}).
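The superposition formula (\ref{last}) can be checked by quadrature against a case with a closed form: for the arbitrary payout $V(y) = \cos y$, the Gaussian integral gives $f(x,t) = e^{-(T-t)/2}\cos x$ exactly (the values of $T$, $t$, and $x$ below are arbitrary):

```python
import numpy as np

def G(x, y, s):
    # heat kernel G(x, y, s) = exp(-(x-y)^2 / 2s) / sqrt(2 pi s)
    return np.exp(-(x - y)**2 / (2*s)) / np.sqrt(2*np.pi*s)

T, t, x0 = 1.0, 0.25, 0.5
y = np.linspace(-15.0, 15.0, 300_001)       # wide grid; the tails are negligible
dy = y[1] - y[0]
f_quad = (G(x0, y, T - t) * np.cos(y)).sum() * dy   # f(x,t) = int G(x,y,T-t) V(y) dy
f_exact = np.exp(-(T - t)/2) * np.cos(x0)           # closed form for V(y) = cos y
print(f_quad, f_exact)
```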
\para More generally:
Brownian motion is special in that $G(x,y,t)$ is a function of $x-y$.
This is because Brownian motion is translation invariant: a Brownian
motion starting from any point looks like a Brownian motion starting from
any other point.
Brownian motion is also special in that the forward equation and backward
equations are nearly the same, having the same spatial operator
$L = \frac{1}{2} \partial_x^2$.
More general diffusion processes lose both of these properties.
The solution operator depends in a more complicated way on $x$ and $y$.
The backward equation is $\partial_t f + Lf = 0$ but the forward equation
is $\partial_t u = L^* u$.
The Green's function, $G(x,y,t)$, is the fundamental solution for the
backward equation in the $x,t$ variables with $y$ as a parameter.
It also is the fundamental solution to the forward equation in the
$y,t$ variables with $x$ as a parameter.
This material will be in a future lecture.
\end{document}