(in progress) Some background on Malliavin Calculus
This is a long post distilling some concepts of Malliavin Calculus, based on the lecture notes of Martin Hairer.
Motivation Malliavin calculus is a modern tool for differentiating random variables defined on a Gaussian probability space with respect to the underlying noise.
At the moment, I feel like White Noise Theory and Malliavin calculus share some similarities. They surely complement each other, but I do not yet understand the difference between them, for example, what one can do that the other cannot. I also plan to learn some background on rough paths, but that will be in another post.
Stochastic analysis centers around stochastic differential equations, i.e.,
\[dX_t = V_0(X_t)dt + \sum_{i=1}^m V_i(X_t) \circ dW_i(t)\]where $\circ dW_i(t)$ denotes Stratonovich integration. In this equation, Hairer considers multiple driving noises.
This section concerns the definition of white noise via a functional representation. The Wiener chaos gives a decomposition of white noise which we will find somewhat similar to Fourier analysis or Sobolev spaces.
Now, let’s talk about the spaces we will work with.
White noise is a linear isometry $W: H \to L^2(\Omega, \mathbb{P})$ such that each $W(h)$ is a centered Gaussian random variable with variance $\lVert h \rVert_H^2$.
The above is just the definition. How to establish such a map will be shown next.
Orthonormal basis Here, we fix an orthonormal basis $\{e_n\}$ of $H$ together with a sequence $\{\xi_n\}$ of i.i.d. standard Gaussian random variables.
When representing $h = \sum_n h_n e_n \in H$, we construct $W(h)=\sum_n h_n \xi_n$. The normal random variable $\xi_n$ can now be rewritten in the functional form $\xi_n = W(e_n)$.
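As a quick numerical illustration (my own, not from the notes), the sketch below builds $W(h) = \sum_n h_n \xi_n$ from truncated basis coefficients and checks the isometry $\mathbb{E}[W(h)W(g)] = \langle h, g\rangle_H$ by Monte Carlo; the coefficient vectors and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Coefficients of h and g in a (truncated) orthonormal basis (e_n) of H;
# the specific vectors and sample size are arbitrary illustrative choices.
h_coef = np.array([0.6, 0.8, 0.0])   # ||h||_H = 1
g_coef = np.array([0.0, 0.5, 0.5])

xi = rng.standard_normal((200_000, 3))   # xi_n = W(e_n), i.i.d. N(0, 1)
Wh = xi @ h_coef                         # W(h) = sum_n h_n xi_n
Wg = xi @ g_coef

# Isometry: E[W(h) W(g)] should approximate <h, g>_H
print(np.mean(Wh * Wg), h_coef @ g_coef)
```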
In the lecture, the $m$-dimensional Wiener process is defined by using the functions $\mathbf{1}_{[0,t)}^{(i)}$
\[(\mathbf{1}_{[0,t)}^{(i)})_j(s) = \begin{cases} 1 &\text{if } s \in [0, t) \text{ and } j=i \\ 0 &\text{otherwise } \end{cases}\]Note that the 1-dimensional case is easy to handle directly, but for now we still follow the setup in the lecture.
The $i$-th component of the Wiener process is defined as $W_i(t) = W(\mathbf{1}_{[0,t)}^{(i)})$, where we can check the covariance
\[\mathbb{E}[W_i(t)W_j(s)] = \langle \mathbf{1}_{[0,t)}^{(i)}, \mathbf{1}_{[0,s)}^{(j)} \rangle = \delta_{ij}(t \wedge s).\]For arbitrary $h$, $W(h)$ can be represented as (the Wiener-Ito integral)
\[W(h) = \sum_{i=1}^m \int_0^\infty h_i(s) dW_i(s)\]Note that we need to clearly distinguish between $W(h)$ and $W_i(s)$.
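To see the covariance identity above in action, here is a small Monte Carlo sketch (my own, under an Euler discretization of $[0,1]$): it builds Brownian paths from independent Gaussian increments, i.e. a discrete version of $W(\mathbf{1}_{[0,t)})$, and checks $\mathbb{E}[W(t)W(s)] \approx t \wedge s$.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, T = 100_000, 100, 1.0
dt = T / n_steps

# W(t) = W(1_{[0,t)}): cumulative sums of independent N(0, dt) increments
dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
W = np.cumsum(dW, axis=1)

s_idx, t_idx = 29, 59   # grid times s = 0.30, t = 0.60
print(np.mean(W[:, s_idx] * W[:, t_idx]))   # should be close to min(s, t) = 0.3
```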
There are several formulations of the Hermite polynomials.
Some properties of Hermite polynomials:
This recursion leads to $\mathbb{E}[H_n(X)H_m(X)] = n! \, \delta_{n,m}$ for a standard Gaussian $X$.
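A quick Monte Carlo check of this orthogonality relation, assuming the probabilists' convention for $H_n$ (which is what numpy's `hermite_e` module implements):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from math import factorial

rng = np.random.default_rng(2)
x = rng.standard_normal(1_000_000)   # X ~ N(0, 1)

def He(n, x):
    # probabilists' Hermite polynomial He_n evaluated at x
    return hermeval(x, [0.0] * n + [1.0])

for n, m in [(2, 2), (3, 3), (2, 3)]:
    # E[H_n(X) H_m(X)] should be close to n! * delta_{n,m}
    print(n, m, np.mean(He(n, x) * He(m, x)), factorial(n) * (n == m))
```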
Let’s define linear subspaces of $L^2(\Omega, \mathbb{P})$ as
\[\mathcal{H}_n = \overline{\operatorname{span}}\{H_n(W(h)) : h \in H, \lVert h \rVert_H = 1\}\]We have the following decomposition
Theorem Let $\mathcal{F}$ be the $\sigma$-algebra generated by $W$. Then,
Proof Let $X \in L^2(\Omega, \mathcal{F}, \mathbb{P})$ be orthogonal to $\mathcal{H}_n$ for all $n$.
\[\mathbb{E}[XH_n(W(h))] = 0 \quad \forall n \quad \Rightarrow \quad \mathbb{E}[X\exp(W(h))] = 0\]We need to show that $X=0$. Split $X = X^+ - X^-$ and define the measures
\[\nu^{\pm}(B) = \mathbb{E}[X^{\pm} \mathbf{1}_B(W(h_1), \dots, W(h_m))], \quad B \in \mathcal{B}(\mathbb{R}^m)\]Applying the Laplace transform to these measures, we deduce
\[\varphi_{\nu^{+}}(\lambda) - \varphi_{\nu^{-}}(\lambda) = \int \exp(\lambda \cdot x) (\nu^{+} - \nu^{-})(dx) = \mathbb{E}[X\exp(\sum_i \lambda_i W(h_i))] = 0\]Since the Laplace transforms agree, the two measures coincide. Thus, $\mathbb{E}[X\mathbf{1}_F] = 0$ for all $F \in \mathcal{F}$. Therefore, $X=0$ and we can conclude the proof.
This section defines the multiple Wiener-Ito integral w.r.t. Brownian motion. This definition leads to a decomposition similar to the Hermite-polynomial representation presented above.
Consider
As in traditional stochastic calculus, the construction starts from the cornerstone of elementary processes
\[\mathcal{E} = \{u(t) = \sum_i F_i \mathbf{1}_{(t_i, t_{i+1}]}(t) : t_1 < \dots < t_{n+1}, \; t_i \in T, \; F_i \in \mathcal{F}_{t_i} \text{ square integrable} \}\]The Ito integral w.r.t. Brownian motion is
\[\int_T u(t) dB(t) = \sum_i F_i(B(t_{i+1}) - B(t_i))\]Definition of the multiple Wiener-Ito integral This is a multi-dimensional integral
\[I_n(f) = \int_{T^n} f(t_1, t_2, \dots, t_n) dB(t_1)dB(t_2)\dots dB(t_n)\]For an elementary $f$, it takes the form (with $\xi_i$ the Brownian increments)\[I_n(f) = \sum_{i_1, \dots, i_n} a_{i_1 \dots i_n} \xi_{i_1} \dots \xi_{i_n}\]Looking at this equation, one may note that the order of $i_1,\dots, i_n$ matters for $a$ but not for the product of the $\xi$'s. So the symmetrized version is defined (and, since $I_n$ is linear, nothing is lost):
\[\tilde{f}(t_1, \dots, t_n) = \frac{1}{n!} \sum_{\sigma \in \mathcal{S}_n} f(t_{\sigma(1)}, \dots, t_{\sigma(n)})\]with $\mathcal{S}_n$ the set of all permutations of $\{1,\dots,n\}$. Because the measure $dt_1\cdots dt_n$ is invariant under permutations of the coordinates, we have
\[\int_{T^n} |f(t_1, \dots, t_n)|^2\,dt_1\cdots dt_n = \int_{T^n}|f(t_{\sigma(1)}, \dots, t_{\sigma(n)})|^2\, dt_1\cdots dt_n\]Using the triangle inequality, we have
\[\lVert \tilde{f}\rVert_{L^2(T^n)} \leq \frac{1}{n!} \sum_{\sigma \in \mathcal{S}_n} \lVert {f}\rVert_{L^2(T^n)} = \lVert {f}\rVert_{L^2(T^n)}\]In fact, the Wiener-Ito integrals of $f$ and $\tilde{f}$ are the same, as the next lemma shows.
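The symmetrization is mechanical enough to spell out in code. Below is a small sketch (the function names are mine) that symmetrizes a kernel of $n$ variables by averaging over all permutations, exactly as in the definition of $\tilde f$:

```python
from itertools import permutations
from math import factorial

def symmetrize(f, n):
    """Return the symmetrization f~ of a kernel f of n real arguments."""
    perms = list(permutations(range(n)))
    def f_tilde(*t):
        return sum(f(*(t[p[i]] for i in range(n))) for p in perms) / factorial(n)
    return f_tilde

f = lambda t1, t2: t1 * t2**2        # a non-symmetric kernel on T^2
f_sym = symmetrize(f, 2)
print(f_sym(0.2, 0.5), f_sym(0.5, 0.2))   # equal by construction
```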
Lemma If $f \in \mathcal{E}_n$ is an elementary process, then $I_n(f) = I_n(\tilde{f})$.
This is quite easy to see by considering the symmetry of $\prod_i (B(t_i^{(2)}) - B(t_i^{(1)}))$: any permuted version of this product has the same value.
Next comes the orthogonality property.
Lemma If $f \in \mathcal{E}_n$ and $g \in \mathcal{E}_m$ are elementary processes, then
\(\mathbb{E}[I_n(f)] = 0, \quad \quad \mathbb{E}[I_n(f)I_m(g)] = \begin{cases} 0, \quad& n \neq m \\
n! \langle \tilde{f}, \tilde{g} \rangle_{L^2(T^n)}, &n=m \end{cases}\)
The first expectation is straightforward: by the definition of an elementary process, each coefficient is independent of the corresponding Brownian increments, which have zero mean.
The second expectation needs more care. By definition, the product expands into a product of two summations; only the terms in which every increment is matched, so that $\mathbb{E}[\xi^2] = \Delta t$ (a basic Brownian motion property) applies, survive. When $n\neq m$, at least one increment remains unmatched, which explains why the expectation vanishes.
Continuing with defining the Wiener-Ito integral on $L^2(T^n)$ instead of the elementary process space, the general step is based on a sequence $\{f_k\} \subset \mathcal{E}_n$ converging to $f \in L^2(T^n)$; the isometry above makes $I_n(f_k)$ a Cauchy sequence in $L^2(\Omega)$, and its limit defines $I_n(f)$.
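For $n=1$ the isometry $\mathbb{E}[I_1(f)^2] = \lVert f\rVert^2_{L^2(T)}$ (the $n!\,\langle \tilde f, \tilde g\rangle$ formula with $n=1$) is easy to check numerically. The following sketch (my own, with an arbitrary integrand and grid) uses a Riemann-sum approximation of $I_1(f)$:

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, T = 200_000, 200, 1.0
dt = T / n_steps
t = np.arange(n_steps) * dt

f = np.sin(2 * np.pi * t)                    # an arbitrary deterministic integrand
dB = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
I1 = dB @ f                                  # I_1(f) = int_T f dB (Riemann sum)

# Ito isometry: Var[I_1(f)] should approximate ||f||^2_{L^2(T)} = 0.5
print(np.var(I1), np.sum(f**2) * dt)
```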
Goal: Rigorously define differentiation w.r.t. white noise.
For the Wiener process, we often encounter the formal statement that its derivative is Gaussian white noise, $\xi_i(t) = \frac{dW_i}{dt}$.
The new operator $D_t^{(i)}$ takes the derivative of a random variable w.r.t. $\xi_i(t)$. We may expect this operator to act as
\[D_t^{(i)} W(h) = h_i(t)\]This is because
\[W(h) = \sum_{i=1}^m \int_0^\infty h_i(t)\xi_i(t) dt.\]We also expect the chain rule
\[D_t^{(i)} F(X_1, \dots, X_n) = \sum_{k=1}^n \partial_k F(X_1, \dots, X_n)D_t^{(i)}X_k\]In fact, the definition of $DF$ can be interpreted as a directional derivative
\[\langle DF, h \rangle = \lim_{\epsilon \to 0}\frac{1}{\epsilon} (F(W(h_1) + \epsilon \langle h_1, h\rangle, \dots, W(h_n) + \epsilon \langle h_n, h\rangle) - F)\]
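This directional-derivative picture can be tested with a finite difference. In the sketch below (my own toy example), $F = \sin(W(h_1))$, so $\langle DF, h\rangle_H = \cos(W(h_1))\langle h_1, h\rangle_H$; the value of the inner product is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy case F = sin(W(h1)); then <DF, h>_H = cos(W(h1)) <h1, h>_H
h1_dot_h = 0.7                   # assumed value of <h1, h>_H
W_h1 = rng.standard_normal()     # one sample of W(h1)

eps = 1e-6
directional = (np.sin(W_h1 + eps * h1_dot_h) - np.sin(W_h1)) / eps
print(directional, np.cos(W_h1) * h1_dot_h)   # should agree up to O(eps)
```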
Proposition (Integration by parts) For every $F$ and $h$, one has the identity
\(\mathbb{E}[\langle {D}F, h\rangle_H] = \mathbb{E}[FW(h)]\)
Proof It suffices to consider the case $\lVert h \rVert_H=1$. Choose orthonormal vectors $\{e_1, \dots, e_n\}$ in $H$ such that $h=e_1$, and write $F = f(W(e_1), \dots, W(e_n))$.
With $\phi(x)$ denoting the standard Gaussian density on $\mathbb{R}^n$, we have
\(\mathbb{E}[\langle DF, h \rangle_H] = \int \partial_1 f(x)\phi(x)dx = \int f(x)\phi(x)x_1 dx = \mathbb{E}[FW(e_1)]= \mathbb{E}[FW(h)]\)
The second equality uses integration by parts together with $\partial_1 \phi(x) = -x_1\phi(x)$.
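The underlying Gaussian identity $\mathbb{E}[\partial_1 f(X)] = \mathbb{E}[f(X)X_1]$ for a standard Gaussian vector $X$ is easy to verify by Monte Carlo; the test function below is an arbitrary smooth choice of mine.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.standard_normal((1_000_000, 2))   # X ~ N(0, I_2)

f   = lambda x: np.sin(x[:, 0]) * np.cos(x[:, 1])   # smooth test function
df1 = lambda x: np.cos(x[:, 0]) * np.cos(x[:, 1])   # its partial derivative in x_1

# Gaussian integration by parts: E[d_1 f(X)] = E[f(X) X_1]
print(np.mean(df1(x)), np.mean(f(x) * x[:, 0]))
```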
The following result uses the product rule $D(GF) = (DG)F + G(DF)$.
Lemma Let $F, G \in \mathcal{S}$ and $h \in H$
\(\mathbb{E}[G\langle {D}F, h\rangle_H] = -\mathbb{E}[F\langle DG, h \rangle_H] + \mathbb{E}[FGW(h)]\)
Proposition [Chain rule] Let $g: \mathbb{R}^d \to \mathbb{R}$ be a $\mathcal{C}^1$ function with bounded partial derivatives. Let $p\geq 1$ and $F = (F^1, \dots, F^d)$ with $F^i \in \mathbb{D}^{1,p}$. Then $g(F) \in \mathbb{D}^{1, p}$ and
\[D(g(F)) = \sum_{i=1}^d \partial_i g(F)DF^i\]
Consider now the case of a one-dimensional Brownian motion $B(t)$, $t \in T = [a, b]$, with $H = L^2(T)$ and the functional $W(h) = \int_a^b h(s) dB(s)$.
Proposition Let $F = \sum_{n=0}^\infty I_n(f_n)$ with each $f_n$ symmetric. Then \(D_tF = \sum_{n=1}^\infty n I_{n-1}(f_n(\cdot, t)).\)
Proof We again start from elementary processes, with $f_n \in \mathcal{E}_n$ symmetric. Consider the simple case $F = I_n(f_n)$.
Proposition Let $g: \mathbb{R}^d \to \mathbb{R}$ be a Lipschitz function ($\lvert g(x) - g(y)\rvert \leq K \lVert x- y\rVert$). Suppose $F = (F^1, \dots, F^d)$ is a random vector such that $F^i \in \mathbb{D}^{1,2}$. Then $g(F) \in \mathbb{D}^{1,2}$ and there exists a random vector $G=(G_1, \dots, G_d)$ such that
\[D(g(F)) = \sum_{i=1}^d G_i DF^i.\]
In short, the divergence operator is defined as the dual (adjoint) of the derivative operator from the previous section.
The divergence operator is denoted by $\delta$; it is an unbounded operator $\delta: L^2(\Omega; H) \to L^2(\Omega)$ satisfying the following.
The domain of $\delta$, $\operatorname{Dom} \delta$, consists of $u \in L^2(\Omega; H)$ such that $\lvert \mathbb{E}[\langle DF , u \rangle_H] \rvert \leq c_u \lVert F \rVert_{L^2(\Omega)}$ for all $F \in \mathbb{D}^{1,2}$
Duality relation: $\mathbb{E}[F\delta(u)] = \mathbb{E}[\langle DF, u \rangle_H]$ for $F \in \mathbb{D}^{1,2}$ and $u \in \operatorname{Dom}\delta$
Proposition [Properties of the divergence]
This part considers the restricted case of Brownian motion, in which the divergence $\delta(u)$ becomes the Skorohod integral.
Consider the Wiener chaos expansion
\[u(t) = \sum_n I_n(f_n(\cdot, t))\]The Skorohod integral will be represented as
\[\delta(u) = \sum_{n=0}^\infty I_{n+1}(\tilde{f}_n)\]converging in $L^2(\Omega)$ where
\[\tilde{f}_n(t_1, \dots, t_n, t) = \frac{1}{n+1} \left(f_n(t_1, \dots, t_n, t) + \sum_{i=1}^n f_n(t_1, \dots, t_{i-1}, t, t_{i+1}, \dots, t_n, t_i)\right)\]
Proposition [Skorohod integral is the Ito integral] For adapted $u$, $\delta(u)$ coincides with the Ito integral w.r.t. Brownian motion, that is,
\[\delta(u) = \int_{a}^b u(s)dB(s)\]Proof Consider an elementary adapted process
\[u_t = \sum_j F_j \mathbf{1}_{(t_j, t_{j+1})}(t)\]Now looking at each of component in the sum
\[\delta(F_j \mathbf{1}_{(t_j, t_{j+1})}(\cdot)) = F_j \delta(\mathbf{1}_{(t_j, t_{j+1})}(\cdot)) - \int_T D_t F_j \mathbf{1}_{(t_j, t_{j+1})}(t) dt = F_j(B(t_{j+1}) - B(t_j))\]where the integral term vanishes because $F_j$ is $\mathcal{F}_{t_j}$-measurable, hence $D_t F_j = 0$ for $t > t_j$.
Theorem (Ito representation) Given $F \in L^2(\Omega)$, there exists an adapted process $u$ such that \(F = \mathbb{E}[F] + \int_0^\infty u(t)dB(t)\)
This result says that a square-integrable random variable can be represented as its mean, a deterministic part, plus a stochastic integral, the random part.
Proof First, consider a zero-mean, square-integrable random variable $G$ that is orthogonal to all stochastic integrals $\int_{\mathbb{R}_+} u(t) dB(t)$.
Let $M_u(t) = \exp(\int_0^t u(s) dB(s) - \frac{1}{2}\int_0^t u^2(s)ds)$. By Ito’s formula
\[M_u(t) = M_u(0) + \int_0^t M_u(s) u(s) dB(s)\]Hence, such a random variable $G$ is orthogonal to
\[\mathcal{E}(h) = \exp\left(\int_0^\infty h(s) dB(s) - \frac{1}{2} \int_0^\infty h^2(s) ds\right)\]Since $\{ e^{W(h)} : h \in L^2(\mathbb{R}_+) \}$ forms a total subset of $L^2(\Omega)$, this leads to the desired conclusion.
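As a small numerical aside (my own), the martingale property of $M_u$ implies $\mathbb{E}[M_u(T)] = M_u(0) = 1$ for deterministic $u \in L^2$, which the following discretized simulation confirms; the particular $u$ and grid are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(6)
n_paths, n_steps, T = 200_000, 200, 1.0
dt = T / n_steps
t = np.arange(n_steps) * dt

u = 1.0 + 0.5 * np.sin(t)               # an arbitrary deterministic u
dB = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)

# M_u(T) = exp( int u dB - (1/2) int u^2 dt ), discretized
M_T = np.exp(dB @ u - 0.5 * np.sum(u**2) * dt)
print(np.mean(M_T))                      # should be close to M_u(0) = 1
```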
This section provides the foundation of the integration by parts formula in Malliavin calculus, which underpins the results on existence and smoothness of densities below.
Proposition Let $F, G$ be two random variables such that $F \in \mathbb{D}^{1,2}$, and let $u$ be such that $\langle DF, u\rangle_H \neq 0$ a.s. Then, for smooth $g$,
\[\mathbb{E}[g'(F)G] = \mathbb{E}[g(F)H(F, G)],\]
where $H(F, G) = \delta(Gu(\langle DF, u\rangle_H)^{-1})$.
The main focus of this theorem is to show that there is a unique solution for SDEs under some conditions.
Consider the following setup:
Let $X(t)$ be the solution of $d$-dimensional systems of SDEs
\[dX_i(t) = \sum_{j=1}^d \sigma_{ij}(X(t))dB^j(t) + b_i(X(t))dt, \quad X_i(0) = x_0^i, \quad i = 1,\dots, d\]Theorem There exists a unique continuous solution, and the following expectation is bounded:
\[\mathbb{E}\left[\sup_{0\leq t \leq T} \lvert X(t)\rvert^p\right] \leq C\]for any $p \geq 2$, where $C = C(p, T, K)> 0$
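To make the statement concrete, here is a minimal Euler-Maruyama sketch for a one-dimensional SDE with Lipschitz coefficients (the particular $b$, $\sigma$, and step sizes are illustrative assumptions of mine); the empirical moment of $\sup_t\lvert X(t)\rvert$ stays bounded, as the theorem predicts.

```python
import numpy as np

rng = np.random.default_rng(7)

def euler_maruyama(b, sigma, x0, T=1.0, n_steps=500, n_paths=10_000):
    """Euler-Maruyama scheme for dX = b(X) dt + sigma(X) dB (one-dimensional)."""
    dt = T / n_steps
    X = np.full(n_paths, x0)
    sup_abs = np.abs(X)
    for _ in range(n_steps):
        dB = rng.standard_normal(n_paths) * np.sqrt(dt)
        X = X + b(X) * dt + sigma(X) * dB
        sup_abs = np.maximum(sup_abs, np.abs(X))
    return X, sup_abs

# Lipschitz coefficients, as the theorem requires
_, sup_abs = euler_maruyama(b=lambda x: -x,
                            sigma=lambda x: 1.0 + 0.1 * np.sin(x),
                            x0=1.0)
print(np.mean(sup_abs**2))   # empirical E[sup_t |X_t|^2], finite as expected
```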
Theorem The derivative $D^j_rX_i(t)$ satisfies, for $r \leq t$,
\[D^j_rX_i(t) = \sigma_{ij}(X(r)) + \sum_{k,l=1}^d \int_r^t \partial_k \sigma_{il}(X(s))D^j_rX_k(s)\,dB^l(s) + \sum_{k=1}^d\int_r^t \partial_k b_i(X(s))D^j_rX_k(s)\,ds\]Some notations:
With this notation, the above SDE can be rewritten with a Stratonovich integral
\[X(t) = X_0 + \sum_{j = 1}^d \int_0^t \sigma_j(X(s)) \circ dB^j(s) + \int_0^t \sigma_0(X(s))ds\]Hörmander's condition This concerns the vector space spanned by the vector fields
\[\mathbf{(H)} = \operatorname{span} \{\sigma_1, \dots, \sigma_d, [\sigma_i, \sigma_j], [\sigma_i, [\sigma_j, \sigma_k]], \dots\}\]Theorem Assume that Hörmander's condition $\mathbf{(H)}$ holds (the span above equals $\mathbb{R}^d$ at the starting point) and the coefficients of the SDE are infinitely differentiable. Then, for any $t > 0$, $X(t)$ has an infinitely differentiable density.
The proof is based on the estimate (Norris's lemma) that the event where the quadratic variation of a semimartingale is large while the semimartingale itself is small has exponentially small probability.
This part focuses on how to use Malliavin calculus in mathematical finance. Again, my main concern when reading this section is the benefit of using Malliavin calculus over Ito calculus. The first three subsections contain some introductory background; the remaining subsections discuss the actual use of Malliavin calculus.
This gives a brief introduction to options. An option is a contract, or right, to buy (call) or sell (put) an amount of assets. Some terminologies related to this concept are
There are two ways of exercising options
The value of put or call are decided by
\[C_T = \max(S_T - K, 0), \quad P_T = \max(K - S_T, 0)\]To work further with these values, we pass to risk-neutral pricing, which involves computing their expectations (mostly) w.r.t. the market randomness.
Two main questions that might be interesting are
The Black-Scholes model is very well known in quantitative finance and helps us understand market dynamics. The downside, however, is that the model rests on restrictive assumptions and is therefore mostly used for pedagogical purposes.
\[dS_t = S_t \mu dt + S_t\sigma dB_t\]Ito calculus gives a closed-form solution for this SDE
\[S_t = S_0 \exp(\mu t - \frac{\sigma^2}{2}t + \sigma B_t)\]Therefore,
\[\mathbb{E}[S_t] = S_0\exp(\mu t), \quad \mathbb{E}[S_t^2] = S_0^2 \exp((2\mu + \sigma^2)t)\]There is an equivalence between the solution of the Black-Scholes model and a partial differential equation (PDE).
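Before turning to the PDE, a quick Monte Carlo sanity check of the two moment formulas above (the parameter values are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(8)
S0, mu, sigma, t = 1.0, 0.05, 0.2, 1.0   # arbitrary parameters

B_t = rng.standard_normal(1_000_000) * np.sqrt(t)
S_t = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * B_t)

print(np.mean(S_t),    S0 * np.exp(mu * t))                       # E[S_t]
print(np.mean(S_t**2), S0**2 * np.exp((2 * mu + sigma**2) * t))   # E[S_t^2]
```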
Theorem Let $h$ be a continuous function of at most linear growth. Assume that $v(t, y)$ is a regular solution of the PDE
\[\begin{cases} \frac{1}{2}\sigma^2 y^2 \frac{\partial^2 v}{\partial y^2} + ry \frac{\partial v}{\partial y} + \frac{\partial v}{\partial t} - r v = 0 \\ v(T, y) = h(y) \end{cases}\]Then there exists a portfolio with value $v(t, S_t)$ at time $t$ that replicates the payoff $h(S_T)$, and the hedging strategy is given by $\beta(t, S_t) = \frac{\partial v}{\partial y}(t, S_t)$.
Let’s take a moment to interpret this theorem. The boundary condition means that at maturity $T$, the solution $v$ must agree with the payoff function $h$, while $v(t, y)$ on $(0, T)$ describes the dynamics of the portfolio value along the interval. This is a backward problem: we start at $T$ and go back to $0$.
Proof
Applying Ito's formula to $v(t, S_t)$,
\[dv(t, S_t) = \frac{\partial v}{\partial t}(t, S_t) dt + \frac{\partial v}{\partial y}(t, S_t) dS_t + \frac{1}{2}\sigma^2S_t^2\frac{\partial^2 v}{\partial y^2}(t, S_t) dt\]The above is purely a mathematical derivation. On the other hand, managing a self-financing portfolio requires
\[dv(t, S_t) = v(t, S_t)rdt + \beta(t, S_t)(dS_t - rS_tdt)\]Picking $\beta(t, S_t) = \frac{\partial v}{\partial y}(t, S_t)$, the $dS_t$ part cancels and what remains reduces to
\[\frac{\partial v}{\partial t}(t, S_t) + \frac{1}{2}\sigma^2S_t^2\frac{\partial^2 v}{\partial y^2}(t, S_t) = r v(t, S_t) - rS_t\frac{\partial v}{\partial y}(t, S_t)\]And we obtain the expected PDE.
Consider the price of an option $V_0$ with strike $K$ and maturity $T$.
The most crucial parameters are $(x, r, \sigma, T, K)$
People working in finance are interested in certain quantities named after Greek letters (the "Greeks"):
These Greeks can be computed using the integration by parts formula of Malliavin calculus.
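As a preview, the best-known instance of this idea (from Fournié et al.) is the Delta of a European call in the Black-Scholes model: integration by parts moves the derivative off the non-smooth payoff onto a Malliavin weight, giving $\Delta = e^{-rT}\,\mathbb{E}[\max(S_T - K, 0)\, B_T/(S_0\sigma T)]$. The sketch below (parameter values are arbitrary choices of mine) compares this estimator against the closed-form Black-Scholes Delta $N(d_1)$.

```python
import numpy as np
from math import erf, exp, log, sqrt

rng = np.random.default_rng(9)
S0, r, sigma, T, K = 100.0, 0.05, 0.2, 1.0, 100.0   # arbitrary parameters
n_paths = 2_000_000

# Risk-neutral Black-Scholes terminal prices
B_T = rng.standard_normal(n_paths) * np.sqrt(T)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * B_T)
payoff = np.maximum(S_T - K, 0.0)                   # call payoff C_T

# Malliavin-weight estimator: Delta = e^{-rT} E[payoff * B_T / (S0 sigma T)]
delta_malliavin = exp(-r * T) * np.mean(payoff * B_T / (S0 * sigma * T))

# Closed-form Black-Scholes Delta N(d_1) for comparison
N = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
print(delta_malliavin, N(d1))
```

Note that the Malliavin estimator never differentiates the payoff, which is exactly why this approach is attractive for discontinuous payoffs where pathwise differentiation fails.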