Differentiation
Generalization of derivatives to higher dimensions:
- limit of difference quotient: partial derivatives,
- linearization: total derivative.
Partial derivatives
Definition: let $D \subseteq \mathbb{R}^n$ ($n=2$ for simplicity), let $f: D \to \mathbb{R}$ and $\mathbf{a} \in D$. If the limits exist, the partial derivatives of $f$ are
$$
\begin{align*}
&\partial_1 f(\mathbf{a}) := \lim_{h \to 0} \frac{f(a_1 + h, a_2) - f(\mathbf{a})}{h}, \\
&\partial_2 f(\mathbf{a}) := \lim_{h \to 0} \frac{f(a_1, a_2 + h) - f(\mathbf{a})}{h}.
\end{align*}
$$
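As a worked example (the function here is chosen for illustration, not taken from the notes), take $f(x,y) = x^2 y + y^3$:

```latex
% Partial derivatives of f(x,y) = x^2 y + y^3:
\partial_1 f(x,y) = 2xy, \qquad \partial_2 f(x,y) = x^2 + 3y^2.
% Evaluated at the point a = (1, 2):
\partial_1 f(1,2) = 4, \qquad \partial_2 f(1,2) = 1 + 12 = 13.
```

Each partial derivative treats the other variable as a constant and differentiates as in the one-dimensional case.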
Theorem: suppose that two mixed $n$th order partial derivatives of a function $f$ involve the same differentiations but in different orders. If those partials are continuous at a point $\mathbf{a}$, and if $f$ and all partials of $f$ of order less than $n$ are continuous in a neighbourhood of $\mathbf{a}$, then the two mixed partials are equal at the point $\mathbf{a}$. For $n=2$ we have

$$
\partial_{12} f(\mathbf{a}) = \partial_{21} f(\mathbf{a}).
$$
??? note "Proof:"
    Will be added later.
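For instance (an illustrative function, not from the notes), the symmetry can be checked directly for $f(x,y) = x^2 \sin y$:

```latex
% Differentiate first in x, then in y:
\partial_1 f = 2x \sin y, \qquad \partial_2 \partial_1 f = 2x \cos y.
% Differentiate first in y, then in x:
\partial_2 f = x^2 \cos y, \qquad \partial_1 \partial_2 f = 2x \cos y.
```

Both orders give $2x \cos y$, as the theorem predicts for a smooth function.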
Total derivatives
Definition: let $D \subseteq \mathbb{R}^n$ ($n=2$ for simplicity) and let $f: D \to \mathbb{R}$. To determine an affine linear approximation of $f$ around $\mathbf{a} \in D$,

$$
p(\mathbf{x}) = f(\mathbf{a}) + \big\langle L,\; \mathbf{x} - \mathbf{a} \big\rangle,
$$

write $f(\mathbf{x}) = p(\mathbf{x}) + r(\mathbf{x})$ and demand that $\frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{a}\|} \to 0$ as $\mathbf{x} \to \mathbf{a}$. If an $L \in \mathbb{R}^2$ exists that satisfies this, then $f$ is called totally differentiable at $\mathbf{a}$.
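As a concrete sketch of the definition at work (the function is an illustrative choice), consider $f(x,y) = x^2 + y^2$ at $\mathbf{a} = (1,1)$:

```latex
% Candidate linear part and affine approximation:
L = (2,\; 2)^T, \qquad p(\mathbf{x}) = 2 + 2(x - 1) + 2(y - 1).
% The remainder is exactly the squared distance to a:
r(\mathbf{x}) = f(\mathbf{x}) - p(\mathbf{x}) = (x-1)^2 + (y-1)^2 = \|\mathbf{x} - \mathbf{a}\|^2,
% so the defining condition holds:
\frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{a}\|} = \|\mathbf{x} - \mathbf{a}\| \to 0 \quad \text{as } \mathbf{x} \to \mathbf{a}.
```

Hence $f$ is totally differentiable at $(1,1)$ with this $L$.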
Theorem: if $f$ is totally differentiable at $\mathbf{a}$, then $f$ is partially differentiable at $\mathbf{a}$ and the partial derivatives are

$$
\partial_1 f(\mathbf{a}) = L_1, \qquad \partial_2 f(\mathbf{a}) = L_2,
$$

so that

$$
p(\mathbf{x}) = f(\mathbf{a}) + \big\langle \nabla f(\mathbf{a}),\; \mathbf{x} - \mathbf{a} \big\rangle,
$$

with $\nabla f(\mathbf{a})$ the gradient of $f$.
??? note "Proof:"
    Will be added later.
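Note that the converse fails: partial differentiability does not imply total differentiability. A standard counterexample (added here for context, not from the notes above) is

```latex
f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x,y) \neq (0,0), \\ 0, & (x,y) = (0,0). \end{cases}
```

Both partials exist at the origin ($f$ vanishes on the axes, so $\partial_1 f(0,0) = \partial_2 f(0,0) = 0$), yet $f$ is not even continuous there, since $f(t,t) = \tfrac{1}{2}$ for all $t \neq 0$; hence $f$ cannot be totally differentiable at the origin.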
Chain rule
Theorem: let $D \subseteq \mathbb{R}^n$ ($n=2$ for simplicity), let $f: D \to \mathbb{R}$, and let $\mathbf{x}: \mathbb{R} \to D$ be a differentiable curve. Define $g: \mathbb{R} \to \mathbb{R}$ by

$$
g(t) = f\big(\mathbf{x}(t)\big).
$$

If $f$ is continuously differentiable, then $g$ is differentiable with

$$
g'(t) = \big\langle \nabla f\big(\mathbf{x}(t)\big),\; \mathbf{\dot x}(t) \big\rangle.
$$
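A quick sanity check (illustrative choice of $f$ and curve): take $f(x,y) = x^2 + y^2$ and $\mathbf{x}(t) = (\cos t,\; \sin t)^T$, so that $g(t) = \cos^2 t + \sin^2 t = 1$ is constant:

```latex
g'(t) = \big\langle \nabla f\big(\mathbf{x}(t)\big),\; \mathbf{\dot x}(t) \big\rangle
      = \big\langle (2\cos t,\; 2\sin t)^T,\; (-\sin t,\; \cos t)^T \big\rangle
      = -2\cos t \sin t + 2\sin t \cos t = 0,
```

consistent with $g$ being constant along the circle.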
Gradients
Definition: at any point $\mathbf{x} \in D$ where the first partial derivatives of $f$ exist, we define the gradient vector $\nabla f(\mathbf{x})$ by

$$
\nabla f(\mathbf{x}) = \begin{pmatrix} \partial_1 f(\mathbf{x}) \\ \partial_2 f(\mathbf{x}) \end{pmatrix}.
$$

The direction of the gradient is the direction of steepest increase of $f$ at $\mathbf{x}$.
Theorem: gradients are orthogonal to level lines and level surfaces.
??? note "Proof:"
let $\mathbf{r}(t) = \big(x(t),\; y(t) \big)^T$ be a parameterization of the level curve of $f$ such that $\mathbf{r}(0) = \mathbf{a}$. Then for all $t$ near $0$, $f(\mathbf{r}(t)) = f(\mathbf{a})$. Differentiating this equation with respect to $t$ using the chain rule, we obtain
$$
\partial_1 f(\mathbf{x}) \dot x(t) + \partial_2 f(\mathbf{x}) \dot y(t) = 0,
$$
at $t=0$, we can rewrite this to
$$
\big\langle \nabla f(\mathbf{a}),\; \mathbf{\dot r}(0) \big\rangle = 0,
$$
obtaining that $\nabla f$ is orthogonal to $\mathbf{\dot r}$.
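To illustrate with a concrete level curve (an example chosen here, not from the notes): for $f(x,y) = x^2 + 4y^2$, the level curve $f = 4$ can be parameterized by $\mathbf{r}(t) = (2\cos t,\; \sin t)^T$:

```latex
% Gradient along the curve and tangent vector:
\nabla f\big(\mathbf{r}(t)\big) = (4\cos t,\; 8\sin t)^T, \qquad
\mathbf{\dot r}(t) = (-2\sin t,\; \cos t)^T,
% their inner product vanishes for every t:
\big\langle \nabla f\big(\mathbf{r}(t)\big),\; \mathbf{\dot r}(t) \big\rangle
  = -8\cos t \sin t + 8\sin t \cos t = 0.
```

The gradient is orthogonal to the ellipse at every point, as the theorem states.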
Directional derivatives
Definition: let $D \subseteq \mathbb{R}^n$, let $f: D \to \mathbb{R}$, and let $\mathbf{v} \in \mathbb{R}^n$ with $\|\mathbf{v}\| = 1$ be a unit vector. The directional derivative is the rate of change of $f$ near a point $\mathbf{a} \in D$ in the direction of $\mathbf{v}$:

$$
D_\mathbf{v} f(\mathbf{a}) = \big\langle \mathbf{v},\; \nabla f(\mathbf{a}) \big\rangle.
$$
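As a worked example (illustrative choice of $f$): take $f(x,y) = x^2 y$, $\mathbf{a} = (1,2)$ and $\mathbf{v} = \tfrac{1}{\sqrt{2}}(1,\; 1)^T$:

```latex
\nabla f(\mathbf{a}) = (2xy,\; x^2)^T \Big|_{(1,2)} = (4,\; 1)^T,
\qquad
D_\mathbf{v} f(\mathbf{a}) = \big\langle \mathbf{v},\; \nabla f(\mathbf{a}) \big\rangle
  = \tfrac{1}{\sqrt{2}}(4 + 1) = \tfrac{5}{\sqrt{2}}.
```

By the Cauchy–Schwarz inequality, $D_\mathbf{v} f(\mathbf{a}) \le \|\nabla f(\mathbf{a})\|$ with equality exactly when $\mathbf{v} = \nabla f(\mathbf{a}) / \|\nabla f(\mathbf{a})\|$, which is why the gradient points in the direction of steepest increase.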
The general case
Definition: let $D \subseteq \mathbb{R}^n$ and let $\mathbf{f}: D \to \mathbb{R}^m$, with $f_i: D \to \mathbb{R}$, $i = 1, \dotsc, m$, being the components of $\mathbf{f}$. Then

- $\mathbf{f}$ is continuous at $\mathbf{a} \in D$ $\iff$ all $f_i$ are continuous at $\mathbf{a}$,
- $\mathbf{f}$ is partially/totally differentiable at $\mathbf{a}$ $\iff$ all $f_i$ are partially/totally differentiable at $\mathbf{a}$.
For the linearization of every component $f_i$ we have

$$
f_i(\mathbf{x}) = f_i(\mathbf{a}) + \big\langle \nabla f_i(\mathbf{a}),\; \mathbf{x} - \mathbf{a} \big\rangle + r_i(\mathbf{x}),
$$

so in total we have

$$
\mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a}) \big(\mathbf{x} - \mathbf{a}\big) + \mathbf{r}(\mathbf{x}),
$$

with $D\mathbf{f}(\mathbf{a})$ the Jacobian of $\mathbf{f}$.
Definition: the Jacobian is given by $\big[D\mathbf{f}(\mathbf{a}) \big]_{i,\;j} = \partial_j f_i(\mathbf{a})$.
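As an example (the standard polar-coordinate map, added here for illustration), for $\mathbf{f}(r, \varphi) = (r\cos\varphi,\; r\sin\varphi)^T$ the Jacobian is

```latex
D\mathbf{f}(r, \varphi)
  = \begin{pmatrix} \partial_r f_1 & \partial_\varphi f_1 \\ \partial_r f_2 & \partial_\varphi f_2 \end{pmatrix}
  = \begin{pmatrix} \cos\varphi & -r\sin\varphi \\ \sin\varphi & r\cos\varphi \end{pmatrix}.
```

Row $i$ collects all partial derivatives of the component $f_i$, matching the index convention $\big[D\mathbf{f}\big]_{i,\;j} = \partial_j f_i$.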
Chain rule
Theorem: let $D \subseteq \mathbb{R}^n$ and $E \subseteq \mathbb{R}^m$ be sets, and let $\mathbf{f}: D \to \mathbb{R}^m$ with $\mathbf{f}(D) \subseteq E$ and $\mathbf{g}: E \to \mathbb{R}^k$, with $\mathbf{f}$ differentiable at $\mathbf{x}$ and $\mathbf{g}$ differentiable at $\mathbf{f}(\mathbf{x})$. Then $D\mathbf{f}(\mathbf{x}) \in \mathbb{R}^{m \times n}$ and $D\mathbf{g}\big(\mathbf{f}(\mathbf{x})\big) \in \mathbb{R}^{k \times m}$, and differentiating $\mathbf{g} \circ \mathbf{f}$ gives

$$
D(\mathbf{g} \circ \mathbf{f})(\mathbf{x}) = D\mathbf{g}\big(\mathbf{f}(\mathbf{x})\big) \, D\mathbf{f}(\mathbf{x}).
$$
We have two interpretations:
- the composition of linear maps,
- the matrix multiplication of the Jacobians.
??? note "Proof:"
    Will be added later.
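A small check of the matrix form (illustrative maps, not from the notes): take $\mathbf{f}: \mathbb{R} \to \mathbb{R}^2$, $\mathbf{f}(t) = (t,\; t^2)^T$ and $g: \mathbb{R}^2 \to \mathbb{R}$, $g(x,y) = xy$, so that $(g \circ \mathbf{f})(t) = t^3$:

```latex
Dg(x,y) = (y,\; x), \qquad D\mathbf{f}(t) = \begin{pmatrix} 1 \\ 2t \end{pmatrix},
% Chain rule as a matrix product (1x2 times 2x1):
D(g \circ \mathbf{f})(t) = Dg\big(\mathbf{f}(t)\big)\, D\mathbf{f}(t)
  = (t^2,\; t) \begin{pmatrix} 1 \\ 2t \end{pmatrix} = t^2 + 2t^2 = 3t^2,
```

matching the direct computation $\frac{d}{dt}\, t^3 = 3t^2$.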