mathematics-physics-wiki/docs/en/mathematics/multivariable-calculus/differentation.md
2024-01-03 13:46:55 +01:00


Differentiation

Generalization of derivatives to higher dimensions:

  • limit of difference quotient: partial derivatives,
  • linearization: total derivative.

Partial derivatives

Definition: let D \subseteq \mathbb{R}^n (n = 2 for simplicity), let f: D \to \mathbb{R} and \mathbf{a} \in D. If the limits exist, the partial derivatives of f at \mathbf{a} are

$$
\begin{align*}
    &\partial_1 f(\mathbf{a}) := \lim_{h \to 0} \frac{f(a_1 + h, a_2) - f(\mathbf{a})}{h}, \\
    &\partial_2 f(\mathbf{a}) := \lim_{h \to 0} \frac{f(a_1, a_2 + h) - f(\mathbf{a})}{h}.
\end{align*}
$$
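The limit definition can be checked numerically with a forward difference quotient. A minimal sketch in Python, assuming the illustrative function f(x, y) = x²y and point a = (2, 3), neither of which is from the text:

```python
# Approximate the partial derivatives of f(x, y) = x**2 * y at a = (2, 3)
# by evaluating the difference quotient with a small step h.

def f(x, y):
    return x**2 * y

a1, a2 = 2.0, 3.0
h = 1e-6

d1 = (f(a1 + h, a2) - f(a1, a2)) / h  # approximates ∂₁f(a) = 2xy = 12
d2 = (f(a1, a2 + h) - f(a1, a2)) / h  # approximates ∂₂f(a) = x² = 4

print(d1, d2)  # ≈ 12 and ≈ 4
```

Shrinking h drives the quotient toward the true partial derivative, up to floating-point roundoff.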

Theorem: suppose that two mixed $n$th-order partial derivatives of a function f involve the same differentiations but in different orders. If those partials are continuous at a point \mathbf{a}, and if f and all partials of f of order less than n are continuous in a neighbourhood of \mathbf{a}, then the two mixed partials are equal at \mathbf{a}. For n = 2 this reads

$$
    \partial_{12} f(\mathbf{a}) = \partial_{21} f(\mathbf{a}).
$$

??? note "Proof:"

    Will be added later.
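The theorem can be illustrated with the example choice f(x, y) = x³y², which is not from the text: by hand, ∂₂(∂₁f) = ∂₂(3x²y²) = 6x²y and ∂₁(∂₂f) = ∂₁(2x³y) = 6x²y, so the two orders agree. A central finite difference confirms the common value numerically:

```python
# Mixed partial of f(x, y) = x**3 * y**2 via a central finite difference;
# both differentiation orders give 6*x**2*y analytically.

def f(x, y):
    return x**3 * y**2

x, y, h = 1.5, 0.5, 1e-4

mixed = (f(x + h, y + h) - f(x + h, y - h)
         - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)

print(mixed, 6 * x**2 * y)  # both ≈ 6.75
```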

Total derivatives

Definition: let D \subseteq \mathbb{R}^n (n = 2 for simplicity) and let f: D \to \mathbb{R}. We seek an affine linear approximation of f around \mathbf{a} \in D,

$$
    p(\mathbf{x}) = f(\mathbf{a}) + \big\langle L,\; \mathbf{x} - \mathbf{a} \big\rangle,
$$

and, writing f(\mathbf{x}) = p(\mathbf{x}) + r(\mathbf{x}), we demand that \frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{a}\|} \to 0 as \mathbf{x} \to \mathbf{a}.

If an L \in \mathbb{R}^2 exists that satisfies this, then f is called totally differentiable at \mathbf{a}.

Theorem: if f is totally differentiable at \mathbf{a}, then f is partially differentiable at \mathbf{a} with partial derivatives

$$
    \partial_1 f(\mathbf{a}) = L_1, \qquad \partial_2 f(\mathbf{a}) = L_2,
$$

so the linearization becomes

$$
    p(\mathbf{x}) = f(\mathbf{a}) + \big\langle \nabla f(\mathbf{a}),\; \mathbf{x} - \mathbf{a} \big\rangle,
$$

with \nabla f(\mathbf{a}) the gradient of f at \mathbf{a}.

??? note "Proof:"

    Will be added later.
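The defining property of the total derivative, r(x)/‖x − a‖ → 0, can be observed numerically. A sketch assuming the illustrative function f(x, y) = eˣy around a = (0, 1), with ∇f(a) = (1, 1) computed by hand; none of these choices come from the text:

```python
import math

# Linearization p(x) = f(a) + <grad_a, x - a> of f(x, y) = exp(x) * y
# around a = (0, 1); the ratio r(x)/||x - a|| should shrink toward 0.

def f(x, y):
    return math.exp(x) * y

a = (0.0, 1.0)
grad_a = (1.0, 1.0)  # ∂₁f = eˣ·y, ∂₂f = eˣ, both equal 1 at a

for t in (1e-1, 1e-2, 1e-3):
    x, y = a[0] + t, a[1] + t          # approach a along the diagonal
    p = f(*a) + grad_a[0] * (x - a[0]) + grad_a[1] * (y - a[1])
    r = f(x, y) - p                    # remainder of the linearization
    norm = math.hypot(x - a[0], y - a[1])
    print(t, r / norm)                 # ratio decreases with t
```

The printed ratio shrinks roughly linearly in t, which is exactly the "faster than ‖x − a‖" decay the definition demands.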

Chain rule

Definition: let D \subseteq \mathbb{R}^n (n = 2 for simplicity), let f: D \to \mathbb{R} and let \mathbf{x}: \mathbb{R} \to D be a differentiable curve. Define g: \mathbb{R} \to \mathbb{R} by

$$
    g(t) = f\big(\mathbf{x}(t)\big).
$$

If f is continuously differentiable, then g is differentiable with

$$
    g'(t) = \big\langle \nabla f\big(\mathbf{x}(t)\big),\; \dot{\mathbf{x}}(t) \big\rangle.
$$
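The chain rule along a curve can be verified numerically. A sketch with the illustrative choices f(x, y) = xy and x(t) = (cos t, sin t), so that g(t) = cos t · sin t and, by hand, g'(t) = cos 2t; these choices are not from the text:

```python
import math

# Compare a finite-difference derivative of g(t) = f(x(t)) with the
# chain-rule inner product <grad f(x(t)), xdot(t)>.

def g(t):
    return math.cos(t) * math.sin(t)   # f(x, y) = x*y along x(t) = (cos t, sin t)

t = 0.7
h = 1e-6
numeric = (g(t + h) - g(t - h)) / (2 * h)      # central-difference g'(t)

grad = (math.sin(t), math.cos(t))              # ∇f(x, y) = (y, x) at x(t)
xdot = (-math.sin(t), math.cos(t))             # velocity of the curve
chain = grad[0] * xdot[0] + grad[1] * xdot[1]  # <∇f(x(t)), ẋ(t)> = cos 2t

print(numeric, chain, math.cos(2 * t))  # all ≈ 0.16997
```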

Gradients

Definition: at any point \mathbf{x} \in D where the first partial derivatives of f exist, we define the gradient vector \nabla f(\mathbf{x}) by

$$
    \nabla f(\mathbf{x}) = \begin{pmatrix} \partial_1 f(\mathbf{x}) \\ \partial_2 f(\mathbf{x}) \end{pmatrix}.
$$

The direction of the gradient is the direction of steepest increase of f at \mathbf{x}.


Theorem: gradients are orthogonal to level lines and level surfaces.

??? note "Proof:"

    Let $\mathbf{r}(t) = \big(x(t),\; y(t) \big)^T$ be a parameterization of the level curve of $f$ through $\mathbf{a}$ such that $\mathbf{r}(0) = \mathbf{a}$. Then for all $t$ near $0$, $f(\mathbf{r}(t)) = f(\mathbf{a})$. Differentiating this equation with respect to $t$ using the chain rule, we obtain

    $$
        \partial_1 f\big(\mathbf{r}(t)\big) \dot x(t) + \partial_2 f\big(\mathbf{r}(t)\big) \dot y(t) = 0.
    $$

    At $t = 0$ we can rewrite this as

    $$
        \big\langle \nabla f(\mathbf{a}),\; \dot{\mathbf{r}}(0) \big\rangle = 0,
    $$

    showing that $\nabla f(\mathbf{a})$ is orthogonal to the tangent vector $\dot{\mathbf{r}}(0)$.
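The orthogonality can be seen concretely for the illustrative function f(x, y) = x² + y², whose level curve f = R² is the circle r(t) = (R cos t, R sin t); this example is not from the text:

```python
import math

# The gradient of f(x, y) = x**2 + y**2 should be orthogonal to the
# tangent of its circular level curve r(t) = (R cos t, R sin t) at every t.

R = 2.0

def grad_f(x, y):
    return (2 * x, 2 * y)

for t in (0.0, 0.5, 1.3, 2.9):
    x, y = R * math.cos(t), R * math.sin(t)         # point on the level curve
    xdot, ydot = -R * math.sin(t), R * math.cos(t)  # tangent vector ṙ(t)
    g = grad_f(x, y)
    dot = g[0] * xdot + g[1] * ydot                 # <∇f(r(t)), ṙ(t)>
    print(t, dot)  # ≈ 0 every time
```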

Directional derivatives

Definition: let D \subseteq \mathbb{R}^n, let f: D \to \mathbb{R} and let \mathbf{v} \in \mathbb{R}^n with \|\mathbf{v}\| = 1 be a unit vector. The directional derivative measures the change of f near a point \mathbf{a} \in D in the direction of \mathbf{v}; for totally differentiable f it is given by

$$
    D_\mathbf{v} f(\mathbf{a}) = \big\langle \mathbf{v},\; \nabla f(\mathbf{a}) \big\rangle.
$$
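The inner-product formula can be compared against the direct difference quotient (f(a + h·v) − f(a))/h. A sketch assuming the illustrative function f(x, y) = x² + 3y, point a = (1, 2) and unit direction v = (3/5, 4/5), none of which come from the text:

```python
# Directional derivative of f(x, y) = x**2 + 3*y at a = (1, 2)
# in the unit direction v = (0.6, 0.8): formula vs. difference quotient.

def f(x, y):
    return x**2 + 3 * y

a = (1.0, 2.0)
v = (0.6, 0.8)                        # ‖v‖ = 1
grad_a = (2 * a[0], 3.0)              # ∇f(x, y) = (2x, 3), evaluated at a

formula = v[0] * grad_a[0] + v[1] * grad_a[1]   # <v, ∇f(a)> = 3.6

h = 1e-6
quotient = (f(a[0] + h * v[0], a[1] + h * v[1]) - f(*a)) / h

print(formula, quotient)  # both ≈ 3.6
```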

The general case

Definition: let D \subseteq \mathbb{R}^n and let \mathbf{f}: D \to \mathbb{R}^m, where f_i: D \to \mathbb{R}, i = 1, \dotsc, m, are the components of \mathbf{f}.

  • \mathbf{f} is continuous at \mathbf{a} \in D \iff all f_i continuous at \mathbf{a},
  • \mathbf{f} is partially/totally differentiable at \mathbf{a} \iff all f_i are partially/totally differentiable at \mathbf{a}.

For the linearization of each component f_i we have

$$
    f_i(\mathbf{x}) = f_i(\mathbf{a}) + \big\langle \nabla f_i(\mathbf{a}),\; \mathbf{x} - \mathbf{a} \big\rangle + r_i(\mathbf{x}),
$$

so in total we have

$$
    \mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a}) \big(\mathbf{x} - \mathbf{a}\big) + \mathbf{r}(\mathbf{x}),
$$

with D\mathbf{f}(\mathbf{a}) the Jacobian of \mathbf{f}.

Definition: the Jacobian is given by $\big[D\mathbf{f}(\mathbf{a}) \big]_{i,j} = \partial_j f_i(\mathbf{a})$.
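The entry formula [Df(a)]ᵢⱼ = ∂ⱼfᵢ(a) translates directly into a finite-difference construction. A sketch for the illustrative map f(x, y) = (xy, x + y²), whose exact Jacobian is [[y, x], [1, 2y]]; this example is not from the text:

```python
# Build the m x n Jacobian of f(x, y) = (x*y, x + y**2) at a = (2, 3)
# column by column: column j holds the partials ∂ⱼf_i of all components.

def f(x, y):
    return (x * y, x + y**2)

a = [2.0, 3.0]
h = 1e-6
m = len(f(*a))          # number of components f_i
n = len(a)              # number of variables

J = [[0.0] * n for _ in range(m)]
for j in range(n):
    shifted = a.copy()
    shifted[j] += h     # perturb only the j-th variable
    fa, fs = f(*a), f(*shifted)
    for i in range(m):
        J[i][j] = (fs[i] - fa[i]) / h   # ∂ⱼ f_i(a)

print(J)  # ≈ [[3, 2], [1, 6]], matching [[y, x], [1, 2y]] at (2, 3)
```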

Chain rule

Let D \subseteq \mathbb{R}^n and let E \subseteq \mathbb{R}^m be sets and let \mathbf{f}: D \to \mathbb{R}^m and let \mathbf{g}: E \to \mathbb{R}^k with \mathbf{f} differentiable at \mathbf{x} and \mathbf{g} differentiable at \mathbf{f}(\mathbf{x}). Then D\mathbf{f}(\mathbf{x}) \in \mathbb{R}^{m \times n} and D\mathbf{g}\big(\mathbf{f}(\mathbf{x})\big) \in \mathbb{R}^{k \times m}.

Then if we differentiate \mathbf{g} \circ \mathbf{f} we obtain

$$
    D(\mathbf{g} \circ \mathbf{f})(\mathbf{x}) = D\mathbf{g}\big(\mathbf{f}(\mathbf{x})\big) \, D\mathbf{f}(\mathbf{x}).
$$

We have two interpretations:

  • the composition of the linear maps,
  • the product of the Jacobian matrices.

??? note "Proof:"

    Will be added later.
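The matrix form of the chain rule can be verified numerically. A sketch with the illustrative maps f(x, y) = (x + y, xy), with Df = [[1, 1], [y, x]], and g(u, v) = (uv, u − v, u²), with Dg = [[v, u], [1, −1], [2u, 0]]; here n = 2, m = 2, k = 3, and none of the maps come from the text:

```python
# Check D(g∘f)(x) = Dg(f(x)) · Df(x): compare the matrix product of the
# hand-computed Jacobians with a finite-difference Jacobian of g∘f.

def f(x, y):
    return (x + y, x * y)

def g(u, v):
    return (u * v, u - v, u**2)

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x, y = 1.0, 2.0
u, v = f(x, y)

Df = [[1.0, 1.0], [y, x]]                # 2 x 2 Jacobian of f at (x, y)
Dg = [[v, u], [1.0, -1.0], [2 * u, 0.0]] # 3 x 2 Jacobian of g at f(x, y)
product = matmul(Dg, Df)                 # chain-rule prediction, 3 x 2

h = 1e-6
comp = lambda x, y: g(*f(x, y))          # the composition g∘f
numeric = [[(comp(x + h, y)[i] - comp(x, y)[i]) / h,
            (comp(x, y + h)[i] - comp(x, y)[i]) / h] for i in range(3)]

print(product)   # exact: [[8, 5], [-1, 0], [6, 6]]
print(numeric)   # ≈ the same matrix
```

Note the shapes: Dg is k×m and Df is m×n, so the product is the k×n Jacobian of the composition, exactly as the theorem states.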