mathematics-physics-wiki/docs/en/mathematics/multivariable-calculus/differentation.md

# Differentiation

Generalization of derivatives to higher dimensions:

* limit of difference quotient: partial derivatives,
* linearization: total derivative.

## Partial derivatives

*Definition*: let $D \subseteq \mathbb{R}^n$ ($n=2$ for simplicity), let $f: D \to \mathbb{R}$ and let $\mathbf{a} \in D$. If the limits exist, the partial derivatives of $f$ at $\mathbf{a}$ are

$$
\begin{align*}
    &\partial_1 f(\mathbf{a}) := \lim_{h \to 0} \frac{f(a_1 + h, a_2) - f(\mathbf{a})}{h}, \\
    &\partial_2 f(\mathbf{a}) := \lim_{h \to 0} \frac{f(a_1, a_2 + h) - f(\mathbf{a})}{h}.
\end{align*}
$$
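
The difference quotients above can be approximated numerically. The following is a minimal sketch, assuming a hypothetical example function $f(x, y) = x^2 y + y^3$ and a helper named `partial` (both introduced here for illustration). It uses a symmetric difference quotient, which converges to the same value as the one-sided quotient in the definition whenever the partial derivative exists.

```python
def partial(f, a, i, h=1e-6):
    """Approximate the i-th partial derivative of f at the point a.

    Only the i-th coordinate of a is perturbed, all others stay fixed.
    """
    forward = list(a)
    backward = list(a)
    forward[i] += h
    backward[i] -= h
    return (f(*forward) - f(*backward)) / (2 * h)


def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 * y + y^3
    return x**2 * y + y**3


a = (1.0, 2.0)
d1 = partial(f, a, 0)  # exact value by hand: 2 * x * y = 4
d2 = partial(f, a, 1)  # exact value by hand: x^2 + 3 * y^2 = 13
print(d1, d2)
```

The step size `h` is a trade-off: too large and the quotient is far from its limit, too small and floating-point cancellation dominates.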

*Theorem*: suppose that two mixed $n$th order partial derivatives of a function $f$ involve the same differentiations but in different orders. If those partials are continuous at a point $\mathbf{a}$, and if $f$ and all partials of $f$ of order less than $n$ are continuous in a neighbourhood of $\mathbf{a}$, then the two mixed partials are equal at $\mathbf{a}$. For $n=2$ this reads

$$
    \partial_{12} f(\mathbf{a}) = \partial_{21} f(\mathbf{a}).
$$

??? note "*Proof*:"

    Will be added later.
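
As a quick check of the theorem, consider the illustrative function $f(x, y) = x^2 y + y^3$ (an example chosen here, not part of the statement). Its first-order partials are $\partial_1 f = 2xy$ and $\partial_2 f = x^2 + 3y^2$, both continuous everywhere, and differentiating once more in the other variable gives

$$
    \partial_2 \big(\partial_1 f\big)(x, y) = 2x = \partial_1 \big(\partial_2 f\big)(x, y),
$$

so the two mixed second-order partials agree at every point, as the theorem predicts.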

## Total derivatives

*Definition*: let $D \subseteq \mathbb{R}^n$ ($n=2$ for simplicity), let $f: D \to \mathbb{R}$ and let $\mathbf{a} \in D$. We look for an affine linear approximation of $f$ around $\mathbf{a}$,

$$
    p(\mathbf{x}) = f(\mathbf{a}) + \big\langle L,\; \mathbf{x} - \mathbf{a} \big\rangle,
$$

Writing $f(\mathbf{x}) = p(\mathbf{x}) + r(\mathbf{x})$, we demand that $\frac{r(\mathbf{x})}{\|\mathbf{x} - \mathbf{a}\|} \to 0$ as $\mathbf{x} \to \mathbf{a}$.

If an $L \in \mathbb{R}^2$ exists that satisfies this, then $f$ is called totally differentiable at $\mathbf{a}$.

*Theorem*: if $f$ is totally differentiable at $\mathbf{a}$, then $f$ is partially differentiable at $\mathbf{a}$ and the partial derivatives are

$$
    \partial_1 f(\mathbf{a}) = L_1, \qquad \partial_2 f(\mathbf{a}) = L_2,
$$

obtaining

$$
    p(\mathbf{x}) = f(\mathbf{a}) + \big\langle \nabla f(\mathbf{a}),\; \mathbf{x} - \mathbf{a} \big\rangle,
$$

with $\nabla f(\mathbf{a})$ the gradient of $f$ at $\mathbf{a}$.

??? note "*Proof*:"

    Will be added later.
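
The defining property of the total derivative, that the remainder vanishes faster than the distance to $\mathbf{a}$, can be observed numerically. This is a minimal sketch, assuming a hypothetical example function $f(x, y) = x^2 y + y^3$ with its gradient computed by hand; the names `grad_f` and `remainder_ratio` are introduced here for illustration.

```python
import math


def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 * y + y^3
    return x**2 * y + y**3


def grad_f(x, y):
    # Gradient of the example function, computed by hand
    return (2 * x * y, x**2 + 3 * y**2)


def remainder_ratio(a, direction, t):
    """Return |r(x)| / ||x - a|| for the point x = a + t * direction."""
    x = (a[0] + t * direction[0], a[1] + t * direction[1])
    step = (x[0] - a[0], x[1] - a[1])
    g = grad_f(*a)
    p = f(*a) + g[0] * step[0] + g[1] * step[1]  # affine approximation p(x)
    r = f(*x) - p                                 # remainder r(x)
    return abs(r) / math.hypot(*step)


a = (1.0, 2.0)
ratios = [remainder_ratio(a, (0.6, 0.8), t) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)  # the ratios shrink as x approaches a
```

Only one direction of approach is sampled here, so this illustrates the condition rather than verifying it; the definition requires the ratio to vanish for every way of approaching $\mathbf{a}$.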

## Chain rule

*Theorem*: let $D \subseteq \mathbb{R}^n$ ($n=2$ for simplicity), let $f: D \to \mathbb{R}$ and let $\mathbf{x}: \mathbb{R} \to D$ be a differentiable curve. Define $g: \mathbb{R} \to \mathbb{R}$ by

$$
    g(t) = f\big(\mathbf{x}(t)\big),
$$

if $f$ is continuously differentiable, then $g$ is differentiable with

$$
    g'(t) = \big\langle \nabla f\big(\mathbf{x}(t)\big),\; \mathbf{\dot x}(t) \big\rangle.
$$
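
The formula for $g'(t)$ can be compared against a direct numerical derivative of $g$. The sketch below assumes a hypothetical function $f(x, y) = x^2 y + y^3$ and the unit circle as the curve $\mathbf{x}(t)$; all names are illustrative.

```python
import math


def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 * y + y^3
    return x**2 * y + y**3


def grad_f(x, y):
    # Gradient of the example function, computed by hand
    return (2 * x * y, x**2 + 3 * y**2)


def curve(t):
    # Example curve x(t): the unit circle
    return (math.cos(t), math.sin(t))


def curve_dot(t):
    # Velocity of the curve, computed by hand
    return (-math.sin(t), math.cos(t))


def g(t):
    return f(*curve(t))


def g_prime_chain_rule(t):
    """Evaluate <grad f(x(t)), x'(t)>."""
    gx, gy = grad_f(*curve(t))
    vx, vy = curve_dot(t)
    return gx * vx + gy * vy


t0 = 0.7
h = 1e-6
numeric = (g(t0 + h) - g(t0 - h)) / (2 * h)  # symmetric difference quotient
print(g_prime_chain_rule(t0), numeric)
```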

## Gradients

*Definition*: at any point $\mathbf{x} \in D$ where the first partial derivatives of $f$ exist, we define the gradient vector $\nabla f(\mathbf{x})$ by

$$
    \nabla f(\mathbf{x}) = \begin{pmatrix} \partial_1 f(\mathbf{x}) \\ \partial_2 f(\mathbf{x}) \end{pmatrix}.
$$

If $f$ is totally differentiable at $\mathbf{x}$ and $\nabla f(\mathbf{x}) \neq \mathbf{0}$, the gradient points in the direction of steepest increase of $f$ at $\mathbf{x}$.

<br>

*Theorem*: gradients are orthogonal to level lines and level surfaces.

??? note "*Proof*:"

    Let $\mathbf{r}(t) = \big(x(t),\; y(t) \big)^T$ be a differentiable parameterization of the level curve of $f$ through $\mathbf{a}$, with $\mathbf{r}(0) = \mathbf{a}$. Then for all $t$ near $0$, $f(\mathbf{r}(t)) = f(\mathbf{a})$. Differentiating this equation with respect to $t$ using the chain rule, we obtain

    $$
        \partial_1 f\big(\mathbf{r}(t)\big) \dot x(t) + \partial_2 f\big(\mathbf{r}(t)\big) \dot y(t) = 0.
    $$

    At $t=0$ this can be rewritten as

    $$
        \big\langle \nabla f(\mathbf{a}),\; \mathbf{\dot r}(0) \big\rangle = 0,
    $$

    so $\nabla f(\mathbf{a})$ is orthogonal to the tangent vector $\mathbf{\dot r}(0)$ of the level curve.
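
For a concrete instance, take the illustrative function $f(x, y) = x^2 + y^2$ (an assumption made for this example). Its level curve through a point at distance $R > 0$ from the origin is the circle $\mathbf{r}(t) = R \big(\cos t,\; \sin t\big)^T$, with $\nabla f(\mathbf{r}(t)) = 2R \big(\cos t,\; \sin t\big)^T$ and $\mathbf{\dot r}(t) = R \big(-\sin t,\; \cos t\big)^T$, so that

$$
    \big\langle \nabla f(\mathbf{r}(t)),\; \mathbf{\dot r}(t) \big\rangle = 2R^2 \big(-\cos t \sin t + \sin t \cos t\big) = 0.
$$

The gradient points radially outward while the tangent runs along the circle, which is the orthogonality stated in the theorem.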

## Directional derivatives

*Definition*: let $D \subseteq \mathbb{R}^n$, let $f: D \to \mathbb{R}$ and let $\mathbf{v} \in \mathbb{R}^n$ be a unit vector, $\|\mathbf{v}\| = 1$. The directional derivative is the rate of change of $f$ at a point $\mathbf{a} \in D$ in the direction of $\mathbf{v}$. For totally differentiable $f$ it is given by

$$  
    D_\mathbf{v} f(\mathbf{a}) = \big\langle \mathbf{v},\; \nabla f(\mathbf{a}) \big\rangle.
$$
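
The inner-product formula can be checked against a difference quotient taken along the line $\mathbf{a} + h\mathbf{v}$. This is a minimal sketch, assuming a hypothetical example function $f(x, y) = x^2 y + y^3$ and the unit vector $\mathbf{v} = (0.6, 0.8)^T$; the names are illustrative.

```python
def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 * y + y^3
    return x**2 * y + y**3


def grad_f(x, y):
    # Gradient of the example function, computed by hand
    return (2 * x * y, x**2 + 3 * y**2)


def directional_derivative(a, v):
    """Evaluate <v, grad f(a)> for a unit vector v."""
    g = grad_f(*a)
    return v[0] * g[0] + v[1] * g[1]


a = (1.0, 2.0)
v = (0.6, 0.8)  # unit vector: 0.36 + 0.64 = 1
h = 1e-6

# Symmetric difference quotient of f along the line a + h * v
numeric = (
    f(a[0] + h * v[0], a[1] + h * v[1]) - f(a[0] - h * v[0], a[1] - h * v[1])
) / (2 * h)

exact = directional_derivative(a, v)  # by hand: 0.6 * 4 + 0.8 * 13 = 12.8
print(exact, numeric)
```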

## The general case

*Definition*: let $D \subseteq \mathbb{R}^n$ and let $\mathbf{f}: D \to \mathbb{R}^m$, with components $f_i: D \to \mathbb{R}$ for $i = 1, \dotsc, m$.

* $\mathbf{f}$ is continuous at $\mathbf{a} \in D$ $\iff$ all $f_i$ are continuous at $\mathbf{a}$,
* $\mathbf{f}$ is partially/totally differentiable at $\mathbf{a}$ $\iff$ all $f_i$ are partially/totally differentiable at $\mathbf{a}$. 

For the linearization of every component $f_i$ we have

$$
    f_i(\mathbf{x}) = f_i(\mathbf{a}) + \big\langle \nabla f_i(\mathbf{a}),\; \mathbf{x} - \mathbf{a} \big\rangle + r_i(\mathbf{x}),
$$

so in total we have

$$
    \mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{a}) + D\mathbf{f}(\mathbf{a}) \big(\mathbf{x} - \mathbf{a}\big) + \mathbf{r}(\mathbf{x}),
$$

with $D\mathbf{f}(\mathbf{a})$ the Jacobian of $\mathbf{f}$.

*Definition*: the Jacobian is given by $\big[D\mathbf{f}(\mathbf{a}) \big]_{i,\;j} = \partial_j f_i(\mathbf{a}).$
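
The entries $\partial_j f_i(\mathbf{a})$ can be approximated column by column with difference quotients. The sketch below assumes a hypothetical helper `jacobian` and uses the polar-coordinate map as an example; neither is part of the definition above.

```python
import math


def jacobian(f, a, h=1e-6):
    """Approximate the Jacobian of f at a; entry [i][j] is d f_i / d x_j."""
    m = len(f(a))
    n = len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th coordinate to fill the j-th column
        forward = list(a)
        backward = list(a)
        forward[j] += h
        backward[j] -= h
        f_plus = f(forward)
        f_minus = f(backward)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J


def polar(p):
    # Example map from R^2 to R^2: (r, phi) -> (r cos phi, r sin phi)
    r, phi = p
    return [r * math.cos(phi), r * math.sin(phi)]


J = jacobian(polar, [2.0, 0.5])
# By hand: [[cos phi, -r sin phi], [sin phi, r cos phi]] at r = 2, phi = 0.5
print(J)
```

Row $i$ of the result approximates the gradient of the component $f_i$, matching the component-wise linearization.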

### Chain rule

Let $D \subseteq \mathbb{R}^n$ and $E \subseteq \mathbb{R}^m$, and let $\mathbf{f}: D \to \mathbb{R}^m$ with $\mathbf{f}(D) \subseteq E$ and $\mathbf{g}: E \to \mathbb{R}^k$, with $\mathbf{f}$ differentiable at $\mathbf{x} \in D$ and $\mathbf{g}$ differentiable at $\mathbf{f}(\mathbf{x})$. Then $D\mathbf{f}(\mathbf{x}) \in \mathbb{R}^{m \times n}$ and $D\mathbf{g}\big(\mathbf{f}(\mathbf{x})\big) \in \mathbb{R}^{k \times m}$.

Differentiating $\mathbf{g} \circ \mathbf{f}$ we obtain

$$
    D(\mathbf{g} \circ \mathbf{f})(\mathbf{x}) = D\mathbf{g}\big(\mathbf{f}(\mathbf{x})\big) D\mathbf{f}(\mathbf{x}).
$$

This can be read in two ways:

* as the composition of linear maps,
* as the matrix multiplication of the Jacobians.

??? note "*Proof*:"

    Will be added later.
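
The matrix form of the chain rule can be illustrated with a small example. This sketch assumes two hypothetical maps, the polar-coordinate map $\mathbf{f}(r, \varphi) = (r \cos\varphi,\; r \sin\varphi)^T$ and $\mathbf{g}(x, y) = x^2 + y^2$, with Jacobians computed by hand; the helper `matmul` is introduced here for illustration. Since $\mathbf{g}\big(\mathbf{f}(r, \varphi)\big) = r^2$, the product of the Jacobians should equal $(2r,\; 0)$.

```python
import math


def f(p):
    # Example inner map from R^2 to R^2: polar coordinates
    r, phi = p
    return [r * math.cos(phi), r * math.sin(phi)]


def Df(p):
    # Jacobian of f (2 x 2), computed by hand
    r, phi = p
    return [
        [math.cos(phi), -r * math.sin(phi)],
        [math.sin(phi), r * math.cos(phi)],
    ]


def Dg(q):
    # Jacobian (1 x 2) of the example outer map g(x, y) = x^2 + y^2
    x, y = q
    return [[2 * x, 2 * y]]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


p = [2.0, 0.5]
product = matmul(Dg(f(p)), Df(p))  # (1 x 2) times (2 x 2) gives 1 x 2
print(product)  # should be close to [[2 * r, 0]] = [[4, 0]]
```

Note that $D\mathbf{g}$ is evaluated at $\mathbf{f}(\mathbf{x})$, not at $\mathbf{x}$, and that the shapes $k \times m$ and $m \times n$ fix the order of the factors.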