Tangent Vector on Differentiable Manifold
Buildup [1]
To define a tangent vector at a point of a differentiable manifold $M$, suppose a differentiable curve $\alpha : (-\epsilon , \epsilon) \to M$ is given. As in differential geometry, we would like to define the derivative $\dfrac{d \alpha}{dt}(0)$ of $\alpha$ at $t=0$ as a tangent vector. However, the codomain of $\alpha$ is $M$, which is not guaranteed to sit inside a Euclidean space, so the difference quotient $\dfrac{\alpha (t) - \alpha (0)}{t}$ has no meaning and we cannot speak of the derivative of $\alpha$ directly. For this reason, tangent vectors on a manifold are defined as functions, namely operators. If you have studied differential geometry, treating vectors as operators should be familiar. See the following explanation.
Let $\mathbf{X} \in T_{p}M$ be a tangent vector at a point $p$ of a surface $M$, and let $\alpha : (-\epsilon, \epsilon) \to M$ be a curve on $M$ with $\alpha (0) = p$ and $\mathbf{X} = \dfrac{d \alpha}{d t} (0)$. Let $f$ be a differentiable function defined in some neighborhood of $p$ on $M$. Then the directional derivative $\mathbf{X}f$ of $f$ in the direction of $\mathbf{X}$ is defined as follows:
$$ \mathbf{X} : \mathcal{D} \to \mathbb{R}, \quad \text{where } \mathcal{D} \text{ is the set of all differentiable functions near } p $$
$$ \mathbf{X} f := \dfrac{d}{dt} (f \circ \alpha) (0) $$
As the definition above shows, once a tangent vector $\mathbf{X}$ is fixed, $\mathbf{X}f$ is determined for each given $f$. A tangent vector can therefore itself be treated as an operator, and the notation $\mathbf{X}f$ reflects this point of view. Tangent vectors on a differentiable manifold are defined similarly: as functions that, given a differentiable function $f$ on $M$, return a real number through the composition of $f$ with some curve $\alpha$.
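The operator point of view can be sketched numerically. The following is a minimal illustration, not part of the source: the helper `tangent_vector`, the curve `alpha` on the unit sphere, and the function `f` are all hypothetical choices, and the derivative is approximated by a central difference rather than computed exactly.

```python
import math

def tangent_vector(alpha, h=1e-5):
    """Return the operator f -> d/dt (f o alpha)(0), approximated by a central difference."""
    def X(f):
        return (f(alpha(h)) - f(alpha(-h))) / (2 * h)
    return X

# Hypothetical curve on the unit sphere through p = alpha(0) = (1, 0, 0).
def alpha(t):
    return (math.cos(t), math.sin(t), 0.0)

# Hypothetical differentiable function near p.
def f(q):
    x, y, z = q
    return x + 2.0 * y + z * z

X = tangent_vector(alpha)  # the tangent vector, stored as an operator
Xf = X(f)                  # exact value: d/dt (cos t + 2 sin t) at t = 0, which is 2
```

Note that `X` never differentiates `alpha` or `f` separately; it only differentiates the real function $f \circ \alpha$ of one real variable.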
Definition
Let $M$ be an $n$-dimensional differentiable manifold. A differentiable function $\alpha : (-\epsilon , \epsilon) \to M$ is called a differentiable curve on $M$. Assuming $\alpha (0)=p\in M$, define $\mathcal{D}$ as the set of functions on $M$ that are differentiable at $p$.
$$ \mathcal{D} := \left\{ f : M \to \mathbb{R} \mid f \text{ is differentiable at } p \right\} $$
Then, the tangent vector $\alpha^{\prime}(0) : \mathcal{D} \to \mathbb{R}$ at $\alpha (0) = p$ is defined as the following function.
$$ \alpha^{\prime} (0) f = \dfrac{d}{dt} (f\circ \alpha)(0),\quad f\in \mathcal{D} $$
The set of all tangent vectors at point $p\in M$ is called the tangent space and is denoted as $T_{p}M$.
Explanation
$f : M \to \mathbb{R}$ and $\alpha : (-\epsilon, \epsilon) \to M$ cannot be differentiated in the classical sense, because the domain of $f$ and the codomain of $\alpha$ are $M$, which is not guaranteed to be a Euclidean space. Their composition $f \circ \alpha : (-\epsilon, \epsilon) \to \mathbb{R}$, however, is an ordinary real function of one real variable and can be differentiated.
Every differentiable curve $\alpha$ through $p$ determines a tangent vector, so at first glance there seem to be as many tangent vectors as there are differentiable curves. However, even if two tangent vectors $\mathbf{X}, \mathbf{Y}$ are determined by two different curves $\alpha$ and $\beta$, they are considered the same tangent vector whenever $\mathbf{X}f = \mathbf{Y}f$ holds for all $f \in \mathcal{D}$.
The set of tangent vectors $T_{p}M$ is called a tangent space because it is in fact an $n$-dimensional vector space.
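A small numerical sketch of two different curves defining the same tangent vector is given below. It is only an illustration under assumptions: the curves `alpha` and `beta` in $\mathbb{R}^{2}$ and the three test functions are hypothetical, and checking finitely many functions with a finite-difference derivative suggests, but does not prove, that $\mathbf{X}f = \mathbf{Y}f$ for all $f \in \mathcal{D}$.

```python
import math

def tangent_vector(curve, h=1e-5):
    # f -> d/dt (f o curve)(0), approximated by a central difference
    return lambda f: (f(curve(h)) - f(curve(-h))) / (2 * h)

# Two different curves in R^2 through p = (0, 0), both with velocity (1, 2) at t = 0.
alpha = lambda t: (t, 2.0 * t)
beta = lambda t: (math.sin(t), 2.0 * t + t ** 2)

# A few hypothetical test functions, each differentiable at p.
test_functions = [
    lambda q: q[0] + q[1],
    lambda q: math.exp(q[0]) * math.cos(q[1]),
    lambda q: q[0] * q[1] + 3.0 * q[1],
]

X, Y = tangent_vector(alpha), tangent_vector(beta)

# |Xf - Yf| for each test function; all entries should be close to zero.
differences = [abs(X(f) - Y(f)) for f in test_functions]
```

The curves differ for every $t \ne 0$, yet the operators they define agree on each test function up to discretization error.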
By the theorem introduced below, the value $\alpha^{\prime}(0)f$ of the tangent vector at $p$ can be expressed in terms of any coordinate system $\mathbf{x} : U \to M$ around $p$, and this value does not depend on the choice of $\mathbf{x}$.
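The independence from the coordinate system can be checked numerically with the coordinate expression $\sum_{i} x_{i}^{\prime}(0)\, \partial (f \circ \mathbf{x}) / \partial u_{i}$ from the Theorem section. The sketch below is an assumption-laden illustration: the two charts (Cartesian and polar on a region of $\mathbb{R}^{2}$), the curve `alpha`, the function `f`, and the helper `coordinate_formula` are hypothetical, and all derivatives are central differences.

```python
import math

H = 1e-5

def derivative(g, t0):
    # central difference for a real function of one real variable
    return (g(t0 + H) - g(t0 - H)) / (2 * H)

def coordinate_formula(chart, chart_inv, alpha, f):
    """Evaluate sum_i x_i'(0) * d(f o x)/du_i at the coordinates of alpha(0)."""
    u0 = chart_inv(alpha(0.0))
    total = 0.0
    for i in range(len(u0)):
        # x_i'(0): derivative of the i-th component of x^{-1} o alpha at t = 0
        xi_prime = derivative(lambda t: chart_inv(alpha(t))[i], 0.0)

        # f o x restricted to the i-th coordinate line through u0
        def along_ui(s, i=i):
            u = list(u0)
            u[i] = s
            return f(chart(tuple(u)))

        total += xi_prime * derivative(along_ui, u0[i])
    return total

# Chart A: Cartesian coordinates (the identity map).
cartesian = lambda u: (u[0], u[1])
cartesian_inv = lambda q: (q[0], q[1])

# Chart B: polar coordinates, valid near p = (1, 1).
polar = lambda u: (u[0] * math.cos(u[1]), u[0] * math.sin(u[1]))
polar_inv = lambda q: (math.hypot(q[0], q[1]), math.atan2(q[1], q[0]))

alpha = lambda t: (1.0 + t, 1.0 + 2.0 * t)   # curve with alpha(0) = p = (1, 1)
f = lambda q: q[0] ** 2 * q[1] + q[1]        # test function

value_cartesian = coordinate_formula(cartesian, cartesian_inv, alpha, f)
value_polar = coordinate_formula(polar, polar_inv, alpha, f)
value_direct = derivative(lambda t: f(alpha(t)), 0.0)  # definition of alpha'(0) f
```

For these choices the exact value is $\nabla f(p) \cdot (1, 2) = (2, 2) \cdot (1, 2) = 6$, and all three computed values should agree with it up to discretization error.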
Example
Consider $T_{p}\mathbb{R}^{3}$. When a differentiable curve $\alpha : (-\epsilon, \epsilon) \to \mathbb{R}^{3}$ with $\alpha (0) = p$ is given, a 3-dimensional vector $\alpha^{\prime}(0) = \mathbf{v} = (v_{1}, v_{2}, v_{3}) \in \mathbb{R}^{3}$ is determined. Therefore, according to the definition, the tangent vector $\mathbf{X} = \alpha^{\prime}(0)$ acts on $f : \mathbb{R}^{3} \to \mathbb{R}$ as follows, with the partial derivatives evaluated at $p$:
$$ \mathbf{X}f = \dfrac{d (f\circ \alpha)}{d t}(0) = \sum \limits_{i} \dfrac{\partial f}{\partial x_{i}}\dfrac{d \alpha_{i}}{d t}(0) = \sum\limits_{i} v_{i} \dfrac{\partial f}{\partial x_{i}} $$
This is the same as the directional derivative in Euclidean space.
$$ \mathbf{v}[f] = \nabla _{\mathbf{v}}f = \mathbf{v} \cdot \nabla f = \sum \limits_{i} v_{i} \dfrac{\partial f}{\partial x_{i}} $$
Taking the directional derivative is essentially the same as treating the vector as an operator. Therefore, $\mathbf{X}$ can be identified with an element of $\mathbb{R}^{3}$, and the following holds:
$$ T_{p}\mathbb{R}^{3} \cong \mathbb{R}^{3} $$
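As a concrete check of this example, consider a straight line and a particular function. Both $\alpha$ and $f$ below are choices made here for illustration and do not come from the source.

$$ \alpha (t) = p + t\mathbf{v}, \qquad f(x_{1}, x_{2}, x_{3}) = x_{1}x_{2} + x_{3}^{2} $$

$$ \begin{align*} (f \circ \alpha)(t) =&\ (p_{1} + t v_{1})(p_{2} + t v_{2}) + (p_{3} + t v_{3})^{2} \\ \mathbf{X}f = \dfrac{d (f\circ \alpha)}{d t}(0) =&\ v_{1}p_{2} + v_{2}p_{1} + 2 v_{3}p_{3} \end{align*} $$

On the other hand, $\nabla f (p) = (p_{2}, p_{1}, 2p_{3})$, so $\mathbf{v} \cdot \nabla f(p) = v_{1}p_{2} + v_{2}p_{1} + 2v_{3}p_{3}$, which agrees with $\mathbf{X}f$.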
Theorem
Let a differentiable curve $\alpha$ with $\alpha (0) = p$ and a coordinate system $\mathbf{x} : U \to M$ at $p$ be given. Let $(u_{1}, \dots, u_{n})$ be the coordinates of $\mathbb{R}^{n}$, and let the coordinate functions $x_{i}$ be defined by
$$ (x_{1}(p), \dots, x_{n}(p)) = \mathbf{x}^{-1}(p) $$
Then, the following formula holds:
$$ \begin{align*} \alpha ^{\prime} (0) f =&\ \sum \limits_{i=1}^{n}x_{i}^{\prime}(0) \left.\dfrac{\partial (f\circ \mathbf{x})}{\partial u_{i}}\right|_{t=0} \\ =&\ \sum \limits_{i=1}^{n}x_{i}^{\prime}(0) \left.\dfrac{\partial f}{\partial x_{i}}\right|_{t=0} \end{align*} $$
Here $x_{i}^{\prime}(0)$ denotes $\dfrac{d (x_{i} \circ \alpha)}{d t}(0)$, and the subscript $t=0$ means that the partial derivative is evaluated at the coordinates $\mathbf{x}^{-1}(\alpha (0)) = \mathbf{x}^{-1}(p)$ of $p$. Therefore, $\alpha^{\prime}(0)$ can be written as the following differential operator:
$$ \begin{equation} \alpha ^{\prime} (0) = \sum \limits_{i=1}^{n}x_{i}^{\prime}(0) \left.\dfrac{\partial }{\partial x_{i}}\right|_{t=0} \end{equation} $$
Expressed as a coordinate vector with respect to the basis $\left\{ \left.\dfrac{\partial }{\partial x_{i}}\right|_{t=0} \right\}$, it is as follows:
$$ \alpha ^{\prime} (0) = \begin{bmatrix} x_{1}^{\prime}(0) \\ \vdots \\ x_{n}^{\prime}(0) \end{bmatrix} $$
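The statement can be tried out numerically on the unit sphere. The following is a rough sketch under assumptions that are not in the source: the longitude/latitude chart `chart`, the curve `alpha`, the function `f`, and the finite-difference helper `d` are all hypothetical choices made for illustration.

```python
import math

H = 1e-5

def d(g, t0):
    # central difference for a real function of one real variable
    return (g(t0 + H) - g(t0 - H)) / (2 * H)

def chart(u):
    # coordinate system x : U -> M (longitude u1, latitude u2 on the unit sphere)
    u1, u2 = u
    return (math.cos(u1) * math.cos(u2), math.sin(u1) * math.cos(u2), math.sin(u2))

def chart_inv(q):
    # x^{-1}, valid near p = (1, 0, 0)
    x, y, z = q
    return (math.atan2(y, x), math.asin(z))

def alpha(t):
    # curve on the sphere with alpha(0) = p = (1, 0, 0)
    n = math.sqrt(1.0 + 5.0 * t * t)
    return (1.0 / n, t / n, 2.0 * t / n)

def f(q):
    x, y, z = q
    return x * y + y + z

u0 = chart_inv(alpha(0.0))  # coordinates of p, here (0, 0)

# Components x_i'(0) of alpha'(0) with respect to the basis {d/dx_i}.
components = [d(lambda t, i=i: chart_inv(alpha(t))[i], 0.0) for i in range(2)]

def partial(i):
    # (d/dx_i) f := d(f o x)/du_i, evaluated at the coordinates of p
    def g(s):
        u = list(u0)
        u[i] = s
        return f(chart(tuple(u)))
    return d(g, u0[i])

basis_on_f = [partial(i) for i in range(2)]

lhs = d(lambda t: f(alpha(t)), 0.0)                       # alpha'(0) f by definition
rhs = sum(c * b for c, b in zip(components, basis_on_f))  # coordinate formula
```

For these choices a hand computation gives components $(1, 2)$, basis values $(2, 1)$ on $f$, and $\alpha^{\prime}(0)f = 4$; `lhs` and `rhs` should both be close to that value.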
Proof
Choose a coordinate system $\mathbf{x} : U \subset \mathbb{R}^{n} \to M$ such that $p = \mathbf{x}(0)$. To express the tangent vector in terms of the coordinate system, write $f\circ \alpha = f \circ \mathbf{x} \circ \mathbf{x}^{-1} \circ \alpha$. Since $\mathbf{x} \circ \mathbf{x}^{-1} = I$ is the identity function, this equality holds for any choice of coordinate system. Now regard $f \circ \mathbf{x}$ and $\mathbf{x}^{-1} \circ \alpha$ each as a single function, and $f \circ \alpha$ as their composition.
$$ f \circ \alpha = (f \circ \mathbf{x}) \circ (\mathbf{x}^{-1} \circ \alpha) $$
First, consider $f \circ \mathbf{x}$. Since $f \circ \mathbf{x} : U \subset \mathbb{R}^{n} \to \mathbb{R}$ is a real function of $n$ real variables, it can be expressed as follows and differentiated in the classical sense:
$$ f \circ \mathbf{x} = f \circ \mathbf{x} (u) = f \circ \mathbf{x} (u_{1}, u_{2}, \dots, u_{n}),\quad u=(u_{1},\dots,u_{n}) \in \mathbb{R}^{n} $$
Similarly, since $\mathbf{x}^{-1} \circ \alpha : (-\epsilon, \epsilon) \to \mathbb{R}^{n}$ (taking $\epsilon$ small enough that $\alpha$ stays inside $\mathbf{x}(U)$), it can be expressed as follows and differentiated in the classical sense:
$$ \begin{align*} \mathbf{x}^{-1} \circ \alpha (t) =&\ (x_{1}(\alpha (t)), x_{2}(\alpha (t)), \dots, x_{n}(\alpha (t))) \\ =&\ (x_{1}(t), x_{2}(t), \dots, x_{n}(t)) \end{align*} $$
Note that $x_{i}$ is a function $x_{i} : M \to \mathbb{R}$, and $x_{i}(t)$ is shorthand for $x_{i}(\alpha (t))$.
Thinking this way, $f \circ \alpha$ is a composition of two functions, mapped as $\mathbb{R} \overset{\mathbf{x}^{-1} \circ \alpha}{\longrightarrow} \mathbb{R}^{n} \overset{f\circ \mathbf{x}}{\longrightarrow} \mathbb{R}$. Therefore, by the chain rule, the following holds:
$$ \dfrac{d}{d t}(f \circ \alpha) = \dfrac{d}{dt} \left( (f\circ \mathbf{x}) \circ (\mathbf{x}^{-1} \circ \alpha) \right) = \sum \limits_{i=1}^{n}\dfrac{\partial (f\circ \mathbf{x})}{\partial u_{i}} \dfrac{d (\mathbf{x}^{-1} \circ \alpha )_{i}}{d t} = \sum \limits_{i=1}^{n}\dfrac{\partial (f\circ \mathbf{x})}{\partial u_{i}} \dfrac{d x_{i}}{d t} $$
Thus, we obtain the following:
$$ \begin{align*} \alpha^{\prime}(0) f :=&\ \dfrac{d}{dt} (f\circ \alpha)(0) \\ =&\ \sum \limits_{i=1}^{n} \left.\dfrac{\partial (f\circ \mathbf{x})}{\partial u_{i}}\right|_{t=0} \dfrac{d x_{i}}{d t}(0) \\ =&\ \sum \limits_{i=1}^{n} \left.\dfrac{\partial (f\circ \mathbf{x})}{\partial u_{i}}\right|_{t=0} x_{i}^{\prime}(0) \\ =&\ \sum \limits_{i=1}^{n} x_{i}^{\prime}(0) \left.\dfrac{\partial (f\circ \mathbf{x})}{\partial u_{i}}\right|_{t=0} \end{align*} $$
Here, let’s define $\left.\dfrac{\partial }{\partial x_{i}}\right|_{t=0}$ as the following operator:
$$ \left.\dfrac{\partial }{\partial x_{i}}\right|_{t=0} f := \left.\dfrac{\partial (f\circ \mathbf{x})}{\partial u_{i}}\right|_{t=0} $$
Summarizing the meaning of $\dfrac{\partial f}{\partial x_{i}}$:
$f$ cannot be differentiated directly since its domain is $M$. Therefore, consider its composition with the coordinate system $\mathbf{x} : U \subset \mathbb{R}^{n} \to M$. The composition $f \circ \mathbf{x}$ maps a subset of $\mathbb{R}^{n}$ to $\mathbb{R}$, and thus can be differentiated in the classical sense. Accordingly, $\dfrac{\partial f}{\partial x_{i}}$ is defined by first composing $f$ with $\mathbf{x}$ and then differentiating in Euclidean space $\mathbb{R}^{n}$ with respect to the $i$-th variable $u_{i}$.
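To make this definition concrete, here is a small worked example under assumed data that is not in the source: take $M = \mathbb{R}^{2} \setminus \left\{ 0 \right\}$ with the polar coordinate system on $U = (0, \infty) \times (0, 2\pi)$, and $f(a, b) = ab$.

$$ \mathbf{x}(u_{1}, u_{2}) = (u_{1}\cos u_{2}, u_{1}\sin u_{2}), \qquad (f \circ \mathbf{x})(u_{1}, u_{2}) = u_{1}^{2} \cos u_{2} \sin u_{2} = \dfrac{u_{1}^{2}}{2} \sin 2u_{2} $$

$$ \dfrac{\partial f}{\partial x_{1}} = \dfrac{\partial (f\circ \mathbf{x})}{\partial u_{1}} = u_{1} \sin 2u_{2}, \qquad \dfrac{\partial f}{\partial x_{2}} = \dfrac{\partial (f\circ \mathbf{x})}{\partial u_{2}} = u_{1}^{2} \cos 2u_{2} $$

In this chart $\dfrac{\partial f}{\partial x_{1}}$ is the radial derivative and $\dfrac{\partial f}{\partial x_{2}}$ the angular derivative of $f$, so the operators $\dfrac{\partial}{\partial x_{i}}$ depend on the chosen coordinate system.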
Finally, we obtain the following:
$$ \begin{align*} \alpha^{\prime}(0) f =&\ \sum \limits_{i=1}^{n} x_{i}^{\prime}(0) \left.\dfrac{\partial (f\circ \mathbf{x})}{\partial u_{i}}\right|_{t=0} \\ =&\ \sum \limits_{i=1}^{n} x_{i}^{\prime}(0) \left.\dfrac{\partial }{\partial x_{i}}\right|_{t=0}f = \ \sum \limits_{i=1}^{n} x_{i}^{\prime}(0) \left.\dfrac{\partial f}{\partial x_{i}}\right|_{t=0} \end{align*} $$
$$ \implies \alpha^{\prime}(0) = \sum \limits_{i=1}^{n} x_{i}^{\prime}(0) \left.\dfrac{\partial }{\partial x_{i}}\right|_{t=0} $$
■
See Also
[1] Manfredo P. Do Carmo, Riemannian Geometry (Eng Edition, 1992), pp. 6-8