The Covariant Derivative

Our goal for this Chapter is clear. Having discovered that the derivative of a tensor is not a tensor in its own right, we must look for a way to bring differentiation back within the fold of the tensor paradigm. The solution will come in the form of a new differential operator, known as the covariant derivative, whose defining characteristic is its tensor property, i.e. the fact that it produces tensor outputs for tensor inputs. In the end, we will come to see the loss of the tensor property under differentiation not as a setback but as a welcome opportunity to broaden our analytical network.
As we discovered at the end of the last Chapter, if $T^{i}$ is a contravariant tensor, i.e.
$$T^{i^{\prime}}=T^{i}J_{i}^{i^{\prime}},\tag{15.1}$$
then the variant
$$\frac{\partial T^{i}}{\partial Z^{j}}\tag{15.2}$$
is not a tensor in its own right since it transforms according to the identity
$$\frac{\partial T^{i^{\prime}}}{\partial Z^{j^{\prime}}}=\frac{\partial T^{i}}{\partial Z^{j}}J_{i}^{i^{\prime}}J_{j^{\prime}}^{j}+T^{i}J_{ij}^{i^{\prime}}J_{j^{\prime}}^{j}.\tag{14.88}$$
As a result, we do not easily see how the variant $\partial T^{i}/\partial Z^{j}$ can be used in an expression that would produce the same value in all coordinate systems and would therefore be considered an invariant. In other words, our analysis has lost its grasp on the geometric meaning.
In order to restore the lost geometric meaning, we must reconnect our analysis to tangible geometric objects. Let us imagine that $T^{i}$ is not some abstract tensor but is, in fact, the components of an invariant vector $\mathbf{T}$, and that the whole purpose of differentiating $T^{i}$ with respect to the coordinates $Z^{j}$ is to capture the rate of change of the vector $\mathbf{T}$. In fact, let us switch from the letter $T$ to the letter $U$ which we have commonly used to denote geometric vectors in a Euclidean space. Once our present exploration suggests to us a way of retaining the geometric meaning in the course of our analysis, we will switch back from $U$ to $T$ and extend our analytical insight to abstract tensors.
Consider an invariant vector field $\mathbf{U}$ with components $U^{i}$, i.e.
$$\mathbf{U}=U^{i}\mathbf{Z}_{i}.\tag{15.3}$$
Recall that both $U^{i}$ and $\mathbf{Z}_{i}$ are tensors. Furthermore, the collection of partial derivatives
$$\frac{\partial\mathbf{U}}{\partial Z^{k}},\tag{15.4}$$
being the result of differentiating an invariant, is a tensor in its own right (with vector elements). Since
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\frac{\partial\left( U^{i}\mathbf{Z}_{i}\right) }{\partial Z^{k}},\tag{15.5}$$
we can expect that analyzing the expression
$$\frac{\partial\left( U^{i}\mathbf{Z}_{i}\right) }{\partial Z^{k}}\tag{15.6}$$
will present us with the insights we need for preserving the geometric meaning. (Note that we are differentiating with respect to $Z^{k}$ rather than $Z^{j}$ because we would like to save the index $j$ for our upcoming analysis of second-order tensors.)
By the product rule, we have
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\frac{\partial U^{i}}{\partial Z^{k}}\mathbf{Z}_{i}+U^{i}\frac{\partial\mathbf{Z}_{i}}{\partial Z^{k}}.\tag{15.7}$$
Recall that, by the very definition of the Christoffel symbol,
$$\frac{\partial\mathbf{Z}_{i}}{\partial Z^{k}}=\Gamma_{ik}^{m}\mathbf{Z}_{m}.\tag{12.20}$$
Therefore,
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\frac{\partial U^{i}}{\partial Z^{k}}\mathbf{Z}_{i}+U^{i}\Gamma_{ik}^{m}\mathbf{Z}_{m}.\tag{15.8}$$
Importantly, neither term on the right is a tensor -- but their sum is! Thus, we know to keep the two terms together.
Both terms represent linear combinations with respect to one and the same basis. We should, therefore, be able to combine them in a component-by-component manner. To do so while staying in the tensor notation, re-index the second term so that the basis appears with the index $i$. This can be accomplished by switching the roles of indices $i$ and $m$,
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\frac{\partial U^{i}}{\partial Z^{k}}\mathbf{Z}_{i}+U^{m}\Gamma_{mk}^{i}\mathbf{Z}_{i},\tag{15.9}$$
which is valid since both indices are dummy. As a result, we are able to factor out $\mathbf{Z}_{i}$, i.e.
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\left( \frac{\partial U^{i}}{\partial Z^{k}}+U^{m}\Gamma_{mk}^{i}\right) \mathbf{Z}_{i}.\tag{15.10}$$
Since the Christoffel symbol is symmetric in its subscripts, i.e.
$$\Gamma_{mk}^{i}=\Gamma_{km}^{i},\tag{12.21}$$
we arrive at the identity
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\left( \frac{\partial U^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}U^{m}\right) \mathbf{Z}_{i}.\tag{15.11}$$
This identity brings us right to the main point: the combination
$$\frac{\partial U^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}U^{m},\tag{15.12}$$
being the components of a tensor, is a tensor in its own right. This alone should signal to us that this combination ought to take the place of the partial derivative
$$\frac{\partial U^{i}}{\partial Z^{k}}\tag{15.13}$$
as the differential operator for measuring spatial variability. Besides being a tensor, the combination
$$\frac{\partial U^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}U^{m}\tag{15.12}$$
is decidedly preferable to $\partial U^{i}/\partial Z^{k}$ alone since it actually represents the components of the derivatives of the vector field $\mathbf{U}$. And it is able to do so by referencing only those objects that are available in the coordinate space. This is possible thanks to the term containing the Christoffel symbol, which can be said to account for the variability of the accompanying basis.
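A concrete case may make the role of the Christoffel term more tangible. The following sketch is an illustration under assumptions not made in the text above: polar coordinates $Z^{1}=r$, $Z^{2}=\theta$ in the plane, whose only nonzero Christoffel symbols are $\Gamma_{\theta\theta}^{r}=-r$ and $\Gamma_{r\theta}^{\theta}=\Gamma_{\theta r}^{\theta}=1/r$, and the constant field $\mathbf{U}=\mathbf{i}$, the unit vector along the Cartesian $x$-axis. Since this field does not vary, $\partial\mathbf{U}/\partial Z^{k}=\mathbf{0}$, and the combination (15.12) should vanish for every choice of $i$ and $k$.

```latex
% Assumed setup (illustration only): polar coordinates Z^1 = r, Z^2 = \theta,
% with Z_r = \cos\theta\,i + \sin\theta\,j and Z_\theta = -r\sin\theta\,i + r\cos\theta\,j,
% and the constant field U = i, so that U = U^r Z_r + U^\theta Z_\theta.
\[
\begin{aligned}
U^{r} &= \cos\theta, \qquad U^{\theta} = -\frac{\sin\theta}{r} \\
\frac{\partial U^{r}}{\partial r}+\Gamma_{rm}^{r}U^{m} &= 0+0=0 \\
\frac{\partial U^{\theta}}{\partial r}+\Gamma_{rm}^{\theta}U^{m} &= \frac{\sin\theta}{r^{2}}+\frac{1}{r}\left(-\frac{\sin\theta}{r}\right)=0 \\
\frac{\partial U^{r}}{\partial\theta}+\Gamma_{\theta m}^{r}U^{m} &= -\sin\theta+(-r)\left(-\frac{\sin\theta}{r}\right)=0 \\
\frac{\partial U^{\theta}}{\partial\theta}+\Gamma_{\theta m}^{\theta}U^{m} &= -\frac{\cos\theta}{r}+\frac{1}{r}\cos\theta=0
\end{aligned}
\]
```

Note that the partial derivatives on their own, such as $\partial U^{r}/\partial\theta=-\sin\theta$, do not vanish; only together with the Christoffel term do they reflect the fact that the vector field is constant.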
Let us now explore the same analytical argument for the covariant component $U_{j}$ of $\mathbf{U}$. Differentiate the identity
$$\mathbf{U}=U_{j}\mathbf{Z}^{j}\tag{15.14}$$
with respect to $Z^{k}$, i.e.
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\frac{\partial\left( U_{j}\mathbf{Z}^{j}\right) }{\partial Z^{k}}.\tag{15.15}$$
By the product rule, we have
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\frac{\partial U_{j}}{\partial Z^{k}}\mathbf{Z}^{j}+U_{j}\frac{\partial\mathbf{Z}^{j}}{\partial Z^{k}}.\tag{15.16}$$
Since
$$\frac{\partial\mathbf{Z}^{j}}{\partial Z^{k}}=-\Gamma_{km}^{j}\mathbf{Z}^{m},\tag{12.28}$$
we find that
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\frac{\partial U_{j}}{\partial Z^{k}}\mathbf{Z}^{j}-U_{j}\Gamma_{km}^{j}\mathbf{Z}^{m}.\tag{15.17}$$
Renaming the indices and rearranging the terms as before yields
$$\frac{\partial\mathbf{U}}{\partial Z^{k}}=\left( \frac{\partial U_{j}}{\partial Z^{k}}-\Gamma_{jk}^{m}U_{m}\right) \mathbf{Z}^{j}.\tag{15.18}$$
Thus, we conclude that the combination
$$\frac{\partial U_{j}}{\partial Z^{k}}-\Gamma_{jk}^{m}U_{m}\tag{15.19}$$
is a tensor in its own right and therefore ought to replace the partial derivative as the primary differential operator. In the next Section, we will use the combinations
$$\frac{\partial U^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}U^{m}\tag{15.12}$$
and
$$\frac{\partial U_{j}}{\partial Z^{k}}-\Gamma_{jk}^{m}U_{m}\tag{15.19}$$
as the inspiration for the definition of the covariant derivative for first-order variants.
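The covariant combination (15.19) can be checked on a simple field. The sketch below is an illustration only and rests on assumptions not stated in the text: polar coordinates $Z^{1}=r$, $Z^{2}=\theta$, with metric elements $Z_{rr}=1$, $Z_{\theta\theta}=r^{2}$, $Z_{r\theta}=0$ and nonzero Christoffel symbols $\Gamma_{\theta\theta}^{r}=-r$, $\Gamma_{r\theta}^{\theta}=\Gamma_{\theta r}^{\theta}=1/r$, and the constant field $\mathbf{U}=\mathbf{i}$ along the Cartesian $x$-axis, whose contravariant components are $U^{r}=\cos\theta$ and $U^{\theta}=-\sin\theta/r$. Because the field is constant, each instance of (15.19) is expected to vanish.

```latex
% Assumed setup (illustration only): polar coordinates, constant field U = i.
% Covariant components are obtained by lowering the index: U_j = Z_{ji} U^i.
\[
\begin{aligned}
U_{r} &= \cos\theta, \qquad U_{\theta} = r^{2}\left(-\frac{\sin\theta}{r}\right) = -r\sin\theta \\
\frac{\partial U_{r}}{\partial r}-\Gamma_{rr}^{m}U_{m} &= 0-0=0 \\
\frac{\partial U_{r}}{\partial\theta}-\Gamma_{r\theta}^{m}U_{m} &= -\sin\theta-\frac{1}{r}\left(-r\sin\theta\right)=0 \\
\frac{\partial U_{\theta}}{\partial r}-\Gamma_{\theta r}^{m}U_{m} &= -\sin\theta-\frac{1}{r}\left(-r\sin\theta\right)=0 \\
\frac{\partial U_{\theta}}{\partial\theta}-\Gamma_{\theta\theta}^{m}U_{m} &= -r\cos\theta-(-r)\cos\theta=0
\end{aligned}
\]
```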
Since we are turning our attention to general tensors which may not be the components of a vector field, let us switch from the letter $U$ back to $T$. For a first-order contravariant tensor $T^{i}$ and for a first-order covariant tensor $T_{j}$, the definitions of the covariant derivative read
$$\begin{aligned}\nabla_{k}T^{i} & =\frac{\partial T^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{m}\qquad\left(15.20\right)\\\nabla_{k}T_{j} & =\frac{\partial T_{j}}{\partial Z^{k}}-\Gamma_{jk}^{m}T_{m}.\qquad\left(15.21\right)\end{aligned}$$
The cornerstone property of the covariant derivative, which we will discuss last, is that it produces tensor outputs for tensor inputs.
It is a signature characteristic of the covariant derivative that the different flavors of tensors receive different treatments. At first sight, the fact that there are two different definitions for the two different types of tensors may create a certain sense of inelegance and perhaps of excessive complexity. However, we should point out that the strong indicial structure of these definitions makes it easy to correctly arrange the indices in the Christoffel term. Indeed, note that the indicial signature of
$$\frac{\partial T^{i}}{\partial Z^{k}}\quad\text{is}\quad{}_{k}^{i}\tag{15.22}$$
while that of
$$\frac{\partial T_{j}}{\partial Z^{k}}\quad\text{is}\quad{}_{jk}.\tag{15.23}$$
Thus,
$$\frac{\partial T^{i}}{\partial Z^{k}}\text{ starts with }\Gamma_{k\_}^{i}\quad\text{while}\quad\frac{\partial T_{j}}{\partial Z^{k}}\text{ starts with }\Gamma_{jk}^{\_},\tag{15.24}$$
where in the last Christoffel symbol, the order of the subscripts does not matter due to symmetry. Since two of the three indices have "placed themselves", it leaves only one possibility for the remaining index $m$, i.e.
$$\frac{\partial T^{i}}{\partial Z^{k}}\text{ ends up with }\Gamma_{km}^{i}T^{m}\quad\text{and}\quad\frac{\partial T_{j}}{\partial Z^{k}}\text{ ends up with }\Gamma_{jk}^{m}T_{m}.\tag{15.25}$$
Thus, the only aspect of the definition of the covariant derivative that needs to be memorized is the sign: plus for a contravariant tensor and minus for a covariant tensor.
Note another interesting characteristic of the covariant derivative: it cannot be applied to the elements of the input vector individually. For example, $\nabla_{k}T^{1}$ cannot be evaluated without referring to all of the remaining elements of $T^{i}$ since those appear in the term with the Christoffel symbol. This is unlike the partial derivative where $\partial T^{1}/\partial Z^{k}$ can be evaluated individually, without a reference to the other elements of $T^{i}$.
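To see this coupling explicitly, it may help to write out the contraction on $m$. The sketch below assumes, purely for illustration, a two-dimensional space; its second line further assumes polar coordinates $Z^{1}=r$, $Z^{2}=\theta$, for which $\Gamma_{\theta r}^{r}=0$ and $\Gamma_{\theta\theta}^{r}=-r$.

```latex
% Illustration only: two dimensions, then polar coordinates (r, \theta).
\[
\begin{aligned}
\nabla_{k}T^{1} &= \frac{\partial T^{1}}{\partial Z^{k}}+\Gamma_{k1}^{1}T^{1}+\Gamma_{k2}^{1}T^{2} \\
\nabla_{\theta}T^{r} &= \frac{\partial T^{r}}{\partial\theta}-r\,T^{\theta}
\end{aligned}
\]
```

In the polar case, the derivative of the first element $T^{r}$ with respect to $\theta$ cannot be evaluated without knowing the second element $T^{\theta}$.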
The definitions presented above are valid only for first-order tensors. Before we give the general definition of the covariant derivative applicable to tensors of arbitrary order, we will explore one crucial property, known as the metrinilic property, of the first-order definitions presented so far.
The definitions of the covariant derivative -- as applied to first-order tensors -- that we have just formulated in the previous Section can be applied to the covariant basis $\mathbf{Z}_{i}$ as well as the contravariant basis $\mathbf{Z}^{i}$. In both cases, the result is as surprising as it is important.
Let us begin with the covariant basis $\mathbf{Z}_{i}$. By the definition
$$\nabla_{k}T_{j}=\frac{\partial T_{j}}{\partial Z^{k}}-\Gamma_{jk}^{m}T_{m}\tag{15.21}$$
of the covariant derivative for a covariant tensor, $\nabla_{k}\mathbf{Z}_{i}$ is given by
$$\nabla_{k}\mathbf{Z}_{i}=\frac{\partial\mathbf{Z}_{i}}{\partial Z^{k}}-\Gamma_{ik}^{m}\mathbf{Z}_{m}.\tag{15.26}$$
However, by the very definition of the Christoffel symbol,
$$\frac{\partial\mathbf{Z}_{i}}{\partial Z^{k}}=\Gamma_{ik}^{m}\mathbf{Z}_{m},\tag{12.20}$$
and therefore the two terms on the right of the preceding equation cancel each other. Thus, we conclude that
$$\nabla_{k}\mathbf{Z}_{i}=\mathbf{0},\tag{15.27}$$
i.e. the covariant derivative of the covariant basis vanishes.
The very same conclusion awaits us for the contravariant basis $\mathbf{Z}^{i}$. By the definition
$$\nabla_{k}T^{i}=\frac{\partial T^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{m}\tag{15.20}$$
of the covariant derivative for a contravariant tensor, $\nabla_{k}\mathbf{Z}^{i}$ is given by
$$\nabla_{k}\mathbf{Z}^{i}=\frac{\partial\mathbf{Z}^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}\mathbf{Z}^{m}\tag{15.28}$$
and, since
$$\frac{\partial\mathbf{Z}^{i}}{\partial Z^{k}}=-\Gamma_{km}^{i}\mathbf{Z}^{m},\tag{12.28}$$
we can conclude that
$$\nabla_{k}\mathbf{Z}^{i}=\mathbf{0},\tag{15.29}$$
i.e. the covariant derivative of the contravariant basis vanishes.
This characteristic of the covariant derivative is referred to as the metrinilic property -- metrinilic being the combination of the words metric and nil. We will soon discover that the metrinilic property extends to the metric tensors, as well as to the soon-to-be-introduced Levi-Civita symbols $\varepsilon^{ijk}$ and $\varepsilon_{ijk}$, also considered parts of the metrics family.
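The cancellation behind the metrinilic property can be observed directly in a specific coordinate system. The sketch below assumes, as an illustration not taken from the text, polar coordinates $Z^{1}=r$, $Z^{2}=\theta$, with the covariant basis expressed in terms of the fixed Cartesian vectors $\mathbf{i}$ and $\mathbf{j}$, and with nonzero Christoffel symbols $\Gamma_{\theta\theta}^{r}=-r$ and $\Gamma_{r\theta}^{\theta}=\Gamma_{\theta r}^{\theta}=1/r$.

```latex
% Assumed setup (illustration only): polar coordinates (r, \theta) in the plane.
\[
\begin{aligned}
\mathbf{Z}_{r} &= \cos\theta\,\mathbf{i}+\sin\theta\,\mathbf{j}, \qquad
\mathbf{Z}_{\theta} = -r\sin\theta\,\mathbf{i}+r\cos\theta\,\mathbf{j} \\
\nabla_{r}\mathbf{Z}_{r} &= \frac{\partial\mathbf{Z}_{r}}{\partial r}-\Gamma_{rr}^{m}\mathbf{Z}_{m}
 = \mathbf{0}-\mathbf{0}=\mathbf{0} \\
\nabla_{\theta}\mathbf{Z}_{r} &= \frac{\partial\mathbf{Z}_{r}}{\partial\theta}-\Gamma_{r\theta}^{m}\mathbf{Z}_{m}
 = \left(-\sin\theta\,\mathbf{i}+\cos\theta\,\mathbf{j}\right)-\frac{1}{r}\mathbf{Z}_{\theta}=\mathbf{0} \\
\nabla_{r}\mathbf{Z}_{\theta} &= \frac{\partial\mathbf{Z}_{\theta}}{\partial r}-\Gamma_{\theta r}^{m}\mathbf{Z}_{m}
 = \left(-\sin\theta\,\mathbf{i}+\cos\theta\,\mathbf{j}\right)-\frac{1}{r}\mathbf{Z}_{\theta}=\mathbf{0} \\
\nabla_{\theta}\mathbf{Z}_{\theta} &= \frac{\partial\mathbf{Z}_{\theta}}{\partial\theta}-\Gamma_{\theta\theta}^{m}\mathbf{Z}_{m}
 = -r\,\mathbf{Z}_{r}-(-r)\,\mathbf{Z}_{r}=\mathbf{0}
\end{aligned}
\]
```

In each case, the partial derivative of the basis vector is nonzero in general, and it is precisely the Christoffel term that removes it.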
The metrinilic property of the covariant derivative has far-reaching implications. Recall that the partial derivative $\partial/\partial Z^{k}$ produces zero when applied to the coordinate basis vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ in affine coordinates, i.e.
$$\frac{\partial\mathbf{i}}{\partial Z^{k}}=\frac{\partial\mathbf{j}}{\partial Z^{k}}=\frac{\partial\mathbf{k}}{\partial Z^{k}}=\mathbf{0}.\tag{15.30}$$
This property is the key to our ability to differentiate vector fields by differentiating their affine components. Indeed, for a vector field $\mathbf{F}\left( Z\right)$ with components $F^{1}\left( Z\right)$, $F^{2}\left( Z\right)$, and $F^{3}\left( Z\right)$, i.e.
$$\mathbf{F}\left( Z\right) =F^{1}\left( Z\right) \mathbf{i}+F^{2}\left( Z\right) \mathbf{j}+F^{3}\left( Z\right) \mathbf{k},\tag{15.31}$$
we have, by the product rule,
$$\frac{\partial\mathbf{F}}{\partial Z^{k}}=\frac{\partial F^{1}}{\partial Z^{k}}\mathbf{i}+F^{1}\frac{\partial\mathbf{i}}{\partial Z^{k}}+\text{analogous terms for the other components.}\tag{15.32}$$
Since the partial derivatives of the basis vectors vanish, we are left with
$$\frac{\partial\mathbf{F}}{\partial Z^{k}}=\frac{\partial F^{1}}{\partial Z^{k}}\mathbf{i}+\frac{\partial F^{2}}{\partial Z^{k}}\mathbf{j}+\frac{\partial F^{3}}{\partial Z^{k}}\mathbf{k}.\tag{15.33}$$
Thus, the components of the derivative $\partial\mathbf{F}/\partial Z^{k}$ are
$$\frac{\partial F^{1}}{\partial Z^{k}}\text{, }\frac{\partial F^{2}}{\partial Z^{k}}\text{, and }\frac{\partial F^{3}}{\partial Z^{k}},\tag{15.34}$$
i.e. the derivatives of the corresponding components. In informal words, $\mathbf{F}$ can be differentiated by differentiating its components.
Thanks to the metrinilic property, the same principle applies to the covariant derivative. If
$$\mathbf{F}\left( Z\right) =F^{1}\left( Z\right) \mathbf{Z}_{1}\left( Z\right) +F^{2}\left( Z\right) \mathbf{Z}_{2}\left( Z\right) +F^{3}\left( Z\right) \mathbf{Z}_{3}\left( Z\right) ,\tag{15.35}$$
or, omitting the independent variables,
$$\mathbf{F}=F^{1}\mathbf{Z}_{1}+F^{2}\mathbf{Z}_{2}+F^{3}\mathbf{Z}_{3},\tag{15.36}$$
then, by an application of the product rule (to be demonstrated later), we have
$$\nabla_{k}\mathbf{F}=\nabla_{k}F^{1}\ \mathbf{Z}_{1}+F^{1}\ \nabla_{k}\mathbf{Z}_{1}+\text{analogous terms for the other components.}\tag{15.37}$$
By the metrinilic property, we have
$$\nabla_{k}\mathbf{F}=\nabla_{k}F^{1}~\mathbf{Z}_{1}+\nabla_{k}F^{2}~\mathbf{Z}_{2}+\nabla_{k}F^{3}~\mathbf{Z}_{3}.\tag{15.38}$$
So, analogously to the partial derivative, the covariant derivative can be applied in component-wise fashion.
Finally, let us express the same calculation in indicial form. If
$$\mathbf{F}=F^{i}\mathbf{Z}_{i}\tag{15.39}$$
then, by taking the covariant derivatives of both sides of this equation, i.e.
$$\nabla_{k}\mathbf{F}=\nabla_{k}\left( F^{i}\mathbf{Z}_{i}\right) ,\tag{15.40}$$
we find, by the product rule, that
$$\nabla_{k}\mathbf{F}=\nabla_{k}F^{i}\ \mathbf{Z}_{i}+F^{i}\nabla_{k}\mathbf{Z}_{i}.\tag{15.41}$$
By the metrinilic property, the second term vanishes and we are left with
$$\nabla_{k}\mathbf{F}=\nabla_{k}F^{i}\ \mathbf{Z}_{i}.\tag{15.42}$$
On an intuitive level, the effect of the metrinilic property may be described by saying that the covariant derivative sees the basis as a constant and thus lets it pass through, similar to the way the ordinary derivative allows constants to pass through, e.g.
$$\frac{d}{dx}\left( cf\right) =c\frac{df}{dx}.\tag{15.43}$$
Similarly,
$$\nabla_{k}\left( F^{i}\mathbf{Z}_{i}\right) =\mathbf{Z}_{i}\nabla_{k}F^{i}.\tag{15.44}$$
Rather than outright stating the definition of the covariant derivative for variants of arbitrary order, we will show how it inevitably arises from the combination of the metrinilic property and the product rule. To this end, consider a second-order tensor $T_{j}^{i}$. The definitions of the covariant derivative $\nabla_{k}$ available to us so far apply only to variants of order one and therefore do not apply to $T_{j}^{i}$. However, if we contract $T_{j}^{i}$ with $\mathbf{Z}^{j}$, the resulting tensor
$$\mathbf{T}^{i}=T_{j}^{i}\mathbf{Z}^{j}\tag{15.45}$$
is of order one and is therefore subject to the definition
$$\nabla_{k}T^{i}=\frac{\partial T^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{m}.\tag{15.20}$$
According to this definition, we have
$$\nabla_{k}\mathbf{T}^{i}=\frac{\partial\mathbf{T}^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}\mathbf{T}^{m}.\tag{15.46}$$
Substituting $T_{j}^{i}\mathbf{Z}^{j}$ for $\mathbf{T}^{i}$ and $T_{j}^{m}\mathbf{Z}^{j}$ for $\mathbf{T}^{m}$ and applying the ordinary product rule to the partial derivative, we find
$$\nabla_{k}\mathbf{T}^{i}=\frac{\partial T_{j}^{i}}{\partial Z^{k}}\mathbf{Z}^{j}+T_{j}^{i}\frac{\partial\mathbf{Z}^{j}}{\partial Z^{k}}+\Gamma_{km}^{i}T_{j}^{m}\mathbf{Z}^{j}.\tag{15.47}$$
Since
$$\frac{\partial\mathbf{Z}^{j}}{\partial Z^{k}}=-\Gamma_{km}^{j}\mathbf{Z}^{m},\tag{12.28}$$
we have
$$\nabla_{k}\mathbf{T}^{i}=\frac{\partial T_{j}^{i}}{\partial Z^{k}}\mathbf{Z}^{j}-T_{j}^{i}\Gamma_{km}^{j}\mathbf{Z}^{m}+\Gamma_{km}^{i}T_{j}^{m}\mathbf{Z}^{j}.\tag{15.48}$$
Finally, switching the roles of $j$ and $m$ in the second term, we find
$$\nabla_{k}\mathbf{T}^{i}=\left( \frac{\partial T_{j}^{i}}{\partial Z^{k}}-\Gamma_{kj}^{m}T_{m}^{i}+\Gamma_{km}^{i}T_{j}^{m}\right) \mathbf{Z}^{j}.\tag{15.49}$$
This completes our analysis of the first-order tensor $\mathbf{T}^{i}$ under the already-established covariant derivative for first-order variants.
Now let us turn our attention to the yet-to-be-defined covariant derivative $\nabla_{k}$ for second- and higher-order tensors, which could therefore be applied to the equivalent combination $T_{j}^{i}\mathbf{Z}^{j}$. If we require that it satisfy the product rule and the metrinilic property, then its application to the identity
$$\mathbf{T}^{i}=T_{j}^{i}\mathbf{Z}^{j}\tag{15.50}$$
will yield
$$\nabla_{k}\mathbf{T}^{i}=\nabla_{k}T_{j}^{i}~\mathbf{Z}^{j}.\tag{15.51}$$
Comparing the two expressions for $\nabla_{k}\mathbf{T}^{i}$, we conclude that the only viable definition for $\nabla_{k}T_{j}^{i}$ is
$$\nabla_{k}T_{j}^{i}=\frac{\partial T_{j}^{i}}{\partial Z^{k}}-\Gamma_{kj}^{m}T_{m}^{i}+\Gamma_{km}^{i}T_{j}^{m}.\tag{15.52}$$
This identity will indeed serve as a blueprint for the general definition of the covariant derivative. The structure of the expression on the right can be summarized in words as follows: there is a Christoffel term for each index in the indicial signature of the variant. In each term, the relevant index participates in the indicial pattern that was established for first-order variants, while the remaining indices remain where they are.
The number of terms in the definition of the covariant derivative depends on the order of the variant. Therefore, we will, as we typically do, capture the general definition by presenting an expression for a variant $T_{j}^{i}$ with a representative collection of indices. The definition reads
$$\nabla_{k}T_{j}^{i}=\frac{\partial T_{j}^{i}}{\partial Z^{k}}-\Gamma_{kj}^{m}T_{m}^{i}+\Gamma_{km}^{i}T_{j}^{m}.\tag{15.53}$$
It is to be understood in the sense that to each index in the indicial signature of the variant under the derivative, there corresponds an appropriate additive term involving the Christoffel symbol, where the relevant index participates in the kind of indicial pattern that was established for first-order variants, while the remaining indices remain where they are.
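As a brief illustration of this rule, it can be spelled out for the two remaining second-order signatures. The expressions below are obtained by applying the stated recipe, one Christoffel term per index with a plus sign for a superscript and a minus sign for a subscript; they are offered as a sketch, with $T_{ij}$ and $T^{ij}$ standing for generic second-order variants.

```latex
% One Christoffel term per index; m is the dummy index in each term.
\[
\begin{aligned}
\nabla_{k}T_{ij} &= \frac{\partial T_{ij}}{\partial Z^{k}}-\Gamma_{ki}^{m}T_{mj}-\Gamma_{kj}^{m}T_{im} \\
\nabla_{k}T^{ij} &= \frac{\partial T^{ij}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{mj}+\Gamma_{km}^{j}T^{im}
\end{aligned}
\]
```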
For example, when applied to the contravariant tensor $T^{rst}$, the formula reads
$$\nabla_{k}T^{rst}=\frac{\partial T^{rst}}{\partial Z^{k}}+\Gamma_{km}^{r}T^{mst}+\Gamma_{km}^{s}T^{rmt}+\Gamma_{km}^{t}T^{rsm}.\tag{15.54}$$
For a covariant tensor $T_{rst}$, we have
$$\nabla_{k}T_{rst}=\frac{\partial T_{rst}}{\partial Z^{k}}-\Gamma_{kr}^{m}T_{mst}-\Gamma_{ks}^{m}T_{rmt}-\Gamma_{kt}^{m}T_{rsm}.\tag{15.55}$$
Finally, for a tensor $T_{uv}^{rs}$, we have
$$\nabla_{k}T_{uv}^{rs}=\frac{\partial T_{uv}^{rs}}{\partial Z^{k}}+\Gamma_{km}^{r}T_{uv}^{ms}+\Gamma_{km}^{s}T_{uv}^{rm}-\Gamma_{ku}^{m}T_{mv}^{rs}-\Gamma_{kv}^{m}T_{um}^{rs}.\tag{15.56}$$
We are, once again, obliged to state the most crucial property, known as the tensor property, of the covariant derivative -- namely, that it produces tensor outputs for tensor inputs. More specifically, the resulting tensor is of one covariant order greater than the input tensor.
The newly created index can, of course, be raised by contraction with the contravariant metric tensor. The resulting operator, represented by the combination
$$Z^{lk}\nabla_{k},\tag{15.57}$$
is aptly denoted by the symbol $\nabla^{l}$, i.e.
$$\nabla^{l}=Z^{lk}\nabla_{k}\tag{15.58}$$
and can be referred to as the contravariant derivative, although this term is not frequently used.
Finally, note that the covariant derivative can be applied to variants that are not tensors. For example,
$$\nabla_{k}\Gamma_{rs}^{i}=\frac{\partial\Gamma_{rs}^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}\Gamma_{rs}^{m}-\Gamma_{kr}^{m}\Gamma_{ms}^{i}-\Gamma_{ks}^{m}\Gamma_{rm}^{i}.\tag{15.59}$$
In this case, it is likely that the output variant is also not a tensor.
A tensor of order zero is an invariant. As we pointed out in the previous Chapter, the fact that the general definition of a tensor applies to variants of order zero is not a minor edge case but is, in fact, the heart of the matter.
Similarly, the general definition of the covariant derivative applies to variants of order zero. Since a variant $T$ of order zero has no indices, the covariant derivative reduces to the ordinary partial derivative, i.e.
$$\nabla_{k}T=\frac{\partial T}{\partial Z^{k}}.\tag{15.60}$$
Note that this fact justifies our earlier use of the symbol $\nabla_{k}$ to denote partial derivatives of variants of order zero.
Recall that the Christoffel symbol vanishes in affine coordinates, i.e.
$$\Gamma_{ij}^{k}=0.\tag{12.51}$$
As a result, the covariant derivative coincides with the partial derivative, i.e.
$$\nabla_{k}T_{j}^{i}=\frac{\partial T_{j}^{i}}{\partial Z^{k}},\tag{15.61}$$
in affine coordinates.
This seemingly simple fact actually finds frequent and important applications. For instance, assume for a moment that the tensor property of the covariant derivative has already been demonstrated. Then, from our discussion in Section 14.2, it follows that if the partial derivatives of a tensor vanish in an affine coordinate system, then its covariant derivative vanishes in all coordinate systems. In particular, the metrinilic property
$$\nabla_{k}\mathbf{Z}_{i}=\mathbf{0}\tag{15.27}$$
follows from the fact that the partial derivatives of the coordinate basis vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ vanish in affine coordinates.
Furthermore, from the fact that partial derivatives commute it will follow that covariant derivatives commute as well. This important insight will be discussed later in this Chapter.
Crucially, the covariant derivative satisfies the familiar product rule. For example,
$$\nabla_{k}\left( T^{i}U_{j}\right) =\nabla_{k}T^{i}\ U_{j}+T^{i}\ \nabla_{k}U_{j}.\tag{15.62}$$
Let us demonstrate this particular identity and, as always, it will be apparent that the rule holds generally for variants with arbitrary indicial signatures.
By definition, the covariant derivative $\nabla_{k}\left( T^{i}U_{j}\right)$ of $T^{i}U_{j}$ is given by
$$\nabla_{k}\left( T^{i}U_{j}\right) =\frac{\partial\left( T^{i}U_{j}\right) }{\partial Z^{k}}+\Gamma_{km}^{i}T^{m}U_{j}-\Gamma_{kj}^{m}T^{i}U_{m}.\tag{15.63}$$
An application of the ordinary product rule to the partial derivative on the right yields
$$\nabla_{k}\left( T^{i}U_{j}\right) =\frac{\partial T^{i}}{\partial Z^{k}}U_{j}+T^{i}\frac{\partial U_{j}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{m}U_{j}-\Gamma_{kj}^{m}T^{i}U_{m}.\tag{15.64}$$
Now, group the terms on the right in the following way:
$$\nabla_{k}\left( T^{i}U_{j}\right) =\left( \frac{\partial T^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{m}\right) U_{j}+T^{i}\left( \frac{\partial U_{j}}{\partial Z^{k}}-\Gamma_{kj}^{m}U_{m}\right) .\tag{15.65}$$
Since
$$\begin{aligned}\frac{\partial T^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{m} & =\nabla_{k}T^{i}\text{ and}\qquad\left(15.66\right)\\\frac{\partial U_{j}}{\partial Z^{k}}-\Gamma_{kj}^{m}U_{m} & =\nabla_{k}U_{j},\qquad\left(15.67\right)\end{aligned}$$
we arrive at the desired result
$$\nabla_{k}\left( T^{i}U_{j}\right) =\nabla_{k}T^{i}\ U_{j}+T^{i}\ \nabla_{k}U_{j}.\tag{15.62}$$
We encourage the reader to repeat the calculation for variants with more complicated indicial signatures, such as $T_{j}^{i}U_{lm}$, and make sure that the expected product rule is valid.
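One further check, offered here as a sketch rather than as part of the demonstration above, concerns the contracted product $T^{i}U_{i}$. Being a variant of order zero, its covariant derivative is simply its partial derivative. Evaluating the right side of the product rule with the first-order definitions gives the same result, since the two Christoffel terms cancel after the dummy indices $i$ and $m$ are exchanged in the last term.

```latex
% Right side of the product rule for the contraction T^i U_i,
% expanded with the first-order definitions of the covariant derivative.
\[
\begin{aligned}
\nabla_{k}T^{i}\ U_{i}+T^{i}\ \nabla_{k}U_{i}
 &= \left( \frac{\partial T^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{m}\right) U_{i}
  +T^{i}\left( \frac{\partial U_{i}}{\partial Z^{k}}-\Gamma_{ki}^{m}U_{m}\right) \\
 &= \frac{\partial\left( T^{i}U_{i}\right) }{\partial Z^{k}}
  +\Gamma_{km}^{i}T^{m}U_{i}-\Gamma_{ki}^{m}T^{i}U_{m} \\
 &= \frac{\partial\left( T^{i}U_{i}\right) }{\partial Z^{k}}
\end{aligned}
\]
```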
Admittedly, the above demonstration of the product rule proved to be rather straightforward and not entirely unexpected. On the other hand, at the time when the definition of the covariant derivative first occurred to its inventors, the question of whether the product rule held was not at all obvious. We can only imagine their sense of satisfaction at discovering that it does indeed hold, allowing the idea to move forward.
Earlier, we established that the covariant and contravariant bases vanish under the covariant derivative, i.e.
$$\begin{aligned}\nabla_{k}\mathbf{Z}_{i} & =\mathbf{0}\qquad\left(15.27\right)\\\nabla_{k}\mathbf{Z}^{i} & =\mathbf{0}.\qquad\left(15.29\right)\end{aligned}$$
The product rule, which is valid for dot products, allows us to easily extend this result to the metric tensors. Since the covariant metric tensor $Z_{ij}$ is given by the dot product
$$Z_{ij}=\mathbf{Z}_{i}\cdot\mathbf{Z}_{j},\tag{14.5}$$
an application of the product rule yields
$$\nabla_{k}Z_{ij}=\nabla_{k}\mathbf{Z}_{i}\cdot\mathbf{Z}_{j}+\mathbf{Z}_{i}\cdot\nabla_{k}\mathbf{Z}_{j}\tag{15.68}$$
and since each term on the right vanishes, we can conclude that
$$\nabla_{k}Z_{ij}=0.\tag{15.69}$$
Similarly, the identity
$$Z^{ij}=\mathbf{Z}^{i}\cdot\mathbf{Z}^{j},\tag{14.5}$$
yields the analogous result for the contravariant metric tensor
$$\nabla_{k}Z^{ij}=0.\tag{15.70}$$
Finally, since the Kronecker delta $\delta_{j}^{i}$ is given by
$$\delta_{j}^{i}=\mathbf{Z}^{i}\cdot\mathbf{Z}_{j},\tag{15.71}$$
we are able to conclude that it, too, vanishes under the covariant derivative, i.e.
$$\nabla_{k}\delta_{j}^{i}=0.\tag{15.72}$$
In Applications of Tensor Analysis, A.J. McConnell refers to the identities $\nabla_{k}Z_{ij}=0$ and $\nabla_{k}Z^{ij}=0$ as Ricci's lemma.
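Ricci's lemma can also be confirmed by direct computation in a particular coordinate system. The sketch below assumes, for illustration only, polar coordinates $Z^{1}=r$, $Z^{2}=\theta$, with metric elements $Z_{rr}=1$, $Z_{r\theta}=0$, $Z_{\theta\theta}=r^{2}$ and nonzero Christoffel symbols $\Gamma_{\theta\theta}^{r}=-r$, $\Gamma_{r\theta}^{\theta}=\Gamma_{\theta r}^{\theta}=1/r$. It applies the general definition in the form $\nabla_{k}Z_{ij}=\partial Z_{ij}/\partial Z^{k}-\Gamma_{ki}^{m}Z_{mj}-\Gamma_{kj}^{m}Z_{im}$ to three representative elements.

```latex
% Assumed setup (illustration only): polar coordinates (r, \theta).
\[
\begin{aligned}
\nabla_{r}Z_{\theta\theta} &= \frac{\partial\left( r^{2}\right) }{\partial r}
 -\Gamma_{r\theta}^{m}Z_{m\theta}-\Gamma_{r\theta}^{m}Z_{\theta m}
 = 2r-\frac{1}{r}r^{2}-\frac{1}{r}r^{2}=0 \\
\nabla_{\theta}Z_{r\theta} &= 0-\Gamma_{\theta r}^{m}Z_{m\theta}-\Gamma_{\theta\theta}^{m}Z_{rm}
 = -\frac{1}{r}r^{2}-(-r)\cdot 1=0 \\
\nabla_{\theta}Z_{\theta\theta} &= 0-\Gamma_{\theta\theta}^{m}Z_{m\theta}-\Gamma_{\theta\theta}^{m}Z_{\theta m}
 = r\,Z_{r\theta}+r\,Z_{\theta r}=0
\end{aligned}
\]
```

The remaining elements vanish in the same manner. Note that $\partial Z_{\theta\theta}/\partial r=2r$ is not zero; the metric tensor is not constant in polar coordinates, yet its covariant derivative vanishes.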
Thanks to the metrinilic property, the metrics freely pass through the covariant derivative. We have already observed this for the covariant and contravariant bases in Section 15.3. For example,
$$\mathbf{Z}_{i}\nabla_{k}T_{l}^{ij}=\nabla_{k}\left( T_{l}^{ij}\mathbf{Z}_{i}\right) ,\tag{15.73}$$
as can be seen by applying the product rule on the right and subsequently appealing to the metrinilic property. Similarly, the metric tensor also seamlessly passes through the covariant derivative, e.g.
$$Z_{jk}\nabla_{i}T^{l}=\nabla_{i}\left( Z_{jk}T^{l}\right) .\tag{15.74}$$
This important observation finds numerous applications. In particular, it makes index juggling safe, as we will demonstrate in the next Section.
Finally, we should point out that the above proof of the metrinilic property, by virtue of its reliance on the covariant basis and the geometric dot product, is essentially Euclidean. In a Euclidean space, we can also give another justification for the metrinilic property. Note that the identities
$$\begin{aligned}\nabla_{k}\mathbf{Z}_{i} & =\mathbf{0}\qquad\left(15.27\right)\\\nabla_{k}\mathbf{Z}^{i} & =\mathbf{0}\qquad\left(15.29\right)\\\nabla_{k}Z_{ij} & =0\qquad\left(15.69\right)\\\nabla_{k}Z^{ij} & =0\qquad\left(15.70\right)\\\nabla_{k}\delta_{j}^{i} & =0\qquad\left(15.72\right)\end{aligned}$$
are obviously true in Cartesian coordinates -- or any affine coordinates, for that matter. Indeed, in such coordinates, the bases, the metric tensors, and, of course, the Kronecker delta have constant elements while the covariant derivative coincides with the partial derivative. Thus, the result is zero. Meanwhile, by the flagship tensor property of the covariant derivative, which will be demonstrated below, the variants $\nabla_{k}\mathbf{Z}_{i}$, $\nabla_{k}\mathbf{Z}^{i}$, $\nabla_{k}Z_{ij}$, $\nabla_{k}Z^{ij}$, and $\nabla_{k}\delta_{j}^{i}$ are tensors. As such, if they vanish in one coordinate system, they vanish in all coordinate systems, which completes the argument.
This argument is also restricted to Euclidean spaces since it relies on the availability of an affine coordinate system. However, it is important to know that the metrinilic property with respect to the metric tensors and the Kronecker delta continues to hold in the more general Riemannian spaces. A more general way to demonstrate the metrinilic property, one that holds up in a Riemannian space, involves a direct application of the definition of the covariant derivative. This approach can be found in one of the exercises at the end of the Chapter.
There are two operations related to index juggling to which we have become accustomed. The first is juggling a free index on both sides of an identity. For example, if
$$U^{i}=V^{j}W_{\cdot j}^{i},\tag{15.75}$$
then lowering $i$ on both sides of the identity yields
$$U_{i}=V^{j}W_{ij}.\tag{15.76}$$
The second is allowing two dummy indices in a contraction to exchange flavors. For example, if
$$U^{i}=V^{j}W_{\cdot j}^{i},\tag{15.75}$$
then exchanging the flavors of the index $j$ between the two variants on the right yields
$$U^{i}=V_{j}W^{ij}.\tag{15.77}$$
Do these operations remain valid in the presence of a covariant derivative? For example, consider the identity
$$U_{\cdot k}^{i}=V^{j}\nabla_{k}W_{\cdot j}^{i}.\tag{15.78}$$
First, let us ask whether the index $i$ can be lowered on both sides to produce
$$U_{ik}=V^{j}\nabla_{k}W_{ij}\ ?\tag{15.79}$$
In order to answer this question, we must recall the underlying mechanics of index juggling. The lowering of the index $i$ in the identity
$$U_{\cdot k}^{i}=V^{j}\nabla_{k}W_{\cdot j}^{i}\tag{15.78}$$
is achieved by contracting both sides with the covariant metric tensor $Z_{ir}$, i.e.
$$Z_{ir}U_{\cdot k}^{i}=V^{j}Z_{ir}\nabla_{k}W_{\cdot j}^{i}.\tag{15.80}$$
Since, as we pointed out in the previous Section, the metric tensor $Z_{ir}$ moves freely across the covariant derivative $\nabla_{k}$, we have
$$Z_{ir}U_{\cdot k}^{i}=V^{j}\nabla_{k}\left( Z_{ir}W_{\cdot j}^{i}\right) .\tag{15.81}$$
Therefore,
$$U_{rk}=V^{j}\nabla_{k}W_{rj},\tag{15.82}$$
and, after renaming $r$ into $i$, we have
$$U_{ik}=V^{j}\nabla_{k}W_{ij}.\tag{15.83}$$
In summary, the initial identity
$$U_{\cdot k}^{i}=V^{j}\nabla_{k}W_{\cdot j}^{i}\tag{15.78}$$
implies
$$U_{ik}=V^{j}\nabla_{k}W_{ij}.\tag{15.84}$$
In other words, thanks to the metrinilic property, a free index can indeed be raised or lowered on both sides of an identity even if one or more variants are found under the covariant derivative.
Similarly, it can be shown that a dummy index can be juggled across the covariant derivative, i.e.
$$V^{j}\nabla_{k}W_{\cdot j}^{i}=V_{j}\nabla_{k}W^{ij}.\tag{15.85}$$
The demonstration of this fact is left as an exercise.
Consider the expression
$$\nabla_{k}T_{ij}^{i}\tag{15.86}$$
and note that it can be interpreted in two ways with respect to the order in which the covariant derivative and contraction are applied. On the one hand, it can be interpreted as the covariant derivative applied to the first-order variant $S_{j}=T_{ij}^{i}$. In this interpretation, the expanded expression for the covariant derivative will have a single Christoffel term. On the other hand, it can be seen as the covariant derivative applied to the third-order variant $T_{lj}^{i}$ with the result subsequently contracted on $i$ and $l$. In this interpretation, the expression for the covariant derivative will have three Christoffel terms. Fortunately, as we are about to show, both interpretations lead to the same result.
Let us first interpret $\nabla_{k}T_{ij}^{i}$ as the covariant derivative applied to the variant $S_{j}=T_{ij}^{i}$ of order one. We have
$$\nabla_{k}T_{ij}^{i}=\frac{\partial T_{ij}^{i}}{\partial Z^{k}}-\Gamma_{kj}^{m}T_{im}^{i}.\tag{15.87}$$
In the alternative interpretation, let us apply the covariant derivative to the third-order variant $T_{lj}^{i}$ and subsequently perform the contraction. By the definition of the covariant derivative, we have
$$\nabla_{k}T_{lj}^{i}=\frac{\partial T_{lj}^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}T_{lj}^{m}-\Gamma_{kl}^{m}T_{mj}^{i}-\Gamma_{kj}^{m}T_{lm}^{i}.\tag{15.88}$$
Now, contract $i$ and $l$:
$$\nabla_{k}T_{ij}^{i}=\frac{\partial T_{ij}^{i}}{\partial Z^{k}}+\Gamma_{km}^{i}T_{ij}^{m}-\Gamma_{ki}^{m}T_{mj}^{i}-\Gamma_{kj}^{m}T_{im}^{i}.\tag{15.89}$$
Note that the first two Christoffel terms, i.e. those that correspond to the indices $i$ and $l$, cancel each other out. This can be seen by exchanging the names of the indices $i$ and $m$ in, say, the first term. Thus,
$$\nabla_{k}T_{ij}^{i}=\frac{\partial T_{ij}^{i}}{\partial Z^{k}}-\Gamma_{kj}^{m}T_{im}^{i},\tag{15.90}$$
which is consistent with the first interpretation.
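To spell out the renaming step, here is a short worked line that uses only the symbols of equation (15.89). In the first Christoffel term, both $i$ and $m$ are dummy indices, so exchanging their names leaves its value unchanged:

$$\Gamma_{km}^{i}T_{ij}^{m}=\Gamma_{ki}^{m}T_{mj}^{i}.$$

The right side is exactly the term that enters (15.89) with a minus sign, so the two contributions sum to zero and only $-\Gamma_{kj}^{m}T_{im}^{i}$ survives.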
We now turn to the most crucial property of the covariant derivative, i.e. its tensor property. It states that the result of applying the covariant derivative to a tensor is a tensor of one additional covariant order.
Our proof of this property will be based on an elegant inductive argument. For tensors of order zero, i.e. invariants, the tensor property of $\nabla_{k}T$ follows from the fact that the covariant derivative $\nabla_{k}T$ coincides with the partial derivative $\partial T/\partial Z^{k}$, for which the tensor property was demonstrated in the previous Chapter. We will now turn our attention to first-order tensors and then show how to extend the proof to second- and higher-order tensors.

15.12.1 Proof for a first-order covariant tensor

Consider a covariant tensor $T_{i}$. Our goal is to demonstrate the tensor property of $\nabla_{k}T_{i}$. Of course, we could do this by considering the invariant vector field $\mathbf{T}=T_{i}\mathbf{Z}^{i}$ and then repeating the argument given at the beginning of this Chapter based on the tensor property of $\nabla_{k}\mathbf{T}$. However, that argument has a few shortcomings. For example, it is not applicable to a tensor $\mathbf{T}_{i}$ with vector elements, since there is no such thing as $\mathbf{T}_{i}\mathbf{Z}^{i}$. Moreover, even for tensors with scalar components, it is important to have a proof that utilizes only those objects that are available in the component space. Such a proof would remain valid in the context of Riemannian spaces which we will soon describe. Thus, we will choose to give a direct proof based on demonstrating that $\nabla_{k}T_{i}$ and $\nabla_{k^{\prime}}T_{i^{\prime}}$ are related by the proper transformation rule. Importantly, the analytical logistics of this approach also work for tensors $\mathbf{T}_{i}$ with vector elements. Furthermore, this proof will show the precise manner in which the non-tensor contributions to the transformation rules cancel each other out to produce a tensor.
In the primed coordinates, the variant $\nabla_{k^{\prime}}T_{i^{\prime}}$ is given by
$$\nabla_{k^{\prime}}T_{i^{\prime}}=\frac{\partial T_{i^{\prime}}}{\partial Z^{k^{\prime}}}-\Gamma_{k^{\prime}i^{\prime}}^{m^{\prime}}T_{m^{\prime}}.\tag{15.91}$$
We will now relate each of the elements on the right to their unprimed counterparts.
Let us start with $\partial T_{i^{\prime}}/\partial Z^{k^{\prime}}$, for which the transformation rule was given in the previous Chapter, but never derived. Since $T_{i}$ is a tensor, we have
$$T_{i^{\prime}}=T_{i}J_{i^{\prime}}^{i}.\tag{15.92}$$
As we have done previously on a number of occasions, treat this equation as an identity in the primed coordinates, i.e.
$$T_{i^{\prime}}\left( Z^{\prime}\right) =T_{i}\left( Z\left( Z^{\prime}\right) \right) J_{i^{\prime}}^{i}\left( Z^{\prime}\right) ,\tag{15.93}$$
and differentiate both sides with respect to $Z^{k^{\prime}}$. By the product rule and the chain rule, we find
$$\frac{\partial T_{i^{\prime}}\left( Z^{\prime}\right) }{\partial Z^{k^{\prime}}}=\frac{\partial T_{i}\left( Z\right) }{\partial Z^{k}}\frac{\partial Z^{k}\left( Z^{\prime}\right) }{\partial Z^{k^{\prime}}}J_{i^{\prime}}^{i}+T_{i}\frac{\partial J_{i^{\prime}}^{i}\left( Z^{\prime}\right) }{\partial Z^{k^{\prime}}}.\tag{15.94}$$
Since
$$\frac{\partial Z^{k}\left( Z^{\prime}\right) }{\partial Z^{k^{\prime}}}=J_{k^{\prime}}^{k}\quad\text{and}\quad\frac{\partial J_{i^{\prime}}^{i}}{\partial Z^{k^{\prime}}}=J_{i^{\prime}k^{\prime}}^{i},\tag{15.95}$$
we have
$$\frac{\partial T_{i^{\prime}}}{\partial Z^{k^{\prime}}}=\frac{\partial T_{i}}{\partial Z^{k}}J_{i^{\prime}}^{i}J_{k^{\prime}}^{k}+T_{i}J_{i^{\prime}k^{\prime}}^{i}.\tag{15.96}$$
Thus, we have the transformation rule for the first term on the right in equation
$$\nabla_{k^{\prime}}T_{i^{\prime}}=\frac{\partial T_{i^{\prime}}}{\partial Z^{k^{\prime}}}-\Gamma_{k^{\prime}i^{\prime}}^{m^{\prime}}T_{m^{\prime}}.\tag{15.91}$$
Moving on to the second term, recall that
$$\Gamma_{k^{\prime}i^{\prime}}^{m^{\prime}}=\Gamma_{ki}^{m}J_{m}^{m^{\prime}}J_{k^{\prime}}^{k}J_{i^{\prime}}^{i}+J_{k^{\prime}i^{\prime}}^{m}J_{m}^{m^{\prime}},\tag{13.127}$$
or
$$\Gamma_{k^{\prime}i^{\prime}}^{m^{\prime}}=\left( \Gamma_{ki}^{m}J_{k^{\prime}}^{k}J_{i^{\prime}}^{i}+J_{k^{\prime}i^{\prime}}^{m}\right) J_{m}^{m^{\prime}}.\tag{15.97}$$
Thus, since $J_{m}^{m^{\prime}}T_{m^{\prime}}=T_{m}$, we have
$$\Gamma_{k^{\prime}i^{\prime}}^{m^{\prime}}T_{m^{\prime}}=\left( \Gamma_{ki}^{m}J_{k^{\prime}}^{k}J_{i^{\prime}}^{i}+J_{k^{\prime}i^{\prime}}^{m}\right) J_{m}^{m^{\prime}}T_{m^{\prime}}=\left( \Gamma_{ki}^{m}J_{k^{\prime}}^{k}J_{i^{\prime}}^{i}+J_{k^{\prime}i^{\prime}}^{m}\right) T_{m}.\tag{15.98}$$
Putting the two terms together, we find
$$\nabla_{k^{\prime}}T_{i^{\prime}}=\frac{\partial T_{i}}{\partial Z^{k}}J_{i^{\prime}}^{i}J_{k^{\prime}}^{k}+T_{i}J_{i^{\prime}k^{\prime}}^{i}-\left( \Gamma_{ki}^{m}J_{i^{\prime}}^{i}J_{k^{\prime}}^{k}+J_{k^{\prime}i^{\prime}}^{m}\right) T_{m}.\tag{15.99}$$
The two terms containing the second-order Jacobians -- i.e. the two non-tensor terms! -- cancel each other and we are left with
$$\nabla_{k^{\prime}}T_{i^{\prime}}=\frac{\partial T_{i}}{\partial Z^{k}}J_{i^{\prime}}^{i}J_{k^{\prime}}^{k}-\Gamma_{ki}^{m}J_{i^{\prime}}^{i}J_{k^{\prime}}^{k}T_{m}\tag{15.100}$$
or, factoring out the Jacobians,
$$\nabla_{k^{\prime}}T_{i^{\prime}}=\left( \frac{\partial T_{i}}{\partial Z^{k}}-\Gamma_{ki}^{m}T_{m}\right) J_{i^{\prime}}^{i}J_{k^{\prime}}^{k}.\tag{15.101}$$
Finally, since the quantity in parentheses corresponds to $\nabla_{k}T_{i}$, we have
$$\nabla_{k^{\prime}}T_{i^{\prime}}=\nabla_{k}T_{i}J_{i^{\prime}}^{i}J_{k^{\prime}}^{k},\tag{15.102}$$
precisely as we set out to show.
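The cancellation can also be observed numerically. The sketch below is only an illustration under stated assumptions: the unprimed coordinates are Cartesian, the primed coordinates are polar, and the covariant field `T_cart` is an arbitrary made-up example. All function names are hypothetical. It compares $\nabla_{k^{\prime}}T_{i^{\prime}}$ computed in polar coordinates with $\nabla_{k}T_{i}J_{i^{\prime}}^{i}J_{k^{\prime}}^{k}$, and also shows that the partial derivative alone fails the same comparison.

```python
import math


def T_cart(x, y):
    """An arbitrary covariant field T_i in Cartesian coordinates (illustrative)."""
    return [x * y, x + y * y]


def jacobian(r, th):
    """J[i][ip] = dZ^i/dZ^{i'} for x = r cos(th), y = r sin(th)."""
    return [[math.cos(th), -r * math.sin(th)],
            [math.sin(th), r * math.cos(th)]]


def T_polar(r, th):
    """Polar components T_{i'} = T_i J^i_{i'}."""
    T, J = T_cart(r * math.cos(th), r * math.sin(th)), jacobian(r, th)
    return [sum(T[i] * J[i][ip] for i in range(2)) for ip in range(2)]


def christoffel_polar(r):
    """G[m][k][i] = Gamma^m_{ki} in polar coordinates."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G


def partial(f, point, k, h=1e-6):
    """Central difference of a list-valued function along coordinate k."""
    up, down = list(point), list(point)
    up[k] += h
    down[k] -= h
    return [(a - b) / (2.0 * h) for a, b in zip(f(*up), f(*down))]


def partials_polar(r, th):
    """P[k][i] = dT_{i'}/dZ^{k'} (not a tensor)."""
    return [partial(T_polar, (r, th), k) for k in range(2)]


def cov_polar(r, th):
    """C[k][i] = dT_{i'}/dZ^{k'} - Gamma^{m'}_{k'i'} T_{m'}."""
    P, G, T = partials_polar(r, th), christoffel_polar(r), T_polar(r, th)
    return [[P[k][i] - sum(G[m][k][i] * T[m] for m in range(2))
             for i in range(2)] for k in range(2)]


def transformed(r, th):
    """X[k'][i'] = (nabla_k T_i) J^i_{i'} J^k_{k'}; Christoffel symbols vanish in Cartesian coordinates."""
    x, y = r * math.cos(th), r * math.sin(th)
    C = [partial(T_cart, (x, y), k) for k in range(2)]
    J = jacobian(r, th)
    return [[sum(C[k][i] * J[i][ip] * J[k][kp]
                 for i in range(2) for k in range(2))
             for ip in range(2)] for kp in range(2)]


if __name__ == "__main__":
    print(cov_polar(2.0, 0.7))       # agrees with the next line
    print(transformed(2.0, 0.7))
    print(partials_polar(2.0, 0.7))  # differs: the second-order Jacobian term remains
```

The mismatch in the last line is the term $T_{i}J_{i^{\prime}k^{\prime}}^{i}$ from equation (15.96), which the Christoffel term removes.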

15.12.2 Proof for a first-order contravariant tensor

Naturally, the tensor property of $\nabla_{k}T^{i}$ for a contravariant tensor $T^{i}$ can be demonstrated in the exact same manner as we just used for a covariant tensor $T_{i}$. Pursuing this approach is left as an exercise. Instead, we will now use a different approach based on the use of the quotient theorem. The beauty of this approach is that it can also be used in the inductive step of the overall proof.
Consider a contravariant tensor $T^{i}$ and, for an arbitrary covariant tensor $V_{i}$, form the invariant
$$T=T^{i}V_{i}.\tag{15.103}$$
Apply the covariant derivative to both sides of the above identity:
$$\nabla_{k}T=\nabla_{k}\left( T^{i}V_{i}\right) .\tag{15.104}$$
By the product rule, we have
$$\nabla_{k}T=\nabla_{k}T^{i}~V_{i}+T^{i}~\nabla_{k}V_{i}.\tag{15.105}$$
The variant $\nabla_{k}T^{i}$, whose tensor property we are trying to demonstrate, is found in the term $\nabla_{k}T^{i}~V_{i}$ on the right. Solving for that term, we find:
$$\nabla_{k}T^{i}~V_{i}=\nabla_{k}T-T^{i}~\nabla_{k}V_{i}.\tag{15.106}$$
The combination $\nabla_{k}T^{i}~V_{i}$ is clearly a tensor since each term on the right is a tensor according to the properties proven earlier. Thanks to the arbitrariness of $V_{i}$, the quotient theorem tells us that $\nabla_{k}T^{i}$ is a tensor in its own right, as we set out to prove.

15.12.3 The inductive step

For higher-order tensors, the tensor property of the covariant derivative can be shown by increasing the order by one unit at a time. For a tensor of order two, say $T_{j}^{i}$, construct a first-order tensor $T^{i}$ by contracting $T_{j}^{i}$ with an arbitrary tensor $V^{j}$:
$$T^{i}=T_{j}^{i}V^{j}.\tag{15.107}$$
As before, take the covariant derivative of both sides:
$$\nabla_{k}T^{i}=\nabla_{k}\left( T_{j}^{i}V^{j}\right) .\tag{15.108}$$
By the product rule,
$$\nabla_{k}T^{i}=\nabla_{k}T_{j}^{i}~V^{j}+T_{j}^{i}~\nabla_{k}V^{j}.\tag{15.109}$$
As before, solve for the term containing $\nabla_{k}T_{j}^{i}$:
$$\nabla_{k}T_{j}^{i}~V^{j}=\nabla_{k}T^{i}-T_{j}^{i}~\nabla_{k}V^{j}.\tag{15.110}$$
Once again, all terms on the right are tensors, which leads to the conclusion that $\nabla_{k}T_{j}^{i}$ is a tensor by the quotient theorem.
It is clear that continuing in this fashion, we can demonstrate the tensor property of the covariant derivative with respect to tensors of any order and any indicial signature. This completes the proof of this linchpin property.
The tensor property guarantees the geometric meaningfulness of the covariant derivative and opens the floodgates for producing differential invariants. For example, for a tensor $U^{i}$, the variant $\nabla_{i}U^{j}$ is a tensor in its own right. Therefore, the combination
$$\nabla_{i}U^{i}\tag{15.111}$$
is an invariant. In fact, it is the celebrated divergence operator. We may not immediately know its precise geometric meaning but we can be confident that this quantity is worthy of analysis. Similarly, for an invariant $U$, the combination $Z^{ij}\nabla_{i}\nabla_{j}U$, which can also be expressed in the more compact form
$$\nabla_{i}\nabla^{i}U,\tag{15.112}$$
is an invariant. Of course, you may recognize it as the Laplace operator or the Laplacian of $U$. Note that both the divergence and the Laplacian have elegant geometric interpretations that will be discussed in Chapter 18.
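As a concrete check that these combinations can be evaluated in curvilinear coordinates, the following minimal sketch computes $Z^{ij}\nabla_{i}\nabla_{j}U$ and $\nabla_{i}V^{i}$ in polar coordinates by finite differences. It assumes the standard polar Christoffel symbols and contravariant metric. The function names, step sizes, and test fields are illustrative choices rather than anything prescribed by the text.

```python
import math


def christoffel(r, theta):
    """G[m][i][j] = Gamma^m_{ij} in polar coordinates (index 0 = r, 1 = theta)."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G


def d1(f, p, k, h=1e-4):
    """Central-difference first derivative of scalar f along Z^k."""
    up, down = list(p), list(p)
    up[k] += h
    down[k] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def d2(f, p, i, j, h=1e-4):
    """Central-difference second derivative of scalar f along Z^i and Z^j."""
    def shifted(si, sj):
        q = list(p)
        q[i] += si * h
        q[j] += sj * h
        return f(*q)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)


def laplacian(U, r, theta):
    """Z^{ij} nabla_i nabla_j U, with nabla_i nabla_j U = d2U - Gamma^m_{ij} dU/dZ^m."""
    p, G = (r, theta), christoffel(r, theta)
    Z_up = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]  # contravariant metric tensor
    dU = [d1(U, p, m) for m in range(2)]
    total = 0.0
    for i in range(2):
        for j in range(2):
            second = d2(U, p, i, j) - sum(G[m][i][j] * dU[m] for m in range(2))
            total += Z_up[i][j] * second
    return total


def divergence(V, r, theta):
    """nabla_i V^i = dV^i/dZ^i + Gamma^i_{im} V^m, where V returns [V^1, V^2]."""
    p, G, Vp = (r, theta), christoffel(r, theta), V(r, theta)
    total = 0.0
    for i in range(2):
        total += d1(lambda a, b: V(a, b)[i], p, i)
        total += sum(G[i][i][m] * Vp[m] for m in range(2))
    return total


def harmonic(r, theta):
    """x^2 - y^2 written in polar coordinates; its Laplacian is zero."""
    return r * r * math.cos(2.0 * theta)


def position(r, theta):
    """Contravariant components of the position vector in polar coordinates."""
    return [r, 0.0]


if __name__ == "__main__":
    print(laplacian(harmonic, 2.0, 0.7))   # approximately 0
    print(divergence(position, 2.0, 0.7))  # approximately 2
```

Both values agree with what the same fields give in Cartesian coordinates, as expected of invariants.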
The covariant derivatives commute, i.e.
$$\nabla_{k}\nabla_{l}=\nabla_{l}\nabla_{k},\tag{15.113}$$
when applied to tensors of arbitrary order. To prove this statement, let $T_{j}^{i}$ be a tensor with a representative collection of indices. Consider the combination
$$\nabla_{k}\nabla_{l}T_{j}^{i}-\nabla_{l}\nabla_{k}T_{j}^{i},\tag{15.114}$$
known as a commutator, that is often denoted by
$$\left( \nabla_{k}\nabla_{l}-\nabla_{l}\nabla_{k}\right) T_{j}^{i}.\tag{15.115}$$
By the tensor property of the covariant derivative, we know that the commutator is a tensor. Furthermore, in affine coordinates, where the covariant derivative coincides with the partial derivative, we have
$$\nabla_{k}\nabla_{l}T_{j}^{i}-\nabla_{l}\nabla_{k}T_{j}^{i}=\frac{\partial^{2}T_{j}^{i}}{\partial Z^{k}\partial Z^{l}}-\frac{\partial^{2}T_{j}^{i}}{\partial Z^{l}\partial Z^{k}}.\tag{15.116}$$
Since partial derivatives commute, we conclude that -- in affine coordinates -- the commutator vanishes, i.e.
$$\nabla_{k}\nabla_{l}T_{j}^{i}-\nabla_{l}\nabla_{k}T_{j}^{i}=0.\tag{15.117}$$
Recall that a tensor that vanishes in one coordinate system vanishes in all coordinate systems. Therefore, the identity
$$\nabla_{k}\nabla_{l}T_{j}^{i}-\nabla_{l}\nabla_{k}T_{j}^{i}=0\tag{15.118}$$
holds in all coordinate systems, as we set out to prove.
The commutative property of the covariant derivative appears quite simple and, perhaps, it is. But it is also a fact of utmost profundity. Our proof of it relied not only on the tensor property of the covariant derivative, but also on the availability of affine coordinates, which is a signature characteristic of Euclidean spaces. Therefore, our proof was inextricably linked to the Euclidean nature of space. In fact, in other kinds of spaces -- in particular, Riemannian spaces -- where all other facts that went into the proof remain intact but affine coordinates are not available, commutativity no longer holds. Thus, we must interpret the commutative property of the covariant derivative as a profound analytical characteristic of Euclidean spaces. This insight is captured beautifully by the Riemann-Christoffel tensor which we will now introduce.
Until now, we tended to use indices $k$ and $l$ for the covariant derivatives and $i$ and $j$ for the variants. Let us now switch to the more common choice used in the context of the Riemann-Christoffel tensor, where $i$ and $j$ are typically used for the covariant derivative.
Consider the commutator
$$\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}\tag{15.119}$$
which, as we know, vanishes for all $T^{k}$. Of course, that should not prevent us from expanding the covariant derivatives in terms of the underlying partial derivatives and Christoffel symbols. It is left as an exercise to show that
$$\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}=\left( \frac{\partial\Gamma_{jm}^{k}}{\partial Z^{i}}-\frac{\partial\Gamma_{im}^{k}}{\partial Z^{j}}+\Gamma_{in}^{k}\Gamma_{jm}^{n}-\Gamma_{jn}^{k}\Gamma_{im}^{n}\right) T^{m}.\tag{15.120}$$
As expected, the second-order derivatives of $T^{k}$,
$$\frac{\partial^{2}T^{k}}{\partial Z^{i}\partial Z^{j}}\text{ and }\frac{\partial^{2}T^{k}}{\partial Z^{j}\partial Z^{i}},\tag{15.121}$$
cancelled each other. What is somewhat surprising -- and, at the same time, critical to the entire analysis -- is that the first-order derivatives of $T^{k}$ also cancelled each other.
Since the commutator $\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}$ is a tensor and, therefore,
$$\left( \frac{\partial\Gamma_{jm}^{k}}{\partial Z^{i}}-\frac{\partial\Gamma_{im}^{k}}{\partial Z^{j}}+\Gamma_{in}^{k}\Gamma_{jm}^{n}-\Gamma_{jn}^{k}\Gamma_{im}^{n}\right) T^{m}\tag{15.122}$$
is a tensor for all $T^{m}$, we know from the quotient theorem that the parenthesized expression is a tensor in its own right. It is known as the Riemann-Christoffel tensor and is denoted by $R_{\cdot mij}^{k}$:
$$R_{\cdot mij}^{k}=\frac{\partial\Gamma_{jm}^{k}}{\partial Z^{i}}-\frac{\partial\Gamma_{im}^{k}}{\partial Z^{j}}+\Gamma_{in}^{k}\Gamma_{jm}^{n}-\Gamma_{jn}^{k}\Gamma_{im}^{n}.\tag{15.123}$$
With the help of $R_{\cdot mij}^{k}$, the commutator $\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}$ is captured concisely by the identity
$$\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}=R_{\cdot mij}^{k}T^{m}.\tag{15.124}$$
Since, in a Euclidean space, the commutator $\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}$ vanishes, i.e.
$$R_{\cdot mij}^{k}T^{m}=0\tag{15.125}$$
for all tensors $T^{m}$, we can conclude that all elements of the Riemann-Christoffel tensor are zero, i.e.
$$R_{\cdot mij}^{k}=0.\tag{15.126}$$
Once again, we should point out that although this identity was obtained with relative ease, it is highly nontrivial and is of great depth. It states that for any coordinate system, the identity
$$\frac{\partial\Gamma_{jm}^{k}}{\partial Z^{i}}-\frac{\partial\Gamma_{im}^{k}}{\partial Z^{j}}+\Gamma_{in}^{k}\Gamma_{jm}^{n}-\Gamma_{jn}^{k}\Gamma_{im}^{n}=0\tag{15.127}$$
holds at all points. This is a profound analytical characterization of a Euclidean space and is a direct consequence of the fact that a Euclidean space admits affine coordinates.
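The identity can be probed numerically. The sketch below evaluates the right side of (15.123) by finite differences for any supplied table of Christoffel symbols. It is an illustration under stated assumptions: `gamma_polar` uses the standard polar Christoffel symbols, while `gamma_sphere` uses the well-known Christoffel symbols of the unit sphere in coordinates $(\theta,\varphi)$, which are not part of this Chapter and are included only as a non-Euclidean contrast. All names are hypothetical.

```python
import math
from itertools import product


def gamma_polar(r, theta):
    """G[k][i][j] = Gamma^k_{ij} for polar coordinates in the Euclidean plane."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G


def gamma_sphere(theta, phi):
    """G[k][i][j] for the surface of a unit sphere (outside illustration)."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G


def riemann(gamma, point, h=1e-6):
    """Return a dict R[(k, m, i, j)] with the elements of R^k_{.mij} at `point`."""
    n = len(point)
    G = gamma(*point)
    dG = []  # dG[i][k][a][b] approximates d Gamma^k_{ab} / dZ^i
    for i in range(n):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        Gu, Gd = gamma(*up), gamma(*down)
        dG.append([[[(Gu[k][a][b] - Gd[k][a][b]) / (2.0 * h)
                     for b in range(n)] for a in range(n)] for k in range(n)])
    R = {}
    for k, m, i, j in product(range(n), repeat=4):
        quadratic = sum(G[k][i][q] * G[q][j][m] - G[k][j][q] * G[q][i][m]
                        for q in range(n))
        R[(k, m, i, j)] = dG[i][k][j][m] - dG[j][k][i][m] + quadratic
    return R


if __name__ == "__main__":
    flat = riemann(gamma_polar, (2.0, 0.7))
    print(max(abs(v) for v in flat.values()))  # approximately 0
    curved = riemann(gamma_sphere, (0.7, 0.3))
    print(curved[(0, 1, 0, 1)])                # nonzero on the sphere
```

In polar coordinates every element vanishes up to rounding error, in line with (15.127), while on the sphere at least one element does not, which is consistent with the remark that commutativity is lost when affine coordinates are unavailable.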
Exercise 15.1 Use the inductive approach described in Section 15.4 to show that a reasonable definition for $\nabla_{k}$ when applied to a tensor $T_{ij}$ is
$$\nabla_{k}T_{ij}=\frac{\partial T_{ij}}{\partial Z^{k}}-\Gamma_{ki}^{m}T_{mj}-\Gamma_{kj}^{m}T_{im}.\tag{15.128}$$
Exercise 15.2 Use the inductive approach described in Section 15.4 to show that a reasonable definition for $\nabla_{k}$ when applied to a tensor $T^{ij}$ is
$$\nabla_{k}T^{ij}=\frac{\partial T^{ij}}{\partial Z^{k}}+\Gamma_{km}^{i}T^{mj}+\Gamma_{km}^{j}T^{im}.\tag{15.129}$$
Exercise 15.3 Suppose that in polar coordinates, the contravariant tensor field $U^{i}\left( r,\theta\right)$ is given by
$$\begin{aligned}U^{1}\left( r,\theta\right) & =1\qquad\left(15.130\right)\\U^{2}\left( r,\theta\right) & =1\qquad\left(15.131\right)\end{aligned}$$
Show that $\nabla_{i}U^{j}$ is given by
$$\begin{aligned}\nabla_{1}U^{1} & =0\qquad\left(15.132\right)\\\nabla_{1}U^{2} & =r^{-1}\qquad\left(15.133\right)\\\nabla_{2}U^{1} & =-r\qquad\left(15.134\right)\\\nabla_{2}U^{2} & =r^{-1}\qquad\left(15.135\right)\end{aligned}$$
Explain why it is not surprising that the covariant derivative of a tensor field with constant elements is not zero.
Exercise 15.4 Suppose that in polar coordinates, the contravariant tensor $U^{i}\left( r,\theta\right)$ is given by
$$\begin{aligned}U^{1}\left( r,\theta\right) & =\sin\theta\qquad\left(15.136\right)\\U^{2}\left( r,\theta\right) & =r^{-1}\cos\theta\qquad\left(15.137\right)\end{aligned}$$
Show that
$$\nabla_{i}U^{j}=0.\tag{15.138}$$
Explain why this answer makes sense.
Exercise 15.5 Denote the (rarely considered) components of the position vector $\mathbf{R}$ by $R^{i}$. Show that
$$\nabla_{j}R^{i}=\delta_{j}^{i}.\tag{15.139}$$
Exercise 15.6 Confirm the above relationship in polar, cylindrical, and spherical coordinates.
Exercise 15.7 Explain why in Cartesian coordinates the covariant derivative coincides with the partial derivative.
Exercise 15.8 Show by a direct application of the definition of the covariant derivative that it is metrinilic with respect to the metric tensors and the Kronecker delta, i.e.
$$\begin{aligned}\nabla_{k}Z_{ij} & =0\qquad\left(15.69\right)\\\nabla_{k}Z^{ij} & =0\qquad\left(15.70\right)\\\nabla_{i}\delta_{k}^{j} & =0.\qquad\left(15.72\right)\end{aligned}$$
Exercise 15.9 Show (or, rather, note) that the product rule applies to the dot product of vector-valued variants.
Exercise 15.10 Show that dummy indices in a contraction can exchange their placements "across" the covariant derivative, e.g.
$$V^{j}\nabla_{k}W_{\cdot j}^{i}=V_{j}\nabla_{k}W^{ij}.\tag{15.85}$$
Exercise 15.11 Use the approach outlined in Section 15.12.1 to demonstrate the tensor property of $\nabla_{k}T^{i}$, i.e.
$$\nabla_{k^{\prime}}T^{i^{\prime}}=\nabla_{k}T^{i}J_{i}^{i^{\prime}}J_{k^{\prime}}^{k}.\tag{15.140}$$
Exercise 15.12 Use the approach outlined in Section 15.12.1 to demonstrate the tensor property of $\nabla_{k}T_{j}^{i}$, i.e.
$$\nabla_{k^{\prime}}T_{j^{\prime}}^{i^{\prime}}=\nabla_{k}T_{j}^{i}J_{i}^{i^{\prime}}J_{j^{\prime}}^{j}J_{k^{\prime}}^{k}.\tag{15.141}$$
This should prove to be a very time-consuming, yet worthwhile, exercise.
Exercise 15.13 Show that in Cartesian coordinates, the Laplacian $\nabla_{i}\nabla^{i}U$ is given by
$$\nabla_{i}\nabla^{i}U=\frac{\partial^{2}U}{\partial x^{2}}+\frac{\partial^{2}U}{\partial y^{2}}\tag{15.142}$$
in two dimensions and by
$$\nabla_{i}\nabla^{i}U=\frac{\partial^{2}U}{\partial x^{2}}+\frac{\partial^{2}U}{\partial y^{2}}+\frac{\partial^{2}U}{\partial z^{2}}\tag{15.143}$$
in three dimensions. Are these formulas valid in affine coordinates?
Exercise 15.14 Use the fact that the covariant derivatives commute, i.e.
$$\nabla_{k}\nabla_{l}-\nabla_{l}\nabla_{k}=0,\tag{15.113}$$
to show that the contravariant derivatives commute, i.e.
$$\nabla^{k}\nabla^{l}-\nabla^{l}\nabla^{k}=0,\tag{15.144}$$
and that so do the covariant and the contravariant derivatives, i.e.
$$\nabla^{k}\nabla_{l}-\nabla_{l}\nabla^{k}=0.\tag{15.145}$$
Thus, in a Laplacian $\nabla_{i}\nabla^{i}U$, the order of the derivatives does not matter, i.e.
$$\nabla_{i}\nabla^{i}U=\nabla^{i}\nabla_{i}U.\tag{15.146}$$
Exercise 15.15 Show that
$$\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}=\left( \frac{\partial\Gamma_{jm}^{k}}{\partial Z^{i}}-\frac{\partial\Gamma_{im}^{k}}{\partial Z^{j}}+\Gamma_{in}^{k}\Gamma_{jm}^{n}-\Gamma_{jn}^{k}\Gamma_{im}^{n}\right) T^{m}.\tag{15.120}$$
Exercise 15.16 Confirm the Riemann-Christoffel identity
$$\frac{\partial\Gamma_{jm}^{k}}{\partial Z^{i}}-\frac{\partial\Gamma_{im}^{k}}{\partial Z^{j}}+\Gamma_{in}^{k}\Gamma_{jm}^{n}-\Gamma_{jn}^{k}\Gamma_{im}^{n}=0\tag{15.127}$$
in polar coordinates. This is a time-consuming exercise aimed at building experience with literal calculations represented by compact indicial equations.
Exercise 15.17 Show that the covariant derivatives commute, i.e.
$$\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}=0\tag{15.147}$$
when applied not only to tensors, but to arbitrary variants.
Problem 15.1 For a field $F$, formulate an appropriate definition for the second-order directional derivative
$$\frac{d^{2}F}{dl^{2}}\tag{15.148}$$
along the ray $l$ and show that
$$\frac{d^{2}F}{dl^{2}}=L^{i}L^{j}\nabla_{i}\nabla_{j}F,\tag{15.149}$$
where $L^{i}$ are the components of the unit vector $\mathbf{L}$ that points in the direction of the ray $l$. This problem can be solved relatively easily by introducing affine coordinates. However, it will be more fulfilling and will serve to greatly increase your understanding of covariant differentiation if you solve it in general curvilinear coordinates by leveraging the $\delta/\delta t$-derivative, introduced in Section 12.7.5, along the ray $l$.