Illustrative Applications of the Tensor Notation

In this Chapter, we will describe -- in tensor notation -- a number of fundamental concepts that will be highly relevant to the rest of our narrative. Thus, this Chapter will not only help you increase your proficiency in the tensor notation, but will also help you prepare for what is to come. Among other topics, we will cover linear combinations, the dot product, differentiation of multivariable functions, as well as multivariable inverse functions. By the end of the Chapter, we will be ready to return to Euclidean spaces in order to begin constructing the tensor framework.
Let us begin with linear combinations, the fundamental algebraic building block of Linear Algebra. As a sum of products, a linear combination has the potential to be interpreted as a contraction but, first, an important adjustment needs to be made.
Consider the decomposition of a vector $\mathbf{U}$ as a linear combination of the basis vectors $\mathbf{b}_{1}$, $\mathbf{b}_{2}$, and $\mathbf{b}_{3}$ with coefficients $\alpha_{1}$, $\alpha_{2}$, and $\alpha_{3}$, i.e.
$$\mathbf{U}=\alpha_{1}\mathbf{b}_{1}+\alpha_{2}\mathbf{b}_{2}+\alpha_{3}\mathbf{b}_{3}.\tag{8.1}$$
Since both indices are subscripts, the above summation does not represent a valid contraction. In order to make it one, either the coefficients $\alpha_{i}$ or the basis vectors $\mathbf{b}_{i}$ must be enumerated by a superscript. We arbitrarily choose the coefficients. With the coefficients denoted by $\alpha^{1}$, $\alpha^{2}$, and $\alpha^{3}$, the linear combination becomes
$$\mathbf{U}=\alpha^{1}\mathbf{b}_{1}+\alpha^{2}\mathbf{b}_{2}+\alpha^{3}\mathbf{b}_{3}.\tag{8.2}$$
As a result, it is now a valid contraction. Invoking the summation convention, we can now express $\mathbf{U}$ by the remarkably compact equation
$$\mathbf{U}=\alpha^{i}\mathbf{b}_{i}.\tag{8.3}$$
This example perfectly illustrates the guiding aspect of the tensor framework. Our decision to enumerate the coefficients $\alpha^{i}$ by a superscript was governed by nothing other than our desire to invoke the summation convention. However, in doing so, we have made a prediction of how the coefficients $\alpha^{i}$ transform under a change of basis. Crucially, our prediction is correct, as will be confirmed in Chapter 14. We have highlighted this surprising aspect of the tensor notation before but it is worth reiterating: it not only captures the properties of objects, but also predicts them.
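For readers who like to see a contraction carried out numerically, the following minimal sketch evaluates $\mathbf{U}=\alpha^{i}\mathbf{b}_{i}$ in plain Python. The basis vectors, the coefficient values, and the helper name `linear_combination` are illustrative assumptions rather than anything taken from the text; the basis vectors are stored by their Cartesian components purely so that the sum can be computed.

```python
# Basis vectors b_i, stored by Cartesian components (illustrative values).
b = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
# Coefficients alpha^i; the list position plays the role of the superscript.
alpha = [2.0, -1.0, 3.0]


def linear_combination(coeffs, basis):
    """Return the contraction coeffs^i basis_i as a tuple of components."""
    dim = len(basis[0])
    return tuple(
        sum(coeffs[i] * basis[i][k] for i in range(len(basis)))
        for k in range(dim)
    )


U = linear_combination(alpha, b)
print(U)
```

The inner `sum` over `i` is exactly the summation that the repeated index in $\alpha^{i}\mathbf{b}_{i}$ abbreviates.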
We will frequently encounter the situation where the decomposition of one and the same vector with respect to one and the same basis is arrived at in two different ways. Since, as is well known from elementary Linear Algebra, the decomposition is unique, the decomposition coefficients -- i.e. the components -- in the two alternatively derived expansions must coincide. In other words, from the equality of two linear combinations we can conclude the equality of the coefficients. While this is a straightforward matter, the tensor notation captures linear combinations in such a compact manner that this logic may sometimes prove elusive. It is therefore worth illustrating with a few examples.
Suppose that the basis consists of the vectors $\mathbf{b}_{i}$. Then from
$$\alpha^{i}\mathbf{b}_{i}=\gamma^{i}\mathbf{b}_{i},\tag{8.4}$$
we can conclude that
$$\alpha^{i}=\gamma^{i}.\tag{8.5}$$
To see how this conclusion is reached, unpack both sides of the former equation, i.e.
$$\alpha^{1}\mathbf{b}_{1}+\alpha^{2}\mathbf{b}_{2}+\alpha^{3}\mathbf{b}_{3}=\gamma^{1}\mathbf{b}_{1}+\gamma^{2}\mathbf{b}_{2}+\gamma^{3}\mathbf{b}_{3}.\tag{8.6}$$
With the equation in this form, it becomes clear that the Linear Algebra principle of equating expansion coefficients applies. Also, the unpacked form prevents you from slipping into the incorrect argument of "dividing" both sides of the identity $\alpha^{i}\mathbf{b}_{i}=\gamma^{i}\mathbf{b}_{i}$ by $\mathbf{b}_{i}$. Note that even as your experience with the tensor notation grows, you will not cease to practice unpacking. You will simply learn to imagine and process the unpacked form more quickly.
The equality of linear combinations may appear in more algebraically complicated forms. For example, from the identity
$$A_{j}^{i}\alpha^{j}\mathbf{b}_{i}=\gamma^{i}\mathbf{b}_{i},\tag{8.7}$$
we can conclude that
$$A_{j}^{i}\alpha^{j}=\gamma^{i}.\tag{8.8}$$
Note that we need not concern ourselves with the interpretation of the combination $A_{j}^{i}\alpha^{j}$ in order to reach this conclusion.
In situations where the repeated indices enumerating the elements of the basis are different on the two sides of the equation, e.g.
$$A_{jk}^{i}\alpha^{j}\beta^{k}\mathbf{b}_{i}=D_{i}^{j}\gamma^{i}\mathbf{b}_{j},\tag{8.9}$$
you will need to rename one of the pairs of the repeated indices so that the two pairs match. In the above example, switch the roles of $i$ and $j$ on the right, i.e.
$$A_{jk}^{i}\alpha^{j}\beta^{k}\mathbf{b}_{i}=D_{j}^{i}\gamma^{j}\mathbf{b}_{i}.\tag{8.10}$$
From this form of the identity, we are able to conclude that
$$A_{jk}^{i}\alpha^{j}\beta^{k}=D_{j}^{i}\gamma^{j}.\tag{8.11}$$
I hope that the foregoing examples illustrate that while the tensor notation may require some practice, its compactness does not obfuscate the ideas that you may be used to seeing in the unpacked form.
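The renaming of dummy indices used to pass from (8.9) to (8.10) can be checked numerically. The sketch below, in plain Python, evaluates the right side of both equations and confirms that they produce the same vector, and then reads off the coefficients $D_{j}^{i}\gamma^{j}$. The numerical values and the storage convention `D[subscript][superscript]` are assumptions made for illustration only.

```python
n = 2
# D[s][t] stores the element with subscript s and superscript t (assumed layout).
D = [[1, 2], [3, 4]]
gamma = [5, 6]                 # gamma^i
b = [(1, 0), (1, 1)]           # basis vectors b_i by Cartesian components

# Right side of (8.9): D_i^j gamma^i b_j, summed over both i and j.
rhs_original = tuple(
    sum(D[i][j] * gamma[i] * b[j][k] for i in range(n) for j in range(n))
    for k in range(2)
)

# Right side of (8.10): D_j^i gamma^j b_i, the same sum with i and j renamed.
rhs_renamed = tuple(
    sum(D[j][i] * gamma[j] * b[i][k] for i in range(n) for j in range(n))
    for k in range(2)
)

# Coefficients multiplying b_i in (8.10): D_j^i gamma^j, one value per live index i.
coeff = [sum(D[j][i] * gamma[j] for j in range(n)) for i in range(n)]

print(rhs_original, rhs_renamed, coeff)
```

Since a dummy index is nothing more than a loop variable, swapping the names `i` and `j` cannot change the value of the sum, which is all that the renaming step asserts.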
In the near future, we will study transformation of systems under a change of coordinates. The corresponding topic in Linear Algebra is referred to as change of basis. In this Section, we will use the tensor notation to demonstrate that the matrices relating the two bases are the inverses of each other.
Suppose that $\mathbf{b}_{i}$ and $\mathbf{c}_{i}$ are two alternative bases. Let the system $B_{j}^{i}$ express the elements of the basis $\mathbf{b}_{j}$ with respect to $\mathbf{c}_{i}$, i.e.
$$\mathbf{b}_{j}=B_{j}^{i}\mathbf{c}_{i},\tag{8.12}$$
and $C_{j}^{i}$ express the elements of the basis $\mathbf{c}_{j}$ with respect to $\mathbf{b}_{i}$, i.e.
$$\mathbf{c}_{j}=C_{j}^{i}\mathbf{b}_{i}.\tag{8.13}$$
In other words, $B_{j}^{i}$ is the component of $\mathbf{b}_{j}$ with respect to $\mathbf{c}_{i}$ and $C_{j}^{i}$ is the component of $\mathbf{c}_{j}$ with respect to $\mathbf{b}_{i}$. Because the systems $B_{j}^{i}$ and $C_{j}^{i}$ perform opposite conversions, we may anticipate that they are the inverses of each other in the matrix sense. This is, indeed, the case. Furthermore, this fact is naturally demonstrated in the tensor notation and will offer us an opportunity to practice combining indicial expressions.
The idea is to substitute the identity $\mathbf{c}_{j}=C_{j}^{i}\mathbf{b}_{i}$ into $\mathbf{b}_{j}=B_{j}^{i}\mathbf{c}_{i}$ and thus express the basis $\mathbf{b}_{i}$ in terms of itself. However, since the equation
$$\mathbf{c}_{j}=C_{j}^{i}\mathbf{b}_{i}\tag{8.13}$$
features $\mathbf{c}_{j}$ (with a $j$) while the equation
$$\mathbf{b}_{j}=B_{j}^{i}\mathbf{c}_{i}\tag{8.12}$$
features $\mathbf{c}_{i}$ (with an $i$), substitution is not possible until we properly coordinate the index names. In the identity
$$\mathbf{c}_{j}=C_{j}^{i}\mathbf{b}_{i}\tag{8.13}$$
rename $i$ into $k$ (in order to free up $i$ for the next step), i.e.
$$\mathbf{c}_{j}=C_{j}^{k}\mathbf{b}_{k},\tag{8.14}$$
and then $j$ into $i$, i.e.
$$\mathbf{c}_{i}=C_{i}^{k}\mathbf{b}_{k}.\tag{8.15}$$
Next, substitute the above identity into
$$\mathbf{b}_{j}=B_{j}^{i}\mathbf{c}_{i},\tag{8.12}$$
which yields
$$\mathbf{b}_{j}=B_{j}^{i}C_{i}^{k}\mathbf{b}_{k}.\tag{8.16}$$
From here, one can proceed according to one of two approaches: the tensor-novice or the tensor-expert. We will describe both approaches. The tensor-novice approach analyzes the elementary operations that take place under the hood. The tensor-expert approach showcases the elegance and the efficiency of the tensor notation.
For the tensor-novice approach, begin by unpacking the identity
$$\mathbf{b}_{j}=B_{j}^{i}C_{i}^{k}\mathbf{b}_{k}\tag{8.16}$$
on the dummy index $k$, i.e.
$$\mathbf{b}_{j}=B_{j}^{i}C_{i}^{1}\mathbf{b}_{1}+B_{j}^{i}C_{i}^{2}\mathbf{b}_{2}+B_{j}^{i}C_{i}^{3}\mathbf{b}_{3},\tag{8.17}$$
and subsequently on the live index $j$ which, as we discussed in the previous Chapter, expands the equation into a set of three, i.e.
$$\begin{aligned}
\mathbf{b}_{1} & =B_{1}^{i}C_{i}^{1}\mathbf{b}_{1}+B_{1}^{i}C_{i}^{2}\mathbf{b}_{2}+B_{1}^{i}C_{i}^{3}\mathbf{b}_{3}\qquad(8.18)\\
\mathbf{b}_{2} & =B_{2}^{i}C_{i}^{1}\mathbf{b}_{1}+B_{2}^{i}C_{i}^{2}\mathbf{b}_{2}+B_{2}^{i}C_{i}^{3}\mathbf{b}_{3}\qquad(8.19)\\
\mathbf{b}_{3} & =B_{3}^{i}C_{i}^{1}\mathbf{b}_{1}+B_{3}^{i}C_{i}^{2}\mathbf{b}_{2}+B_{3}^{i}C_{i}^{3}\mathbf{b}_{3},\qquad(8.20)
\end{aligned}$$
where the identities are still not fully unpacked: each equation contains three contractions on the index $i$ that remain packed.
The above identities express the elements of the basis $\mathbf{b}_{i}$ with respect to itself. From elementary Linear Algebra, we know that there is a unique way of doing so and that is
$$\begin{aligned}
\mathbf{b}_{1} & =1\mathbf{b}_{1}+0\mathbf{b}_{2}+0\mathbf{b}_{3}\qquad(8.21)\\
\mathbf{b}_{2} & =0\mathbf{b}_{1}+1\mathbf{b}_{2}+0\mathbf{b}_{3}\qquad(8.22)\\
\mathbf{b}_{3} & =0\mathbf{b}_{1}+0\mathbf{b}_{2}+1\mathbf{b}_{3}.\qquad(8.23)
\end{aligned}$$
Since equality of linear combinations implies equality of coefficients, we arrive at the following nine identities:
$$\begin{array}{lll}
B_{1}^{i}C_{i}^{1}=1 & B_{1}^{i}C_{i}^{2}=0 & B_{1}^{i}C_{i}^{3}=0\\
B_{2}^{i}C_{i}^{1}=0 & B_{2}^{i}C_{i}^{2}=1 & B_{2}^{i}C_{i}^{3}=0\\
B_{3}^{i}C_{i}^{1}=0 & B_{3}^{i}C_{i}^{2}=0 & B_{3}^{i}C_{i}^{3}=1.
\end{array}\tag{8.24}$$
Finally, note that with the help of the indicial notation and the Kronecker delta $\delta_{j}^{k}$, these nine identities can be captured by the single tensor equation
$$B_{j}^{i}C_{i}^{k}=\delta_{j}^{k}.\tag{8.25}$$
We are now able to formulate our conclusion. Since the Kronecker delta corresponds to the identity matrix, this identity states precisely that the matrices corresponding to $B_{j}^{i}$ and $C_{j}^{i}$ are the inverses of each other, as we set out to show. The advantage of the approach that we have just described is that it exposed almost every elementary low-level detail of the calculation. On the other hand, its exhaustive details may have obscured the simple algebraic structure of the equations. This approach therefore cannot be considered an effective use of the tensor notation.
The tensor-expert approach brings out the elegance of the argument by keeping the equations compact. In the equation
$$\mathbf{b}_{j}=B_{j}^{i}C_{i}^{k}\mathbf{b}_{k},\tag{8.16}$$
replace $\mathbf{b}_{j}$ with the equivalent expression $\delta_{j}^{k}\mathbf{b}_{k}$. The resulting identity
$$\delta_{j}^{k}\mathbf{b}_{k}=B_{j}^{i}C_{i}^{k}\mathbf{b}_{k}\tag{8.26}$$
shows the equivalence of the two linear combinations with respect to the same basis, which is precisely the situation we described earlier. Equating the coefficients, we find
$$\delta_{j}^{k}=B_{j}^{i}C_{i}^{k}.\tag{8.27}$$
Switching the sides, i.e.
$$B_{j}^{i}C_{i}^{k}=\delta_{j}^{k},\tag{8.25}$$
we immediately recognize it as precisely the equation we set out to prove.
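The conclusion $B_{j}^{i}C_{i}^{k}=\delta_{j}^{k}$ can be tried out on a small concrete case. The following plain-Python sketch works in two dimensions with hand-picked integer values, all of which are assumptions for illustration: it builds $\mathbf{c}_{j}=C_{j}^{i}\mathbf{b}_{i}$, confirms that the chosen $B_{j}^{i}$ recovers $\mathbf{b}_{j}=B_{j}^{i}\mathbf{c}_{i}$, and then evaluates the contraction on $i$. The storage convention `X[subscript][superscript]` and the helper name `combine` are likewise hypothetical.

```python
n = 2
b = [(1, 0), (1, 1)]          # basis b_i by Cartesian components (illustrative)
C = [[2, 1], [1, 1]]          # C[j][i] stores C_j^i
B = [[1, -1], [-1, 2]]        # B[j][i] stores B_j^i, chosen by hand for this C


def combine(coeffs, basis):
    """Return the vectors coeffs_j^i basis_i, one for each live index j."""
    dim = len(basis[0])
    return [
        tuple(sum(coeffs[j][i] * basis[i][k] for i in range(n)) for k in range(dim))
        for j in range(n)
    ]


c = combine(C, b)             # c_j = C_j^i b_i, equation (8.13)
b_again = combine(B, c)       # B_j^i c_i, which should reproduce b_j, equation (8.12)

# The contraction B_j^i C_i^k, with j and k live and i summed over.
product = [
    [sum(B[j][i] * C[i][k] for i in range(n)) for k in range(n)]
    for j in range(n)
]
delta = [[1 if j == k else 0 for k in range(n)] for j in range(n)]

print(c, b_again, product)
```

With this index layout, the contraction on $i$ coincides with the ordinary matrix product of the two arrays, which is why the Kronecker delta on the right corresponds to the statement that the matrices are inverses of each other.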
Recall from Chapter 2 that for two vectors $\mathbf{U}$ and $\mathbf{V}$, the matrix form of the component space expression for $\mathbf{U}\cdot\mathbf{V}$ reads
$$\mathbf{U}\cdot\mathbf{V}=U^{T}MV,\tag{2.72}$$
where $U$ and $V$ are the $n\times1$ matrices representing the components of $\mathbf{U}$ and $\mathbf{V}$, and $M$ is the $n\times n$ matrix of pairwise dot products of the basis vectors $\mathbf{b}_{i}$, i.e.
$$M=\left[\begin{array}{ccc}
\mathbf{b}_{1}\cdot\mathbf{b}_{1} & \mathbf{b}_{1}\cdot\mathbf{b}_{2} & \mathbf{b}_{1}\cdot\mathbf{b}_{3}\\
\mathbf{b}_{2}\cdot\mathbf{b}_{1} & \mathbf{b}_{2}\cdot\mathbf{b}_{2} & \mathbf{b}_{2}\cdot\mathbf{b}_{3}\\
\mathbf{b}_{3}\cdot\mathbf{b}_{1} & \mathbf{b}_{3}\cdot\mathbf{b}_{2} & \mathbf{b}_{3}\cdot\mathbf{b}_{3}
\end{array}\right].\tag{2.53}$$
As elegant as the equation
$$\mathbf{U}\cdot\mathbf{V}=U^{T}MV\tag{2.72}$$
may be, its tensor analogue is every bit as aesthetically pleasing. In fact, with the introduction of index juggling in Chapter 11, it will become even more so.
The entries of the matrix $M$ are naturally enumerated by a pair of subscripts, i.e.
$$M_{ij}=\mathbf{b}_{i}\cdot\mathbf{b}_{j}.\tag{8.28}$$
Once again, the choice of subscripts is dictated strictly by the rules of the tensor notation rather than some a priori insight into the nature of the system $M_{ij}$ or knowledge of how it transforms under a change of basis. The tensor notation stipulates that all terms in an identity must have matching indicial signatures. Thus, the fact that the expression $\mathbf{b}_{i}\cdot\mathbf{b}_{j}$ has two subscripts means that we have no choice but to enumerate the entries of $M$ by subscripts. The fact that this choice accurately predicts the manner in which $M_{ij}$ transforms under a change of basis, which will be confirmed in Chapter 14, is simply yet another illustration of the great predictive ability of the tensor notation.
Next, following the convention adopted earlier, which was also dictated by the rules of the tensor notation, enumerate the components $U^{i}$ and $V^{i}$ of $\mathbf{U}$ and $\mathbf{V}$ by superscripts, i.e.
$$\mathbf{U}=U^{i}\mathbf{b}_{i}\quad\text{and}\quad\mathbf{V}=V^{i}\mathbf{b}_{i}.\tag{8.29}$$
We will now show that in terms of $M_{ij}$, $U^{i}$, and $V^{i}$, the expression for $\mathbf{U}\cdot\mathbf{V}$ reads
$$\mathbf{U}\cdot\mathbf{V}=M_{ij}U^{i}V^{j}.\tag{8.30}$$
First, let us convince ourselves that this formula is correct by fully unpacking the contractions on the right, which represent a sum of nine terms, i.e.
$$\begin{aligned}
\mathbf{U}\cdot\mathbf{V} & =\mathbf{b}_{1}\cdot\mathbf{b}_{1}U^{1}V^{1}+\mathbf{b}_{1}\cdot\mathbf{b}_{2}U^{1}V^{2}+\mathbf{b}_{1}\cdot\mathbf{b}_{3}U^{1}V^{3}\\
& \quad+\mathbf{b}_{2}\cdot\mathbf{b}_{1}U^{2}V^{1}+\mathbf{b}_{2}\cdot\mathbf{b}_{2}U^{2}V^{2}+\mathbf{b}_{2}\cdot\mathbf{b}_{3}U^{2}V^{3}\\
& \quad+\mathbf{b}_{3}\cdot\mathbf{b}_{1}U^{3}V^{1}+\mathbf{b}_{3}\cdot\mathbf{b}_{2}U^{3}V^{2}+\mathbf{b}_{3}\cdot\mathbf{b}_{3}U^{3}V^{3}.
\end{aligned}\tag{8.31}$$
This equation is identical to the equation
$$\begin{aligned}
\mathbf{U}\cdot\mathbf{V} & =U_{1}V_{1}\mathbf{b}_{1}\cdot\mathbf{b}_{1}+U_{1}V_{2}\mathbf{b}_{1}\cdot\mathbf{b}_{2}+U_{1}V_{3}\mathbf{b}_{1}\cdot\mathbf{b}_{3}\\
& \quad+U_{2}V_{1}\mathbf{b}_{2}\cdot\mathbf{b}_{1}+U_{2}V_{2}\mathbf{b}_{2}\cdot\mathbf{b}_{2}+U_{2}V_{3}\mathbf{b}_{2}\cdot\mathbf{b}_{3}\\
& \quad+U_{3}V_{1}\mathbf{b}_{3}\cdot\mathbf{b}_{1}+U_{3}V_{2}\mathbf{b}_{3}\cdot\mathbf{b}_{2}+U_{3}V_{3}\mathbf{b}_{3}\cdot\mathbf{b}_{3}
\end{aligned}\tag{2.69}$$
derived in Chapter 2, except for the fact that the components of the vectors are now enumerated by superscripts. Thus, the combination $M_{ij}U^{i}V^{j}$ indeed represents the dot product $\mathbf{U}\cdot\mathbf{V}$.
Thanks to the summation convention, the expression $M_{ij}U^{i}V^{j}$ is every bit as compact as its matrix analogue $U^{T}MV$. Furthermore, the indicial form offers a few notational advantages over the matrix form. First, it does not require the operation of the transpose. As we discussed in the previous Chapter, this speaks to the extreme economy of operations in Tensor Calculus. Second, as we also discussed in the previous Chapter, the order of the multiplicative terms in the expression $M_{ij}U^{i}V^{j}$ is immaterial. Third, the expression $M_{ij}U^{i}V^{j}$ gives access to the individual entries of the matrix $M$ as well as the individual components of $\mathbf{U}$ and $\mathbf{V}$. This will prove to be of crucial advantage in numerous applications, including quadratic form minimization discussed below. Finally, note that with the help of index juggling introduced in Chapter 11, the expression $M_{ij}U^{i}V^{j}$ will be supplanted by the remarkably compact equivalent $U_{i}V^{i}$ which, while valid for all bases, exhibits the utmost simplicity of the dot product expressed with respect to an orthonormal basis.
Next, let us re-derive the identity
$$\mathbf{U}\cdot\mathbf{V}=M_{ij}U^{i}V^{j}\tag{8.30}$$
strictly in the tensor notation, as opposed to by matching up $M_{ij}U^{i}V^{j}$ with a previously established result. On the one hand, the upcoming calculation is almost too simple to be called a derivation. On the other hand, it does require a careful manipulation of indices and will therefore serve as a worthwhile exercise of the tensor notation.
Start with the decompositions of $\mathbf{U}$ and $\mathbf{V}$ in terms of the basis $\mathbf{b}_{i}$, i.e.
$$\mathbf{U}=U^{i}\mathbf{b}_{i}\quad\text{and}\quad\mathbf{V}=V^{i}\mathbf{b}_{i}.\tag{8.29}$$
Since we are about to combine these expressions in a single product, they cannot both use the index $i$ -- otherwise, we would end up with an invalid combination $U^{i}\mathbf{b}_{i}\cdot V^{i}\mathbf{b}_{i}$. Thus, we will keep $i$ in the expression for $\mathbf{U}$ and switch to $j$ in the expression for $\mathbf{V}$, i.e.
$$\mathbf{U}=U^{i}\mathbf{b}_{i}\quad\text{and}\quad\mathbf{V}=V^{j}\mathbf{b}_{j}.\tag{8.32}$$
Dotting the two identities, we find
$$\mathbf{U}\cdot\mathbf{V}=U^{i}\mathbf{b}_{i}\cdot V^{j}\mathbf{b}_{j}.\tag{8.33}$$
(Recall from Section 7.8.2 the discussion concerning the subtleties inherent in expressions that feature two or more simultaneous contractions.) Rearrange the terms on the right to bring the two vectors together, i.e.
$$\mathbf{U}\cdot\mathbf{V}=\mathbf{b}_{i}\cdot\mathbf{b}_{j}U^{i}V^{j}.\tag{8.34}$$
Since
$$M_{ij}=\mathbf{b}_{i}\cdot\mathbf{b}_{j},\tag{8.28}$$
we arrive at the desired result
$$\mathbf{U}\cdot\mathbf{V}=M_{ij}U^{i}V^{j}.\tag{8.30}$$
The foregoing discussion is important for two reasons. First, the dot product is a central operation in Geometry, and therefore in Tensor Calculus, and its component space representation -- later to be referred to as the coordinate space representation -- is of utmost value. Second, the discussion illustrated that the tensor notation is an effective tool for deriving algebraic relationships. Note that in Chapter 2, we essentially guessed the equation
$$\mathbf{U}\cdot\mathbf{V}=U^{T}MV\tag{2.72}$$
and subsequently observed its correctness. In this Section, with the help of the tensor notation, we were able to arrive at the equivalent equation
$$\mathbf{U}\cdot\mathbf{V}=M_{ij}U^{i}V^{j}\tag{8.30}$$
by straightforward algebraic manipulation.
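As a numerical companion to the identity $\mathbf{U}\cdot\mathbf{V}=M_{ij}U^{i}V^{j}$, the sketch below evaluates the double contraction for a basis that is deliberately not orthonormal and compares it with the ordinary Cartesian dot product of the assembled vectors. The basis, the component values, and the helper names `dot` and `assemble` are illustrative assumptions.

```python
n = 3
b = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]    # a non-orthonormal basis (illustrative)
U_comp = [2, -1, 3]                      # components U^i
V_comp = [1, 4, -2]                      # components V^i


def dot(x, y):
    """Ordinary Cartesian dot product of two component tuples."""
    return sum(xk * yk for xk, yk in zip(x, y))


def assemble(comp, basis):
    """Return the vector comp^i basis_i by its Cartesian components."""
    return tuple(sum(comp[i] * basis[i][k] for i in range(n)) for k in range(3))


# M_ij = b_i . b_j, equation (8.28).
M = [[dot(b[i], b[j]) for j in range(n)] for i in range(n)]

# The double contraction M_ij U^i V^j, equation (8.30).
contracted = sum(
    M[i][j] * U_comp[i] * V_comp[j] for i in range(n) for j in range(n)
)

# Reference value: dot the assembled vectors directly.
direct = dot(assemble(U_comp, b), assemble(V_comp, b))

print(M, contracted, direct)
```

Note that the two sums over `i` and `j` can be nested in either order and the three factors can be multiplied in any order, mirroring the remark that the order of the multiplicative terms in $M_{ij}U^{i}V^{j}$ is immaterial.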
Many of the fundamental identities in Tensor Calculus are obtained by differentiating identities involving composite functions. As a result, such analyses rely heavily on the use of the chain rule. Fortunately, as we are about to demonstrate, the multivariate chain rule lends itself perfectly to the tensor notation and illustrates another one of its natural applications.

8.5.1 The case of one independent variable

We will begin with a function $H(t)$ of a single variable given by the composition
$$H(t)=F\left(a(t),b(t),c(t)\right),\tag{8.35}$$
where $F(x,y,z)$ is a function of three variables, and each of $a(t)$, $b(t)$, and $c(t)$ is a function of $t$. According to the chain rule, the derivative $H^{\prime}(t)$ is given in terms of the partial derivatives of $F(x,y,z)$ and the ordinary derivatives of $a(t)$, $b(t)$, and $c(t)$ by the identity
$$\frac{dH}{dt}=\frac{\partial F}{\partial x}\frac{da}{dt}+\frac{\partial F}{\partial y}\frac{db}{dt}+\frac{\partial F}{\partial z}\frac{dc}{dt}.\tag{8.36}$$
Note the convention that we will use throughout our narrative of suppressing the arguments of functions when they are clear from the context. With the full detail of the arguments included, the above equation would read
$$\begin{aligned}
\frac{dH(t)}{dt} & =\left.\frac{\partial F(x,y,z)}{\partial x}\right\vert_{x=a(t),\,y=b(t),\,z=c(t)}\frac{da(t)}{dt}\\
& \quad+\left.\frac{\partial F(x,y,z)}{\partial y}\right\vert_{x=a(t),\,y=b(t),\,z=c(t)}\frac{db(t)}{dt}\\
& \quad+\left.\frac{\partial F(x,y,z)}{\partial z}\right\vert_{x=a(t),\,y=b(t),\,z=c(t)}\frac{dc(t)}{dt},
\end{aligned}\tag{8.37}$$
which makes it evident why we prefer
$$\frac{dH}{dt}=\frac{\partial F}{\partial x}\frac{da}{dt}+\frac{\partial F}{\partial y}\frac{db}{dt}+\frac{\partial F}{\partial z}\frac{dc}{dt}.\tag{8.36}$$
Our present goal is to express the right side in the tensor notation. Fortunately, being a sum of products, it is ready to be interpreted as a contraction. To this end, denote the arguments of $F$ by $z^{1}$, $z^{2}$, and $z^{3}$, turning $F(x,y,z)$ into $F\left(z^{1},z^{2},z^{3}\right)$. Also, denote the functions $a(t)$, $b(t)$, and $c(t)$ by $c^{1}(t)$, $c^{2}(t)$, and $c^{3}(t)$. In terms of the new symbols, $H(t)$ is given by
$$H(t)=F\left(c^{1}(t),c^{2}(t),c^{3}(t)\right)\tag{8.38}$$
while the expression for its derivative reads
$$\frac{dH}{dt}=\frac{\partial F}{\partial z^{1}}\frac{dc^{1}}{dt}+\frac{\partial F}{\partial z^{2}}\frac{dc^{2}}{dt}+\frac{\partial F}{\partial z^{3}}\frac{dc^{3}}{dt}.\tag{8.39}$$
Now the expression on the right can be easily captured with the help of the summation convention, i.e.
$$\frac{dH}{dt}=\frac{\partial F}{\partial z^{i}}\frac{dc^{i}}{dt}.\tag{8.40}$$
As we discussed at the end of Section 7.3, the index $i$ in the symbol $\partial F/\partial z^{i}$ can be thought of as a subscript because it is a superscript in the "denominator" of a "fraction". Therefore, the repeated index appears once as a subscript and once as a superscript, thus properly triggering Einstein's summation convention.
Finally, if you still find it helpful to visualize the matrix form of indicial expressions, note that the above identity can be captured by the equation
$$\frac{dH(t)}{dt}=\left[\begin{array}{ccc}
\frac{\partial F}{\partial z^{1}} & \frac{\partial F}{\partial z^{2}} & \frac{\partial F}{\partial z^{3}}
\end{array}\right]\left[\begin{array}{c}
\frac{dc^{1}}{dt}\\
\frac{dc^{2}}{dt}\\
\frac{dc^{3}}{dt}
\end{array}\right].\tag{8.41}$$
We will continue to provide the corresponding matrix forms for each of the chain rule identities in this Section.
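To see the contraction $\dfrac{\partial F}{\partial z^{i}}\dfrac{dc^{i}}{dt}$ at work, here is a minimal Python sketch for one hand-picked example. The particular choices $F=z^{1}z^{2}+\left(z^{3}\right)^{2}$ and $c^{i}(t)=\left(t,t^{2},t^{3}\right)$, as well as all function names, are assumptions made for illustration; the derivatives are entered analytically so that the result can be compared with the derivative of $H(t)=t^{3}+t^{6}$ computed directly.

```python
def grad_F(z):
    """Partial derivatives dF/dz^i of F = z1*z2 + z3**2, listed by the index i."""
    z1, z2, z3 = z
    return [z2, z1, 2 * z3]


def c(t):
    """The inner functions c^i(t)."""
    return [t, t**2, t**3]


def dc_dt(t):
    """The ordinary derivatives dc^i/dt."""
    return [1, 2 * t, 3 * t**2]


def dH_dt(t):
    """Chain rule (8.40): contract dF/dz^i, evaluated at z = c(t), with dc^i/dt."""
    dF = grad_F(c(t))
    dc = dc_dt(t)
    return sum(dF[i] * dc[i] for i in range(3))


def dH_dt_direct(t):
    """Derivative of H(t) = t*t**2 + (t**3)**2 = t**3 + t**6, computed by hand."""
    return 3 * t**2 + 6 * t**5


print(dH_dt(2), dH_dt_direct(2))
```

The single `sum` over `i` plays the role of the row-times-column product in the matrix form of the same identity.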

8.5.2 The case of several independent variables

Let us now consider a function $H(u,v)$ of two variables formed by composing the function $F(x,y,z)$ with functions of two variables $a(u,v)$, $b(u,v)$, and $c(u,v)$, i.e.
$$H(u,v)=F\left(a(u,v),b(u,v),c(u,v)\right).\tag{8.42}$$
Even though in this example there are only two independent variables, our analysis will apply to functions with an arbitrary number of arguments.
Applying the chain rule for each independent variable, we find
$$\begin{aligned}
\frac{\partial H}{\partial u} & =\frac{\partial F}{\partial x}\frac{\partial a}{\partial u}+\frac{\partial F}{\partial y}\frac{\partial b}{\partial u}+\frac{\partial F}{\partial z}\frac{\partial c}{\partial u}\qquad(8.43)\\
\frac{\partial H}{\partial v} & =\frac{\partial F}{\partial x}\frac{\partial a}{\partial v}+\frac{\partial F}{\partial y}\frac{\partial b}{\partial v}+\frac{\partial F}{\partial z}\frac{\partial c}{\partial v}.\qquad(8.44)
\end{aligned}$$
Let us convert these identities into a single indicial equation. In addition to denoting the independent variables of $F$ by $z^{i}$ and the functions $a$, $b$, and $c$ by $c^{i}$, denote the independent variables as $u^{1}$ and $u^{2}$, i.e.
$$H\left(u^{1},u^{2}\right)=F\left(c^{1}\left(u^{1},u^{2}\right),c^{2}\left(u^{1},u^{2}\right),c^{3}\left(u^{1},u^{2}\right)\right).\tag{8.45}$$
Collectively, we will refer to $u^{1}$ and $u^{2}$ as $u^{\alpha}$. The superscript $\alpha$ is taken from a different alphabet to highlight the fact that the number of independent variables $u^{\alpha}$ may be different from the number of the arguments of $F$. In terms of the new symbols, the above differential identities read
$$\begin{aligned}
\frac{\partial H}{\partial u^{1}} & =\frac{\partial F}{\partial z^{1}}\frac{\partial c^{1}}{\partial u^{1}}+\frac{\partial F}{\partial z^{2}}\frac{\partial c^{2}}{\partial u^{1}}+\frac{\partial F}{\partial z^{3}}\frac{\partial c^{3}}{\partial u^{1}}\qquad(8.46)\\
\frac{\partial H}{\partial u^{2}} & =\frac{\partial F}{\partial z^{1}}\frac{\partial c^{1}}{\partial u^{2}}+\frac{\partial F}{\partial z^{2}}\frac{\partial c^{2}}{\partial u^{2}}+\frac{\partial F}{\partial z^{3}}\frac{\partial c^{3}}{\partial u^{2}}.\qquad(8.47)
\end{aligned}$$
We can now "pack" these identities into the single indicial equation
$$\frac{\partial H}{\partial u^{\alpha}}=\frac{\partial F}{\partial z^{i}}\frac{\partial c^{i}}{\partial u^{\alpha}}.\tag{8.48}$$
As usual, the repeated index represents a summation while the free index enumerates independent equations. Note that the object $\partial c^{i}/\partial u^{\alpha}$ represents six partial derivatives: the derivative of each of the three functions $c^{i}\left(u^{1},u^{2}\right)$ with respect to each of the two variables $u^{\alpha}$. If the Latin index is treated as first and the Greek as second, the above identity can be captured in matrix form by the equation
$$\left[\begin{array}{cc}
\frac{\partial H}{\partial u^{1}} & \frac{\partial H}{\partial u^{2}}
\end{array}\right]=\left[\begin{array}{ccc}
\frac{\partial F}{\partial z^{1}} & \frac{\partial F}{\partial z^{2}} & \frac{\partial F}{\partial z^{3}}
\end{array}\right]\left[\begin{array}{cc}
\frac{\partial c^{1}}{\partial u^{1}} & \frac{\partial c^{1}}{\partial u^{2}}\\
\frac{\partial c^{2}}{\partial u^{1}} & \frac{\partial c^{2}}{\partial u^{2}}\\
\frac{\partial c^{3}}{\partial u^{1}} & \frac{\partial c^{3}}{\partial u^{2}}
\end{array}\right].\tag{8.49}$$
A note on terminology is in order. The phrase "differentiation with respect to $u^{\alpha}$" refers to the evaluation of the partial derivatives of a function with respect to each of the independent variables $u^{\alpha}$. However, from the point of view of the mechanics of differentiation, the derivatives are evaluated as if with respect to a single variable, such as $u^{1}$ or $u^{2}$. In other words, thanks to the tensor notation, the simultaneous nature of the operation does not increase the complexity of the analysis compared to the evaluation of a single derivative. As a matter of basic tensor proficiency, you should be able to go fluently from the equation
$$H\left(u^{1},u^{2}\right)=F\left(c^{1}\left(u^{1},u^{2}\right),c^{2}\left(u^{1},u^{2}\right),c^{3}\left(u^{1},u^{2}\right)\right)\tag{8.45}$$
defining $H\left(u^{1},u^{2}\right)$ in terms of $F$ and $c^{i}$ to the equation
$$\frac{\partial H}{\partial u^{\alpha}}=\frac{\partial F}{\partial z^{i}}\frac{\partial c^{i}}{\partial u^{\alpha}}\tag{8.48}$$
that gives its derivatives in terms of the derivatives of $F$ and $c^{i}$.
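The passage from the definition of $H\left(u^{1},u^{2}\right)$ to its derivatives can be rehearsed on a small example. In the Python sketch below, the choices $F=z^{1}z^{2}+z^{3}$, $c^{1}=u^{1}+u^{2}$, $c^{2}=u^{1}u^{2}$, $c^{3}=\left(u^{1}\right)^{2}$ and all function names are assumptions for illustration. The array `jac_c(u)[i][a]` stores $\partial c^{i}/\partial u^{\alpha}$ with the Latin index first and the Greek index second.

```python
def grad_F(z):
    """Partial derivatives dF/dz^i of F = z1*z2 + z3."""
    z1, z2, z3 = z
    return [z2, z1, 1]


def c(u):
    """The inner functions c^i(u^1, u^2)."""
    u1, u2 = u
    return [u1 + u2, u1 * u2, u1**2]


def jac_c(u):
    """The six derivatives dc^i/du^alpha: Latin index first, Greek index second."""
    u1, u2 = u
    return [[1, 1], [u2, u1], [2 * u1, 0]]


def grad_H(u):
    """Chain rule (8.48): one contraction on i for each value of the free index."""
    dF = grad_F(c(u))
    dc = jac_c(u)
    return [sum(dF[i] * dc[i][a] for i in range(3)) for a in range(2)]


def grad_H_direct(u):
    """Derivatives of H = u1**2*u2 + u1*u2**2 + u1**2, computed by hand."""
    u1, u2 = u
    return [2 * u1 * u2 + u2**2 + 2 * u1, u1**2 + 2 * u1 * u2]


print(grad_H([2, 3]), grad_H_direct([2, 3]))
```

The list comprehension over `a` reflects the role of the free index: the same contraction is simply repeated once for each independent variable.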
We will now turn our attention to the most general case of several functions of several variables.

8.5.3 The case of several functions of several independent variables

Finally, let us consider the general case of several functions $F^{p}$ of several variables $z^{i}$ composed with a matching number of functions $c^{i}$ of several variables $u^{\alpha}$. We have run out of alphabets, so we are going to use a letter, $p$, from a different part of the Latin alphabet to indicate that the number of functions $F^{p}$ may be different from the number of arguments $z^{i}$ in each function $F^{p}$ and from the number of independent variables $u^{\alpha}$. For the sake of concreteness, suppose that there are $4$ functions $F^{p}$, i.e. $F^{1}$, $F^{2}$, $F^{3}$, and $F^{4}$, and thus there are $4$ composite functions $H^{p}$, i.e.
$$\begin{aligned}
H^{1}\left(u^{1},u^{2}\right) & =F^{1}\left(c^{1}\left(u^{1},u^{2}\right),c^{2}\left(u^{1},u^{2}\right),c^{3}\left(u^{1},u^{2}\right)\right)\qquad(8.50)\\
H^{2}\left(u^{1},u^{2}\right) & =F^{2}\left(c^{1}\left(u^{1},u^{2}\right),c^{2}\left(u^{1},u^{2}\right),c^{3}\left(u^{1},u^{2}\right)\right)\qquad(8.51)\\
H^{3}\left(u^{1},u^{2}\right) & =F^{3}\left(c^{1}\left(u^{1},u^{2}\right),c^{2}\left(u^{1},u^{2}\right),c^{3}\left(u^{1},u^{2}\right)\right)\qquad(8.52)\\
H^{4}\left(u^{1},u^{2}\right) & =F^{4}\left(c^{1}\left(u^{1},u^{2}\right),c^{2}\left(u^{1},u^{2}\right),c^{3}\left(u^{1},u^{2}\right)\right).\qquad(8.53)
\end{aligned}$$
Pack these equations into a single one with the help of a live index $p$, i.e.
$$H^{p}\left(u^{1},u^{2}\right)=F^{p}\left(c^{1}\left(u^{1},u^{2}\right),c^{2}\left(u^{1},u^{2}\right),c^{3}\left(u^{1},u^{2}\right)\right).\tag{8.54}$$
Differentiating the combined equation with respect to u1u^{1} and u2u^{2} yields
Hpu1=Fpz1c1u1+Fpz2c2u1+Fpz3c3u1          (8.55)Hpu2=Fpz1c1u2+Fpz2c2u2+Fpz3c3u2.          (8.56)\begin{aligned}\frac{\partial H^{p}}{\partial u^{1}} & =\frac{\partial F^{p}}{\partial z^{1}}\frac{\partial c^{1}}{\partial u^{1}}+\frac{\partial F^{p}}{\partial z^{2}}\frac{\partial c^{2}}{\partial u^{1}}+\frac{\partial F^{p}}{\partial z^{3}}\frac{\partial c^{3}}{\partial u^{1}}\ \ \ \ \ \ \ \ \ \ \left(8.55\right)\\\frac{\partial H^{p}}{\partial u^{2}} & =\frac{\partial F^{p}}{\partial z^{1}}\frac{\partial c^{1}}{\partial u^{2}}+\frac{\partial F^{p}}{\partial z^{2}}\frac{\partial c^{2}}{\partial u^{2}}+\frac{\partial F^{p}}{\partial z^{3}}\frac{\partial c^{3}}{\partial u^{2}}.\ \ \ \ \ \ \ \ \ \ \left(8.56\right)\end{aligned}
These equations represent a total of 8=4×28=4\times2 identities, as each equation represents 44 identities corresponding to p=1p=1, 22, 33, and 44. Express the contractions on the right by using a dummy index ii, i.e.
Hpu1=Fpziciu1          (8.57)Hpu2=Fpziciu2,          (8.58)\begin{aligned}\frac{\partial H^{p}}{\partial u^{1}} & =\frac{\partial F^{p}}{\partial z^{i}}\frac{\partial c^{i}}{\partial u^{1}}\ \ \ \ \ \ \ \ \ \ \left(8.57\right)\\\frac{\partial H^{p}}{\partial u^{2}} & =\frac{\partial F^{p}}{\partial z^{i}}\frac{\partial c^{i}}{\partial u^{2}},\ \ \ \ \ \ \ \ \ \ \left(8.58\right)\end{aligned}
and, subsequently, combine the two equations into a single one by using a free index α\alpha , i.e.
Hpuα=Fpziciuα.(8.59)\frac{\partial H^{p}}{\partial u^{\alpha}}=\frac{\partial F^{p}}{\partial z^{i}}\frac{\partial c^{i}}{\partial u^{\alpha}}.\tag{8.59}
This single equation captures the $8$ partial derivatives $\partial H^{p}/\partial u^{\alpha}$ of the functions $H^{p}\left( u^{1},u^{2}\right)$ in terms of the partial derivatives of $F^{p}$ and $c^{i}$. The application of the chain rule to multivariate composite functions will be one of the most common operations going forward.
Finally, let us give the matrix form of this equation. If the superscript enumerating the functions is considered first and the subscript enumerating the variables second, then the corresponding equation in matrix form reads
$$\left[ \begin{array}{cc} \frac{\partial H^{1}}{\partial u^{1}} & \frac{\partial H^{1}}{\partial u^{2}}\\ \frac{\partial H^{2}}{\partial u^{1}} & \frac{\partial H^{2}}{\partial u^{2}}\\ \frac{\partial H^{3}}{\partial u^{1}} & \frac{\partial H^{3}}{\partial u^{2}}\\ \frac{\partial H^{4}}{\partial u^{1}} & \frac{\partial H^{4}}{\partial u^{2}} \end{array} \right] =\left[ \begin{array}{ccc} \frac{\partial F^{1}}{\partial z^{1}} & \frac{\partial F^{1}}{\partial z^{2}} & \frac{\partial F^{1}}{\partial z^{3}}\\ \frac{\partial F^{2}}{\partial z^{1}} & \frac{\partial F^{2}}{\partial z^{2}} & \frac{\partial F^{2}}{\partial z^{3}}\\ \frac{\partial F^{3}}{\partial z^{1}} & \frac{\partial F^{3}}{\partial z^{2}} & \frac{\partial F^{3}}{\partial z^{3}}\\ \frac{\partial F^{4}}{\partial z^{1}} & \frac{\partial F^{4}}{\partial z^{2}} & \frac{\partial F^{4}}{\partial z^{3}} \end{array} \right] \left[ \begin{array}{cc} \frac{\partial c^{1}}{\partial u^{1}} & \frac{\partial c^{1}}{\partial u^{2}}\\ \frac{\partial c^{2}}{\partial u^{1}} & \frac{\partial c^{2}}{\partial u^{2}}\\ \frac{\partial c^{3}}{\partial u^{1}} & \frac{\partial c^{3}}{\partial u^{2}} \end{array} \right] .\tag{8.60}$$
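The matrix form of the chain rule can be checked numerically. The following is a minimal sketch rather than part of the original development: the particular functions `c` and `F` are hypothetical choices made only for illustration, and the derivatives are approximated by central differences. It compares the $4\times2$ matrix of derivatives of $H^{p}$ with the product of the $4\times3$ and $3\times2$ matrices on the right.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def c(u):
    """Three inner functions c^i(u^1, u^2)."""
    u1, u2 = u
    return [u1 * u2, math.sin(u1) + u2, u1 - u2 ** 2]

def F(z):
    """Four outer functions F^p(z^1, z^2, z^3)."""
    z1, z2, z3 = z
    return [z1 + z2 * z3, z1 * z1, math.exp(z2), z1 * z2 * z3]

def H(u):
    """Composite functions H^p(u) = F^p(c(u))."""
    return F(c(u))

def jacobian(f, x, h=1e-6):
    """Central-difference approximation of J[p][a] = d f^p / d x^a."""
    rows = len(f(x))
    J = [[0.0] * len(x) for _ in range(rows)]
    for a in range(len(x)):
        xp, xm = list(x), list(x)
        xp[a] += h
        xm[a] -= h
        fp, fm = f(xp), f(xm)
        for p in range(rows):
            J[p][a] = (fp[p] - fm[p]) / (2 * h)
    return J

def matmul(A, B):
    """Plain matrix product: the contraction on the shared index."""
    return [[sum(A[p][i] * B[i][a] for i in range(len(B)))
             for a in range(len(B[0]))] for p in range(len(A))]

u0 = [0.7, -0.3]
lhs = jacobian(H, u0)                               # 4 x 2
rhs = matmul(jacobian(F, c(u0)), jacobian(c, u0))   # (4 x 3)(3 x 2)
max_err = max(abs(lhs[p][a] - rhs[p][a])
              for p in range(4) for a in range(2))
print(max_err)  # expected to be tiny, limited by finite-difference error
```

Note that the derivatives of `F` are evaluated at `c(u0)`, which mirrors the convention that the partial derivatives of $F^{p}$ are taken at $z^{i}=c^{i}\left( u\right)$.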
The natural ability of the tensor notation to handle the chain rule pays immediate dividends in the analysis of inverse functions, to which we now turn.
As we have already mentioned on a number of occasions, the tensor property describes how a system transforms under a change of coordinates. A change of coordinates is, in turn, specified by two sets of inverse functions. In this Section, we will derive the relationship between the partial derivatives of those sets of functions. This exercise will serve the dual purpose of preparing us for future analyses of coordinate changes as well as increasing our fluency with the tensor notation.

8.6.1 Coordinate transformations as inverse functions

To describe a change of coordinates in an $n$-dimensional space requires a set of $n$ functions of $n$ variables. Indeed, we must specify how each of the $n$ new coordinates is obtained from the $n$ old coordinates. For example, the functions that describe the transformation from Cartesian coordinates to polar coordinates are
$$\begin{aligned}r\left( x,y\right) & =\sqrt{x^{2}+y^{2}}\ \ \ \ \ \ \ \ \ \ \left(8.61\right)\\\theta\left( x,y\right) & =\arctan\left( x,y\right)\ \ \ \ \ \ \ \ \ \ \left(8.62\right)\end{aligned}$$
Naturally, the inverse coordinate transformation -- from the new coordinates back to the old coordinates -- is also described by a set of $n$ functions of $n$ variables. For the same coordinate transformation, the functions that describe the inverse transformation are
$$\begin{aligned}x\left( r,\theta\right) & =r\cos\theta\ \ \ \ \ \ \ \ \ \ \left(8.63\right)\\y\left( r,\theta\right) & =r\sin\theta.\ \ \ \ \ \ \ \ \ \ \left(8.64\right)\end{aligned}$$
Thus, we have two sets of functions, i.e. those that translate from old coordinates to new and those that translate from new to old. By definition, the two sets are function inverses of each other. In other words, if one set maps the values $a^{1},\cdots,a^{n}$ to $b^{1},\cdots,b^{n}$, then the other sends $b^{1},\cdots,b^{n}$ back to $a^{1},\cdots,a^{n}$.
Remaining in two dimensions for now, denote the functions that translate the old coordinates to the new coordinates by $F^{1}\left( u,v\right)$ and $F^{2}\left( u,v\right)$ or, collectively, $F^{i}\left( u,v\right)$. Denote the functions that translate in the opposite direction by $G^{1}\left( u,v\right)$ and $G^{2}\left( u,v\right)$ or, collectively, $G^{i}\left( u,v\right)$. For the transformation between Cartesian and polar coordinates, we have
$$\begin{array}{llll} F^{1}\left( u,v\right) =\sqrt{u^{2}+v^{2}} & & & \\ F^{2}\left( u,v\right) =\arctan\left( u,v\right) & & & \end{array} \text{and} \begin{array}{llll} & & & G^{1}\left( u,v\right) =u\cos v\\ & & & G^{2}\left( u,v\right) =u\sin v. \end{array}\tag{8.65}$$
Let us confirm that these sets of functions are indeed the inverses of each other by evaluating the composite functions
$$G^{1}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right) \text{ and }G^{2}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right) .\tag{8.66}$$
We expect $G^{1}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right)$ to recover $u$ and $G^{2}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right)$ to recover $v$. Substituting $F^{1}\left( u,v\right)$ and $F^{2}\left( u,v\right)$ for the arguments of $G^{1}\left( u,v\right)$ and $G^{2}\left( u,v\right)$, we find
$$\begin{aligned}G^{1}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right) & =\sqrt{u^{2}+v^{2}}\cos\arctan\left( u,v\right)\ \ \ \ \ \ \ \ \ \ \left(8.67\right)\\G^{2}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right) & =\sqrt{u^{2}+v^{2}}\sin\arctan\left( u,v\right) .\ \ \ \ \ \ \ \ \ \ \left(8.68\right)\end{aligned}$$
It is left as an exercise to show that
$$\begin{aligned}\cos\arctan\left( u,v\right) & =\frac{u}{\sqrt{u^{2}+v^{2}}}\text{ and}\ \ \ \ \ \ \ \ \ \ \left(8.69\right)\\\sin\arctan\left( u,v\right) & =\frac{v}{\sqrt{u^{2}+v^{2}}},\ \ \ \ \ \ \ \ \ \ \left(8.70\right)\end{aligned}$$
and therefore
$$\begin{aligned}G^{1}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right) & =\sqrt{u^{2}+v^{2}}\frac{u}{\sqrt{u^{2}+v^{2}}}=u\ \ \ \ \ \ \ \ \ \ \left(8.71\right)\\G^{2}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right) & =\sqrt{u^{2}+v^{2}}\frac{v}{\sqrt{u^{2}+v^{2}}}=v,\ \ \ \ \ \ \ \ \ \ \left(8.72\right)\end{aligned}$$
as we set out to show. It is left as an exercise to show that composing the functions in the opposite way yields the same result, i.e.
$$\begin{aligned}F^{1}\left( G^{1}\left( u,v\right) ,G^{2}\left( u,v\right) \right) & =u\ \ \ \ \ \ \ \ \ \ \left(8.73\right)\\F^{2}\left( G^{1}\left( u,v\right) ,G^{2}\left( u,v\right) \right) & =v.\ \ \ \ \ \ \ \ \ \ \left(8.74\right)\end{aligned}$$
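Both compositions can also be spot-checked numerically. The sketch below is illustrative only, and it rests on one assumption that should be stated loudly: the two-argument $\arctan\left( u,v\right)$ is taken to be the angle of the point $\left( u,v\right)$, which corresponds to `math.atan2(v, u)` in Python (note the swapped argument order). The sample points are arbitrary.

```python
import math

def F(u, v):
    """Cartesian -> polar, returning (r, theta).

    Assumption: arctan(u, v) in the text means the angle of the
    point (u, v), i.e. math.atan2(v, u).
    """
    return math.hypot(u, v), math.atan2(v, u)

def G(u, v):
    """Polar -> Cartesian, returning (x, y)."""
    return u * math.cos(v), u * math.sin(v)

# G(F(x, y)) should recover the Cartesian point.
x, y = -1.5, 2.0
x_back, y_back = G(*F(x, y))

# F(G(r, theta)) should recover the polar point.
r, theta = 2.0, 0.75
r_back, theta_back = F(*G(r, theta))

print(x_back, y_back)
print(r_back, theta_back)
```

With this reading of the arctangent, the second composition recovers $\theta$ only when $r$ is positive and $\theta$ lies in the range returned by `atan2`, i.e. between $-\pi$ and $\pi$.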
In $n$ dimensions, the collection of the partial derivatives of the functions $F^{i}$ with respect to each of their independent variables is a second-order system with $n^{2}$ elements. The same is true of the partial derivatives of $G^{i}$ with respect to each of their independent variables. The two systems are connected by an extraordinarily elegant relationship which we are about to derive. We will discover that the tensor notation, and its particular effectiveness in expressing the chain rule, truly shines in this application.

8.6.2 Functions of one variable

Let us start our discussion with functions of one variable. Consider two ordinary functions $F\left( x\right)$ and $G\left( x\right)$ that are the inverses of each other, i.e. if $F\left( x\right)$ maps the number $a$ to the number $b$, then $G\left( x\right)$ maps $b$ back to $a$. In other words, if
$$F\left( a\right) =b\tag{8.75}$$
then
$$G\left( b\right) =a.\tag{8.76}$$
Here are a few examples of inverse functions that we may use to confirm the derived relationships:
$$\begin{array}{lllll} F\left( x\right) =x^{2} & & \text{and} & & G\left( x\right) =\sqrt{x}\\ F\left( x\right) =e^{x} & & \text{and} & & G\left( x\right) =\ln x\\ F\left( x\right) =\sin x & & \text{and} & & G\left( x\right) =\arcsin x. \end{array}\tag{8.77}$$
Let us also document their derivatives:
$$\begin{array}{lllll} F^{\prime}\left( x\right) =2x & & \text{and} & & G^{\prime}\left( x\right) =\frac{1}{2\sqrt{x}}\\ F^{\prime}\left( x\right) =e^{x} & & \text{and} & & G^{\prime}\left( x\right) =\frac{1}{x}\\ F^{\prime}\left( x\right) =\cos x & & \text{and} & & G^{\prime}\left( x\right) =\frac{1}{\sqrt{1-x^{2}}}. \end{array}\tag{8.78}$$
While, on the face of it, the derivatives $F^{\prime}\left( x\right)$ and $G^{\prime}\left( x\right)$ do not appear to be related in an obvious way, you may remember from ordinary Calculus that they are, in a certain sense, the algebraic reciprocals of each other. Of course, they are not reciprocals in the sense that
$$F^{\prime}\left( x\right) =\frac{1}{G^{\prime}\left( x\right) },\tag{8.79}$$
which can be immediately seen from the examples above. Instead, the proper relationship can be described like this: if $b$ is the image of $a$ under $F$, i.e.
$$F\left( a\right) =b\tag{8.75}$$
or, equivalently,
$$G\left( b\right) =a,\tag{8.76}$$
then
$$F^{\prime}\left( a\right) =\frac{1}{G^{\prime}\left( b\right) }.\tag{8.80}$$
In other words, the derivatives are the reciprocals of each other at the appropriate values of their arguments -- namely $G^{\prime}$ must be evaluated at $F\left( a\right)$, i.e. the image of $a$, rather than $a$ itself. It is left as an exercise for the reader to remind themselves why this relationship makes perfect sense when one considers the relationship between the graphs of the inverse functions $F\left( x\right)$ and $G\left( x\right)$.
Let us confirm this relationship for one of the examples above. Suppose that $F\left( x\right) =e^{x}$, $G\left( x\right) =\ln x$, and thus $b=e^{a}$. Since $F^{\prime}\left( x\right) =e^{x}$ and $G^{\prime}\left( x\right) =1/x$, we have
$$\begin{aligned}F^{\prime}\left( a\right) & =e^{a}\ \ \ \ \ \ \ \ \ \ \left(8.81\right)\\G^{\prime}\left( b\right) & =\frac{1}{b}.\ \ \ \ \ \ \ \ \ \ \left(8.82\right)\end{aligned}$$
Since $b=e^{a}$, we have
$$G^{\prime}\left( b\right) =\frac{1}{b}=\frac{1}{e^{a}}\tag{8.83}$$
and therefore the relationship
$$F^{\prime}\left( a\right) =\frac{1}{G^{\prime}\left( b\right) }\tag{8.80}$$
indeed holds since both sides equal $e^{a}$.
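The same check can be run for all three example pairs at once. This is a small sketch under stated assumptions: the sample points `a` are arbitrary values picked inside the range where each $G$ actually inverts $F$ (for instance, a positive `a` for the square and square-root pair), and the dictionary layout is just a convenience for the illustration.

```python
import math

# Each entry: (F, F', G', sample point a). The sample points are
# arbitrary choices within the range where G inverts F.
pairs = {
    "square/sqrt": (lambda x: x * x, lambda x: 2 * x,
                    lambda x: 1 / (2 * math.sqrt(x)), 1.5),
    "exp/ln": (math.exp, math.exp, lambda x: 1 / x, 0.4),
    "sin/arcsin": (math.sin, math.cos,
                   lambda x: 1 / math.sqrt(1 - x * x), 0.6),
}

results = {}
for name, (F, dF, dG, a) in pairs.items():
    b = F(a)  # the image of a under F
    # Store F'(a), 1/G'(b) (correct point), and 1/G'(a) (wrong point).
    results[name] = (dF(a), 1 / dG(b), 1 / dG(a))

for name, (lhs, at_image, at_a) in results.items():
    print(name, lhs, at_image, at_a)
```

In each row the first two numbers should agree, while the third, obtained by evaluating $G^{\prime}$ at $a$ rather than at $F\left( a\right)$, generally does not.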
Let us now present a derivation of the relationship
$$F^{\prime}\left( a\right) =\frac{1}{G^{\prime}\left( b\right) },\tag{8.80}$$
which will serve as a blueprint for deriving the analogous relationship in the multidimensional case. Most differential relationships are derived by forming an identity with respect to the independent variables and subsequently evaluating the derivative of both sides with respect to each variable. In this case of inverse functions of one variable, the identity reads
$$G\left( F\left( x\right) \right) =x.\tag{8.84}$$
In words, it states that applying $G$ to $F\left( x\right)$ recovers the value of $x$. An application of the chain rule yields
$$G^{\prime}\left( F\left( x\right) \right) F^{\prime}\left( x\right) =1,\tag{8.85}$$
from which we have
$$F^{\prime}\left( x\right) =\frac{1}{G^{\prime}\left( F\left( x\right) \right) },\tag{8.86}$$
which is precisely the relationship we set out to derive.
We will now turn our attention to inverse pairs of functions of two variables and, subsequently, to the general case of inverse sets of $n$ functions of $n$ variables.

8.6.3 Inverse sets of two functions of two variables

Denote one pair of functions by $F^{1}\left( u,v\right)$ and $F^{2}\left( u,v\right)$ and the inverse pair by $G^{1}\left( U,V\right)$ and $G^{2}\left( U,V\right)$. For the sake of greater clarity, we will use the letters $u$ and $v$ for the arguments of $F^{i}$, and $U$ and $V$ for the arguments of $G^{i}$. By definition, the two sets of functions are the inverses of each other if
$$\begin{aligned}G^{1}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right) & =u\ \ \ \ \ \ \ \ \ \ \left(8.87\right)\\G^{2}\left( F^{1}\left( u,v\right) ,F^{2}\left( u,v\right) \right) & =v.\ \ \ \ \ \ \ \ \ \ \left(8.88\right)\end{aligned}$$
In other words, if $F^{1}$ and $F^{2}$ send $u$ and $v$ to $U$ and $V$, then $G^{1}$ and $G^{2}$ send $U$ and $V$ back to $u$ and $v$. For example, as we showed above, the functions describing the coordinate transformation from Cartesian to polar coordinates -- or between any two coordinate systems, for that matter -- represent such sets of functions.
Differentiate each one of the above identities with respect to $u$ and $v$. Applying the chain rule to the first identity yields
$$\begin{aligned}\frac{\partial G^{1}}{\partial U}\frac{\partial F^{1}}{\partial u}+\frac{\partial G^{1}}{\partial V}\frac{\partial F^{2}}{\partial u} & =1\ \ \ \ \ \ \ \ \ \ \left(8.89\right)\\\frac{\partial G^{1}}{\partial U}\frac{\partial F^{1}}{\partial v}+\frac{\partial G^{1}}{\partial V}\frac{\partial F^{2}}{\partial v} & =0.\ \ \ \ \ \ \ \ \ \ \left(8.90\right)\end{aligned}$$
Doing the same for the second identity yields
$$\begin{aligned}\frac{\partial G^{2}}{\partial U}\frac{\partial F^{1}}{\partial u}+\frac{\partial G^{2}}{\partial V}\frac{\partial F^{2}}{\partial u} & =0\ \ \ \ \ \ \ \ \ \ \left(8.91\right)\\\frac{\partial G^{2}}{\partial U}\frac{\partial F^{1}}{\partial v}+\frac{\partial G^{2}}{\partial V}\frac{\partial F^{2}}{\partial v} & =1.\ \ \ \ \ \ \ \ \ \ \left(8.92\right)\end{aligned}$$
As we did earlier in the Chapter, we omitted the arguments of the functions for the sake of conciseness. A more detailed version of, say, the first identity would read
$$\begin{aligned}\left. \frac{\partial G^{1}\left( U,V\right) }{\partial U}\right\vert _{U=F^{1}\left( u,v\right) ,V=F^{2}\left( u,v\right) }\frac{\partial F^{1}\left( u,v\right) }{\partial u} &\ \ \ \ \ \ \ \ \ \ \left(8.93\right)\\\ +\left. \frac{\partial G^{1}\left( U,V\right) }{\partial V}\right\vert _{U=F^{1}\left( u,v\right) ,V=F^{2}\left( u,v\right) } & \frac{\partial F^{2}\left( u,v\right) }{\partial u}=1.\ \ \ \ \ \ \ \ \ \ \left(8.94\right)\end{aligned}$$
However, this level of detail obscures the overall structure of the expressions. We will, therefore, continue to use the concise form, but we must remember that the derivatives of $G^{1}$ and $G^{2}$ are to be evaluated at $U=F^{1}\left( u,v\right)$ and $V=F^{2}\left( u,v\right)$.
When we organize the partial derivatives into matrices
$$\left[ \begin{array}{cc} \frac{\partial G^{1}}{\partial U} & \frac{\partial G^{1}}{\partial V}\\ \frac{\partial G^{2}}{\partial U} & \frac{\partial G^{2}}{\partial V} \end{array} \right] \text{ and }\left[ \begin{array}{cc} \frac{\partial F^{1}}{\partial u} & \frac{\partial F^{1}}{\partial v}\\ \frac{\partial F^{2}}{\partial u} & \frac{\partial F^{2}}{\partial v} \end{array} \right] ,\tag{8.95}$$
we observe that the four equations above are captured by the identity
$$\left[ \begin{array}{cc} \frac{\partial G^{1}}{\partial U} & \frac{\partial G^{1}}{\partial V}\\ \frac{\partial G^{2}}{\partial U} & \frac{\partial G^{2}}{\partial V} \end{array} \right] \left[ \begin{array}{cc} \frac{\partial F^{1}}{\partial u} & \frac{\partial F^{1}}{\partial v}\\ \frac{\partial F^{2}}{\partial u} & \frac{\partial F^{2}}{\partial v} \end{array} \right] =\left[ \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right] .\tag{8.96}$$
Thus, the two matrices representing the partial derivatives of the functions $F^{i}$ and $G^{i}$ are the inverses of each other. This is our central conclusion, and it is a direct generalization of the one-dimensional identity
$$F^{\prime}\left( x\right) =\frac{1}{G^{\prime}\left( F\left( x\right) \right) }.\tag{8.86}$$
Let us confirm that the newly discovered relationship holds for the transformation between Cartesian and polar coordinates. Recall that the two coordinate systems are related by the equations
$$\begin{array}{llll} r=\sqrt{x^{2}+y^{2}} & & & \\ \theta=\arctan\left( x,y\right) & & & \end{array} \text{and} \begin{array}{llll} & & & x=r\cos\theta\\ & & & y=r\sin\theta. \end{array}\tag{8.97}$$
Thus, using the letters $x$ and $y$, and $r$ and $\theta$ for the independent variables, the functions $F^{i}$ and $G^{i}$ are given by
$$\begin{array}{lll} F^{1}\left( x,y\right) =\sqrt{x^{2}+y^{2}} & & \\ F^{2}\left( x,y\right) =\arctan\left( x,y\right) & & \end{array} \text{and} \begin{array}{lll} & & G^{1}\left( r,\theta\right) =r\cos\theta\\ & & G^{2}\left( r,\theta\right) =r\sin\theta. \end{array}\tag{8.65}$$
Evaluating their derivatives, we find
$$\left[ \begin{array}{cc} \frac{\partial F^{1}}{\partial x} & \frac{\partial F^{1}}{\partial y}\\ \frac{\partial F^{2}}{\partial x} & \frac{\partial F^{2}}{\partial y} \end{array} \right] =\left[ \begin{array}{rr} \frac{x}{\sqrt{x^{2}+y^{2}}} & \frac{y}{\sqrt{x^{2}+y^{2}}}\\ -\frac{y}{x^{2}+y^{2}} & \frac{x}{x^{2}+y^{2}} \end{array} \right]\tag{8.98}$$
and
$$\left[ \begin{array}{cc} \frac{\partial G^{1}}{\partial r} & \frac{\partial G^{1}}{\partial\theta}\\ \frac{\partial G^{2}}{\partial r} & \frac{\partial G^{2}}{\partial\theta} \end{array} \right] =\left[ \begin{array}{rr} \cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta \end{array} \right] .\tag{8.99}$$
Once again, keep in mind that the partial derivatives of $G^{i}$ are to be evaluated at
$$\begin{aligned}r & =F^{1}\left( x,y\right) =\sqrt{x^{2}+y^{2}}\text{ and}\ \ \ \ \ \ \ \ \ \ \left(8.100\right)\\\theta & =F^{2}\left( x,y\right) =\arctan\left( x,y\right) .\ \ \ \ \ \ \ \ \ \ \left(8.101\right)\end{aligned}$$
Performing this substitution, we find
$$\left[ \begin{array}{cc} \frac{\partial G^{1}}{\partial r} & \frac{\partial G^{1}}{\partial\theta}\\ \frac{\partial G^{2}}{\partial r} & \frac{\partial G^{2}}{\partial\theta} \end{array} \right] _{r=\sqrt{x^{2}+y^{2}},\ \theta=\arctan\left( x,y\right) }=\left[ \begin{array}{rr} \frac{x}{\sqrt{x^{2}+y^{2}}} & -y\\ \frac{y}{\sqrt{x^{2}+y^{2}}} & x \end{array} \right] .\tag{8.102}$$
Then, multiplying the two matrices yields
$$\left[ \begin{array}{rr} \frac{x}{\sqrt{x^{2}+y^{2}}} & -y\\ \frac{y}{\sqrt{x^{2}+y^{2}}} & x \end{array} \right] \left[ \begin{array}{cc} \frac{x}{\sqrt{x^{2}+y^{2}}} & \frac{y}{\sqrt{x^{2}+y^{2}}}\\ -\frac{y}{x^{2}+y^{2}} & \frac{x}{x^{2}+y^{2}} \end{array} \right] =\left[ \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right] ,\tag{8.103}$$
which confirms the general identity
$$\left[ \begin{array}{cc} \frac{\partial G^{1}}{\partial U} & \frac{\partial G^{1}}{\partial V}\\ \frac{\partial G^{2}}{\partial U} & \frac{\partial G^{2}}{\partial V} \end{array} \right] \left[ \begin{array}{cc} \frac{\partial F^{1}}{\partial u} & \frac{\partial F^{1}}{\partial v}\\ \frac{\partial F^{2}}{\partial u} & \frac{\partial F^{2}}{\partial v} \end{array} \right] =\left[ \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right] .\tag{8.104}$$
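For a concrete point, the product of the two matrices can be evaluated in a few lines. The sketch below simply transcribes the matrices (8.98) and (8.99); the sample point $\left( x,y\right) =\left( 3,4\right)$ is an arbitrary choice, and the angle is computed with `math.atan2(y, x)` on the assumption that this matches the two-argument arctangent used in the text.

```python
import math

def jac_F(x, y):
    """Matrix (8.98): derivatives of r and theta with respect to x and y."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def jac_G(r, theta):
    """Matrix (8.99): derivatives of x and y with respect to r and theta."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def matmul2(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Arbitrary sample point; the derivatives of G are evaluated at its image.
x, y = 3.0, 4.0
r, theta = math.hypot(x, y), math.atan2(y, x)
product = matmul2(jac_G(r, theta), jac_F(x, y))
print(product)  # expected to be the identity up to rounding
```

The essential detail is that `jac_G` receives the image of $\left( x,y\right)$ under $F^{i}$, not the point $\left( x,y\right)$ itself.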

8.6.4 Inverse sets of $n$ functions of $n$ variables

Finally, let us analyze the general $n$-dimensional case by using the tensor notation from start to finish. Suppose that the sets of functions $F^{i}\left( u^{1},\cdots,u^{n}\right)$ and $G^{i}\left( U^{1},\cdots,U^{n}\right)$ are the inverses of each other. In the tensor notation, this relationship is captured by the single identity
$$G^{i}\left( F^{1}\left( u^{1},\cdots,u^{n}\right) ,\cdots,F^{n}\left( u^{1},\cdots,u^{n}\right) \right) =u^{i}.\tag{8.105}$$
Differentiate this identity with respect to $u^{j}$. In effect, we are simultaneously differentiating each of the $n$ equations represented by the above identity with respect to each independent variable. We find
$$\frac{\partial G^{i}\left( F^{1}\left( u^{1},\cdots,u^{n}\right) ,\cdots,F^{n}\left( u^{1},\cdots,u^{n}\right) \right) }{\partial u^{j}}=\frac{\partial u^{i}}{\partial u^{j}}.\tag{8.106}$$
As we described in Section 7.3, the expression on the right is captured by the Kronecker delta symbol $\delta_{j}^{i}$, i.e.
$$\frac{\partial u^{i}}{\partial u^{j}}=\delta_{j}^{i}.\tag{8.107}$$
Thus, an application of the chain rule yields
$$\frac{\partial G^{i}}{\partial U^{k}}\frac{\partial F^{k}}{\partial u^{j}}=\delta_{j}^{i},\tag{8.108}$$
where we once again dropped all of the functional arguments for the sake of conciseness. That is all there is to it -- this is the central identity that we set out to establish. The fact that we achieved it in one effortless step is a tribute to the effectiveness of the tensor notation.
As we have done previously, arrange the partial derivatives into the $n\times n$ matrices
$$\left[ \begin{array}{ccc} \frac{\partial G^{1}}{\partial U^{1}} & \cdots & \frac{\partial G^{1}}{\partial U^{n}}\\ \vdots & \ddots & \vdots\\ \frac{\partial G^{n}}{\partial U^{1}} & \cdots & \frac{\partial G^{n}}{\partial U^{n}} \end{array} \right] \text{ and }\left[ \begin{array}{ccc} \frac{\partial F^{1}}{\partial u^{1}} & \cdots & \frac{\partial F^{1}}{\partial u^{n}}\\ \vdots & \ddots & \vdots\\ \frac{\partial F^{n}}{\partial u^{1}} & \cdots & \frac{\partial F^{n}}{\partial u^{n}} \end{array} \right] .\tag{8.109}$$
In Chapter 13, we will refer to these matrices as the Jacobians $J$ and $J^{\prime}$ of the coordinate transformation. In terms of these matrices, the tensor identity
$$\frac{\partial G^{i}}{\partial U^{k}}\frac{\partial F^{k}}{\partial u^{j}}=\delta_{j}^{i}\tag{8.108}$$
reads
$$\left[ \begin{array}{ccc} \frac{\partial G^{1}}{\partial U^{1}} & \cdots & \frac{\partial G^{1}}{\partial U^{n}}\\ \vdots & \ddots & \vdots\\ \frac{\partial G^{n}}{\partial U^{1}} & \cdots & \frac{\partial G^{n}}{\partial U^{n}} \end{array} \right] \left[ \begin{array}{ccc} \frac{\partial F^{1}}{\partial u^{1}} & \cdots & \frac{\partial F^{1}}{\partial u^{n}}\\ \vdots & \ddots & \vdots\\ \frac{\partial F^{n}}{\partial u^{1}} & \cdots & \frac{\partial F^{n}}{\partial u^{n}} \end{array} \right] =\left[ \begin{array}{ccc} 1 & & \\ & \ddots & \\ & & 1 \end{array} \right] .\tag{8.110}$$
Thus, we have proven the general result that the matrices of partial derivatives of inverse sets of functions are the inverses of each other. As we have already mentioned, this fundamental fact plays a critical role in the construction of the tensor framework. More importantly for our present purposes, the calculation demonstrated the great effectiveness of the tensor notation.
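To see the general result at work beyond two dimensions, here is a rough numerical sketch for $n=3$. Everything specific in it is an assumption made for illustration: the inverse pair `F` and `G` is a hypothetical polynomial example that does not come from the text, the sample point is arbitrary, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
def F(u):
    """Hypothetical invertible map F^i(u^1, u^2, u^3)."""
    u1, u2, u3 = u
    return [u1, u2 + u1 ** 2, u3 + u1 * u2]

def G(U):
    """Inverse map G^i(U^1, U^2, U^3), so that G(F(u)) = u."""
    U1, U2, U3 = U
    u2 = U2 - U1 ** 2
    return [U1, u2, U3 - U1 * u2]

def jacobian(f, x, h=1e-6):
    """Central-difference approximation of J[i][j] = d f^i / d x^j."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

u0 = [0.5, -1.0, 2.0]
n = len(u0)
JG = jacobian(G, F(u0))  # dG^i/dU^k, evaluated at U = F(u0)
JF = jacobian(F, u0)     # dF^k/du^j, evaluated at u0

# The contraction on k should reproduce the Kronecker delta.
delta = [[sum(JG[i][k] * JF[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
max_dev = max(abs(delta[i][j] - (1.0 if i == j else 0.0))
              for i in range(n) for j in range(n))
print(max_dev)  # expected to be small, limited by finite-difference error
```

The nested sum over `k` is a literal transcription of the contraction in the tensor identity, with `i` and `j` playing the role of the two free indices.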
Quadratic form minimization is not particularly relevant to the goals of this book. However, it is an essential problem in Applied Mathematics and, additionally, our discussion will serve as an excellent illustration of one important feature of the tensor notation: its ability to access the individual elements of systems.
Suppose that $x$ is a vector in the sense of $\mathbb{R}^{n}$ described in Section 2.7. Denote the entries of $x$ by $x^{1},\cdots,x^{n}$. Quadratic form minimization is the task of finding the minimum of the function
$$F\left( x^{1},\cdots,x^{n}\right) =\frac{1}{2}x^{T}Ax-x^{T}b\tag{8.111}$$
where $A$ is a symmetric positive definite matrix and $b$ is an arbitrary vector.
Finding the extremal values of a function is a classical problem in ordinary Calculus. If you recall, the extremal values occur at those points where all $n$ partial derivatives of $F\left( x^{1},\cdots,x^{n}\right)$ are equal to zero. However, the above form is not conducive to differentiation since the latter requires access to the individual entries of $x$. Thus, the tensor notation is far better suited for this calculation.
Let us write the expression for $F\left( x^{1},\cdots,x^{n}\right)$ in the tensor form
$$F\left( x^{1},\cdots,x^{n}\right) =\frac{1}{2}A_{ij}x^{i}x^{j}-x^{i}b_{i}.\tag{8.112}$$
Since the indices $i$ and $j$ are being used for the contractions, we are unable to differentiate the above identity with respect to $x^{i}$ or $x^{j}$. Indeed, the expressions
$$\frac{\partial\left( \frac{1}{2}A_{ij}x^{i}x^{j}\right) }{\partial x^{i}}\text{ \ \ \ and \ \ \ }\frac{\partial\left( x^{i}b_{i}\right) }{\partial x^{i}}\tag{8.113}$$
are invalid. Thus, instead, we will differentiate the above identity with respect to $x^{k}$, i.e.
$$\frac{\partial F\left( x^{1},\cdots,x^{n}\right) }{\partial x^{k}}=\frac{\partial}{\partial x^{k}}\left( \frac{1}{2}A_{ij}x^{i}x^{j}-x^{i}b_{i}\right) .\tag{8.114}$$
As we discussed in Section 7.8.3, the product rule applies to contractions as if they were simple products. Therefore, we have
$$\frac{\partial F}{\partial x^{k}}=\frac{1}{2}A_{ij}\frac{\partial x^{i}}{\partial x^{k}}x^{j}+\frac{1}{2}A_{ij}x^{i}\frac{\partial x^{j}}{\partial x^{k}}-\frac{\partial x^{i}}{\partial x^{k}}b_{i}.\tag{8.115}$$
As we discussed in Section 7.3, the derivatives $\partial x^{i}/\partial x^{k}$ and $\partial x^{j}/\partial x^{k}$ are perfectly captured by the Kronecker delta symbol, i.e.
$$\begin{aligned}\frac{\partial x^{i}}{\partial x^{k}} & =\delta_{k}^{i}\ \ \ \ \ \ \ \ \ \ \left(8.116\right)\\\frac{\partial x^{j}}{\partial x^{k}} & =\delta_{k}^{j}.\ \ \ \ \ \ \ \ \ \ \left(8.117\right)\end{aligned}$$
Thus,
$$\frac{\partial F}{\partial x^{k}}=\frac{1}{2}A_{ij}\delta_{k}^{i}x^{j}+\frac{1}{2}A_{ij}x^{i}\delta_{k}^{j}-\delta_{k}^{i}b_{i}.\tag{8.118}$$
Since $A_{ij}\delta_{k}^{i}=A_{kj}$, $A_{ij}\delta_{k}^{j}=A_{ik}$, and $\delta_{k}^{i}b_{i}=b_{k}$, we have
$$\frac{\partial F}{\partial x^{k}}=\frac{1}{2}A_{kj}x^{j}+\frac{1}{2}A_{ik}x^{i}-b_{k}.\tag{8.119}$$
In order to collect like terms, the independent variables must be enumerated by the same index. This can be achieved by renaming the repeated index $j$ into $i$ in the first term, i.e.
$$\frac{\partial F}{\partial x^{k}}=\frac{1}{2}A_{ki}x^{i}+\frac{1}{2}A_{ik}x^{i}-b_{k}.\tag{8.120}$$
Next, factor out $x^{i}$ and switch the order of the terms inside the parentheses, i.e.
$$\frac{\partial F}{\partial x^{k}}=\frac{1}{2}\left( A_{ik}+A_{ki}\right) x^{i}-b_{k}.\tag{8.121}$$
Equating the partial derivative $\partial F/\partial x^{k}$ to zero, we arrive at the linear system that determines the critical values of the independent variables $x^{i}$:
$$\frac{1}{2}\left( A_{ki}+A_{ik}\right) x^{i}=b_{k}.\tag{8.122}$$
If you prefer to use the indices $i$ and $j$ in the final equation, you may rewrite this equation as
$$\frac{1}{2}\left( A_{ij}+A_{ji}\right) x^{j}=b_{i}.\tag{8.123}$$
For a symmetric system $A_{ij}$, i.e. $A_{ij}=A_{ji}$, the above equation reads
$$A_{ij}x^{j}=b_{i}.\tag{8.124}$$
This is the classical conclusion for the problem of quadratic form minimization.
For, perhaps, one final time, let us unpack the identity
$$\frac{1}{2}\left( A_{ij}+A_{ji}\right) x^{j}=b_{i}\tag{8.123}$$
in order to show the elementary equations that it represents. Taking $n=3$, begin by unpacking the free index $i$ to reveal the three individual equations
$$\begin{aligned}\frac{1}{2}\left( A_{1j}+A_{j1}\right) x^{j} & =b_{1}\ \ \ \ \ \ \ \ \ \ \left(8.125\right)\\\frac{1}{2}\left( A_{2j}+A_{j2}\right) x^{j} & =b_{2}\ \ \ \ \ \ \ \ \ \ \left(8.126\right)\\\frac{1}{2}\left( A_{3j}+A_{j3}\right) x^{j} & =b_{3}.\ \ \ \ \ \ \ \ \ \ \left(8.127\right)\end{aligned}$$
Next, unpack the contraction in each of the equations, i.e.
$$\begin{aligned}A_{11}x^{1}+\frac{A_{12}+A_{21}}{2}x^{2}+\frac{A_{13}+A_{31}}{2}x^{3} & =b_{1}\ \ \ \ \ \ \ \ \ \ \left(8.128\right)\\\frac{A_{21}+A_{12}}{2}x^{1}+A_{22}x^{2}+\frac{A_{23}+A_{32}}{2}x^{3} & =b_{2}\ \ \ \ \ \ \ \ \ \ \left(8.129\right)\\\frac{A_{31}+A_{13}}{2}x^{1}+\frac{A_{32}+A_{23}}{2}x^{2}+A_{33}x^{3} & =b_{3}.\ \ \ \ \ \ \ \ \ \ \left(8.130\right)\end{aligned}$$
For a symmetric $A_{ij}$, the unpacked version of the corresponding equation
$$A_{ij}x^{j}=b_{i}\tag{8.124}$$
is
$$\begin{aligned}A_{11}x^{1}+A_{12}x^{2}+A_{13}x^{3} & =b_{1}\ \ \ \ \ \ \ \ \ \ \left(8.131\right)\\A_{21}x^{1}+A_{22}x^{2}+A_{23}x^{3} & =b_{2}\ \ \ \ \ \ \ \ \ \ \left(8.132\right)\\A_{31}x^{1}+A_{32}x^{2}+A_{33}x^{3} & =b_{3}.\ \ \ \ \ \ \ \ \ \ \left(8.133\right)\end{aligned}$$
Finally, let us rewrite the equation
$$\frac{1}{2}\left( A_{ij}+A_{ji}\right) x^{j}=b_{i}\tag{8.123}$$
in matrix form. If $A_{ij}$ corresponds to the matrix $A$, then $A_{ij}+A_{ji}$ corresponds to $A+A^{T}$ and, therefore, the above equation reads
$$\frac{1}{2}\left( A+A^{T}\right) x=b.\tag{8.134}$$
For a symmetric matrix $A$, i.e. $A=A^{T}$, this equation assumes the classical form
$$Ax=b.\tag{8.135}$$
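As a sanity check on the role of the symmetrized matrix, here is a minimal sketch in plain Python. It assumes the quadratic form $F\left(x\right)=\frac{1}{2}x^{T}Ax-x^{T}b$ of equation (8.111); the sample values of `A`, `b`, and `x0`, as well as the function names, are hypothetical choices for illustration. It compares the gradient $\frac{1}{2}\left(A+A^{T}\right)x-b$ with a central finite-difference approximation for a matrix that is not symmetric.

```python
# Arbitrary sample data (assumed for illustration); A is deliberately not symmetric.
A = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 4.0]]
b = [1.0, 2.0, 3.0]
n = 3

def F(x):
    """Quadratic form F(x) = (1/2) A_ij x^i x^j - b_i x^i."""
    quad = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return 0.5 * quad - lin

def grad_formula(x):
    """Gradient predicted by the text: (1/2)(A_ij + A_ji) x^j - b_i."""
    return [sum(0.5 * (A[i][j] + A[j][i]) * x[j] for j in range(n)) - b[i]
            for i in range(n)]

def grad_numeric(x, h=1e-4):
    """Central finite-difference approximation of dF/dx^i."""
    g = []
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((F(xp) - F(xm)) / (2 * h))
    return g

x0 = [0.5, -1.0, 2.0]
max_diff = max(abs(p - q) for p, q in zip(grad_formula(x0), grad_numeric(x0)))

# The unsymmetrized expression A_ij x^j - b_i, for comparison.
naive = [sum(A[i][j] * x0[j] for j in range(n)) - b[i] for i in range(n)]

print(max_diff)                 # tiny, limited only by rounding
print(grad_formula(x0), naive)  # these differ because A is not symmetric
```

Setting the gradient to zero reproduces equation (8.134). For a symmetric `A` the two expressions coincide, which is the classical form $Ax=b$.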
Exercise 8.1. For
$$H\left( t\right) =F\left( c^{1}\left( t\right) ,c^{2}\left( t\right) ,c^{3}\left( t\right) \right) , \tag{8.38}$$
show that the second derivative $d^{2}H/dt^{2}$ is given by
$$\frac{d^{2}H}{dt^{2}}=\frac{\partial^{2}F}{\partial z^{i}\partial z^{j}} \frac{dc^{i}}{dt}\frac{dc^{j}}{dt}+\frac{\partial F}{\partial z^{i}} \frac{d^{2}c^{i}}{dt^{2}}.\tag{8.136}$$
Exercise 8.2. For
$$H\left( u^{1},u^{2}\right) =F\left( c^{1}\left( u^{1},u^{2}\right) ,c^{2}\left( u^{1},u^{2}\right) ,c^{3}\left( u^{1},u^{2}\right) \right) , \tag{8.45}$$
show that the collection of second derivatives $\partial^{2}H/\partial u^{\alpha}\partial u^{\beta}$ is given by
$$\frac{\partial^{2}H}{\partial u^{\alpha}\partial u^{\beta}}=\frac{\partial^{2}F}{\partial z^{i}\partial z^{j}}\frac{\partial c^{i}}{\partial u^{\alpha}}\frac{\partial c^{j}}{\partial u^{\beta}}+\frac{\partial F}{\partial z^{i}}\frac{\partial^{2}c^{i}}{\partial u^{\alpha}\partial u^{\beta}}.\tag{8.137}$$
Exercise 8.3. For
$$H^{p}\left( u^{1},u^{2}\right) =F^{p}\left( c^{1}\left( u^{1},u^{2}\right) ,c^{2}\left( u^{1},u^{2}\right) ,c^{3}\left( u^{1},u^{2}\right) \right) , \tag{8.54}$$
show that the collection of second derivatives $\partial^{2}H^{p}/\partial u^{\alpha}\partial u^{\beta}$ is given by
$$\frac{\partial^{2}H^{p}}{\partial u^{\alpha}\partial u^{\beta}}=\frac{\partial^{2}F^{p}}{\partial z^{i}\partial z^{j}}\frac{\partial c^{i}}{\partial u^{\alpha}}\frac{\partial c^{j}}{\partial u^{\beta}}+\frac{\partial F^{p}}{\partial z^{i}}\frac{\partial^{2}c^{i}}{\partial u^{\alpha}\partial u^{\beta}}.\tag{8.138}$$
Exercise 8.4. Show that
\begin{align}
\cos\arctan\left( u,v\right) & =\frac{u}{\sqrt{u^{2}+v^{2}}}\text{ and} \tag{8.69}\\
\sin\arctan\left( u,v\right) & =\frac{v}{\sqrt{u^{2}+v^{2}}}. \tag{8.70}
\end{align}
Exercise 8.5. For
$$\begin{array}{lll} F^{1}\left( u,v\right) =\sqrt{u^{2}+v^{2}} & & G^{1}\left( u,v\right) =u\cos v\\ F^{2}\left( u,v\right) =\arctan\left( u,v\right) & \text{and} & G^{2}\left( u,v\right) =u\sin v, \end{array} \tag{8.65}$$
confirm that
\begin{align}
F^{1}\left( G^{1}\left( u,v\right) ,G^{2}\left( u,v\right) \right) & =u \tag{8.139}\\
F^{2}\left( G^{1}\left( u,v\right) ,G^{2}\left( u,v\right) \right) & =v. \tag{8.140}
\end{align}
Exercise 8.6. Explain why the relationship
$$F^{\prime}\left( a\right) =\frac{1}{G^{\prime}\left( b\right) } \tag{8.80}$$
makes sense by describing the relationship between the graphs of the inverse functions $F\left( x\right)$ and $G\left( x\right)$.
Exercise 8.7. Confirm the relationship
$$F^{\prime}\left( a\right) =\frac{1}{G^{\prime}\left( b\right) } \tag{8.80}$$
for the functions
$$\begin{array}{lllll} F\left( x\right) =x^{2} & & \text{and} & & G\left( x\right) =\sqrt{x}\\ F\left( x\right) =\sin x & & \text{and} & & G\left( x\right) =\arcsin x, \end{array}\tag{8.141}$$
as well as a handful of other pairs of inverse functions of your choice.
Exercise 8.8. For inverse functions $F\left( x\right)$ and $G\left( x\right)$, derive the second-order equation
$$G^{\prime\prime}\left( F\left( x\right) \right) F^{\prime}\left( x\right) F^{\prime}\left( x\right) +G^{\prime}\left( F\left( x\right) \right) F^{\prime\prime}\left( x\right) =0\tag{8.142}$$
and verify this identity for the functions
$$\begin{array}{lllll} F\left( x\right) =x^{2} & & \text{and} & & G\left( x\right) =\sqrt{x}\\ F\left( x\right) =e^{x} & & \text{and} & & G\left( x\right) =\ln x\\ F\left( x\right) =\sin x & & \text{and} & & G\left( x\right) =\arcsin x. \end{array} \tag{8.77}$$
Exercise 8.9. Derive the third-order equation analogous to the one in the previous exercise and test it against the same set of functions.
Exercise 8.10. Show that the location of the minimum of the ordinary function
$$F\left( x\right) =\frac{1}{2}ax^{2}-bx,\qquad a>0\tag{8.143}$$
is given by the equation
$$ax=b.\tag{8.144}$$
Note the complete analogy with the multivariate function
$$F\left( x^{1},\cdots,x^{n}\right) =\frac{1}{2}x^{T}Ax-x^{T}b\tag{8.111}$$
whose minimum is given by the equation
$$Ax=b.\tag{8.135}$$