In this Chapter, we will describe -- in tensor notation -- a number of fundamental concepts that
will be highly relevant to the rest of our narrative. Thus, this Chapter will not only help you
increase your proficiency in the tensor notation, but will also help you prepare for what is to
come. Among other topics, we will cover linear combinations, the dot product, differentiation of
multivariable functions, as well as multivariable inverse functions. By the end of the Chapter, we
will be ready to return to Euclidean spaces in order to begin constructing the tensor framework.
8.1 Linear combinations in the tensor notation
Let us begin with linear combinations, the fundamental algebraic building block of Linear Algebra.
As a sum of products, a linear combination has the potential to be interpreted as a
contraction but, first, an important adjustment needs to be made.
Consider the decomposition of a vector $\mathbf{v}$ as a linear combination of the basis vectors $\mathbf{b}_1$, $\mathbf{b}_2$, and $\mathbf{b}_3$ with coefficients $v_1$, $v_2$, and $v_3$, i.e.
$$\mathbf{v} = v_1\mathbf{b}_1 + v_2\mathbf{b}_2 + v_3\mathbf{b}_3.$$
Since both indices are subscripts, the above summation does not represent a valid contraction. In order to make it one, either the coefficients or the basis vectors must be enumerated by a superscript. We arbitrarily choose the coefficients. With the coefficients denoted by $v^1$, $v^2$, and $v^3$, the linear combination becomes
$$\mathbf{v} = v^1\mathbf{b}_1 + v^2\mathbf{b}_2 + v^3\mathbf{b}_3.$$
As a result, it is now a valid contraction. Invoking the summation convention, we can now express $\mathbf{v}$ by the remarkably compact equation
$$\mathbf{v} = v^i\mathbf{b}_i.$$
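To make the contraction concrete, here is a minimal numerical sketch (in Python with NumPy; the particular basis and components are arbitrary, hypothetical choices) showing that the compact equation $\mathbf{v} = v^i\mathbf{b}_i$ is an ordinary sum of products:

```python
import numpy as np

# A hypothetical basis for R^3: the rows of B are b_1, b_2, b_3.
B = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])

# Hypothetical components v^1, v^2, v^3.
v_components = np.array([2.0, -1.0, 3.0])

# The contraction v^i b_i, written as an explicit sum of products...
v_explicit = (v_components[0] * B[0]
              + v_components[1] * B[1]
              + v_components[2] * B[2])

# ...and as a single einsum contraction over the repeated index i.
v_contracted = np.einsum('i,ij->j', v_components, B)

assert np.allclose(v_explicit, v_contracted)
print(v_contracted)
```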
This example perfectly illustrates the guiding aspect of the tensor framework. Our decision to
enumerate the coefficients by a
superscript was governed by nothing other than our desire to invoke the summation convention.
However, in doing so, we have made a prediction of how the coefficients
transform under a change of basis. Crucially, our prediction is correct, as will be
confirmed in Chapter 14. We have highlighted this
surprising aspect of the tensor notation before but it is worth reiterating: it not only
captures the properties of objects, but also predicts them.
8.2 Equating linear combinations
We will frequently encounter the situation where the decomposition of one and the same vector with
respect to one and the same basis is arrived at in two different ways. Since, as is well known
from elementary Linear Algebra, the decomposition is unique, the decomposition coefficients -- i.e.
the components -- in the two alternatively-derived expansions must coincide. In other words, from
the equality of two linear combinations we can conclude the equality of the coefficients. While
this is a straightforward matter, the tensor notation captures linear combinations in such a
compact manner that this logic may sometimes prove elusive. It is therefore worth illustrating with
a few examples.
Suppose that the basis consists of the vectors $\mathbf{b}_1$, $\mathbf{b}_2$, and $\mathbf{b}_3$. Then from
$$u^i\mathbf{b}_i = v^i\mathbf{b}_i$$
we can conclude that
$$u^i = v^i.$$
To see how this conclusion is reached, unpack both sides of the former equation, i.e.
$$u^1\mathbf{b}_1 + u^2\mathbf{b}_2 + u^3\mathbf{b}_3 = v^1\mathbf{b}_1 + v^2\mathbf{b}_2 + v^3\mathbf{b}_3.$$
With the equation in this form, it becomes clear that the Linear Algebra principle of equating expansion coefficients applies. Also, the unpacked form prevents you from slipping into the incorrect argument of "dividing" both sides of the identity by $\mathbf{b}_i$. Note that even as your experience with the tensor notation grows, you will not cease to practice unpacking. You will simply learn to imagine and process the unpacked form more quickly.
The equality of linear combinations may appear in more algebraically complicated forms. For example, from the identity
$$A^i_j v^j\,\mathbf{b}_i = w^i\mathbf{b}_i$$
we can conclude that
$$A^i_j v^j = w^i.$$
Note that we need not concern ourselves with the interpretation of the combination $A^i_j v^j$ in order to reach this conclusion.
In situations where the repeated indices enumerating the elements of the basis are different on the two sides of the equation, e.g.
$$u^i\mathbf{b}_i = v^j\mathbf{b}_j,$$
you will need to rename one of the pairs of repeated indices so that the two pairs match. In the above example, switch the roles of $i$ and $j$ on the right, i.e.
$$u^i\mathbf{b}_i = v^i\mathbf{b}_i.$$
From this form of the identity, we are able to conclude that
$$u^i = v^i.$$
I hope that the foregoing examples illustrate that while the tensor notation may require some
practice, its compactness does not obfuscate the ideas that you may be used to seeing in the
unpacked form.
8.3 The relationship between matrices relating two bases
In the near future, we will study transformation of systems under a change of coordinates. The
corresponding topic in Linear Algebra is referred to as change of basis. In this Section, we
will use the tensor notation to demonstrate that the matrices relating the two bases are the
inverses of each other.
Suppose that $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ and $\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3$ are two alternative bases. Let the system $A^j_i$ express the elements of the basis $\mathbf{b}$ with respect to $\mathbf{a}$, i.e.
$$\mathbf{b}_i = A^j_i\mathbf{a}_j,$$
and $B^j_i$ express the elements of the basis $\mathbf{a}$ with respect to $\mathbf{b}$, i.e.
$$\mathbf{a}_i = B^j_i\mathbf{b}_j.$$
In other words, $A^j_i$ is the $j$-th component of $\mathbf{b}_i$ with respect to the basis $\mathbf{a}$ and $B^j_i$ is the $j$-th component of $\mathbf{a}_i$ with respect to the basis $\mathbf{b}$.
Because the systems $A^j_i$ and $B^j_i$ perform opposite conversions, we may anticipate that they are the inverses of each other in the matrix sense. This is, indeed, the case. Furthermore, this fact is naturally demonstrated in the tensor notation and will offer us an opportunity to practice combining indicial expressions.
The idea is to substitute the identity $\mathbf{b}_i = A^j_i\mathbf{a}_j$ into $\mathbf{a}_i = B^j_i\mathbf{b}_j$ and thus express the basis $\mathbf{a}$ in terms of itself. However, since the equation $\mathbf{a}_i = B^j_i\mathbf{b}_j$ features $\mathbf{b}_j$ (with a $j$) while the equation $\mathbf{b}_i = A^j_i\mathbf{a}_j$ features $\mathbf{b}_i$ (with an $i$), substitution is not possible until we properly coordinate the index names. In the identity
$$\mathbf{b}_i = A^j_i\mathbf{a}_j,$$
rename $j$ into $k$ (in order to free up $j$ for the next step), i.e.
$$\mathbf{b}_i = A^k_i\mathbf{a}_k,$$
and then $i$ into $j$, i.e.
$$\mathbf{b}_j = A^k_j\mathbf{a}_k.$$
Next, substitute the above identity into
$$\mathbf{a}_i = B^j_i\mathbf{b}_j,$$
which yields
$$\mathbf{a}_i = B^j_iA^k_j\,\mathbf{a}_k.$$
From here, one can proceed according
to one of two approaches: the tensor-novice or the tensor-expert. We will describe
both approaches. The tensor-novice approach analyzes the elementary operations that take place
under the hood. The tensor-expert approach showcases the elegance and the efficiency of the tensor
notation.
For the tensor-novice approach, begin by unpacking the identity
$$\mathbf{a}_i = B^j_iA^k_j\,\mathbf{a}_k$$
on the dummy index $j$, i.e.
$$\mathbf{a}_i = B^1_iA^k_1\mathbf{a}_k + B^2_iA^k_2\mathbf{a}_k + B^3_iA^k_3\mathbf{a}_k,$$
and subsequently on the live index $i$ which, as we discussed in the previous Chapter, expands the equation into a set of three, i.e.
$$\begin{aligned}
\mathbf{a}_1 &= B^1_1A^k_1\mathbf{a}_k + B^2_1A^k_2\mathbf{a}_k + B^3_1A^k_3\mathbf{a}_k\\
\mathbf{a}_2 &= B^1_2A^k_1\mathbf{a}_k + B^2_2A^k_2\mathbf{a}_k + B^3_2A^k_3\mathbf{a}_k\\
\mathbf{a}_3 &= B^1_3A^k_1\mathbf{a}_k + B^2_3A^k_2\mathbf{a}_k + B^3_3A^k_3\mathbf{a}_k,
\end{aligned}$$
where the identities are still not fully unpacked: each equation contains three un-unpacked contractions on the index $k$.
The above identities express the elements of the basis $\mathbf{a}$ with respect to itself. From elementary Linear Algebra, we know that there is a unique way of doing so and that is
$$\begin{aligned}
\mathbf{a}_1 &= 1\cdot\mathbf{a}_1 + 0\cdot\mathbf{a}_2 + 0\cdot\mathbf{a}_3\\
\mathbf{a}_2 &= 0\cdot\mathbf{a}_1 + 1\cdot\mathbf{a}_2 + 0\cdot\mathbf{a}_3\\
\mathbf{a}_3 &= 0\cdot\mathbf{a}_1 + 0\cdot\mathbf{a}_2 + 1\cdot\mathbf{a}_3.
\end{aligned}$$
Since equality of linear combinations implies equality of coefficients, we arrive at the following nine identities:
$$\begin{aligned}
B^j_1A^1_j &= 1, & B^j_1A^2_j &= 0, & B^j_1A^3_j &= 0,\\
B^j_2A^1_j &= 0, & B^j_2A^2_j &= 1, & B^j_2A^3_j &= 0,\\
B^j_3A^1_j &= 0, & B^j_3A^2_j &= 0, & B^j_3A^3_j &= 1.
\end{aligned}$$
Finally, note that with the help of the indicial notation and the Kronecker delta $\delta^k_i$, these nine identities can be captured by the single tensor equation
$$B^j_iA^k_j = \delta^k_i.$$
We are now able to formulate our conclusion. Since the Kronecker delta corresponds to the identity matrix, this identity states precisely that the matrices corresponding to $B^j_i$ and $A^k_j$ are the inverses of each other, as we set out to show. The advantage of the approach that we have just described is that it exposed almost every elementary low-level detail of the calculation. On the other hand, its exhaustive details may have obscured the simple algebraic structure of the equations. This approach therefore cannot be considered an effective use of the tensor notation.
The tensor-expert approach brings out the elegance of the argument by keeping the equations compact. In the equation
$$\mathbf{a}_i = B^j_i\mathbf{b}_j,$$
replace $\mathbf{b}_j$ with the equivalent expression $A^k_j\mathbf{a}_k$. The resulting identity
$$\mathbf{a}_i = B^j_iA^k_j\,\mathbf{a}_k,$$
read alongside the tautology $\mathbf{a}_i = \delta^k_i\mathbf{a}_k$, shows the equivalence of two linear combinations with respect to the same basis, which is precisely the situation we described earlier. Equating the coefficients, we find
$$\delta^k_i = B^j_iA^k_j.$$
Switching the sides, i.e.
$$B^j_iA^k_j = \delta^k_i,$$
we immediately recognize it as precisely the equation we set out to prove.
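As a quick numerical sanity check (a sketch in Python with NumPy; the bases and the way the systems $A^j_i$ and $B^j_i$ are computed here are our own illustrative choices), we can generate two bases, solve for both change-of-basis systems, and confirm that the corresponding matrices multiply to the identity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Columns of the matrices a and b are the basis vectors a_i and b_i.
a = rng.random((3, 3)) + np.eye(3)  # generic invertible matrices
b = rng.random((3, 3)) + np.eye(3)

# b_i = A^j_i a_j  means  b = a @ A  (the entry A[j, i] holds A^j_i).
A = np.linalg.solve(a, b)
# a_i = B^j_i b_j  means  a = b @ B.
B = np.linalg.solve(b, a)

# B^j_i A^k_j = delta^k_i  is the matrix identity  A @ B = I.
print(np.allclose(A @ B, np.eye(3)))  # True
```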
8.4 The dot product in the tensor notation
Recall from Chapter 2 that for two vectors $\mathbf{u}$ and $\mathbf{v}$, the matrix form of the component space expression for $\mathbf{u}\cdot\mathbf{v}$ reads
$$\mathbf{u}\cdot\mathbf{v} = U^TMV,$$
where $U$ and $V$ are the matrices representing the components of $\mathbf{u}$ and $\mathbf{v}$, and $M$ is the matrix of pairwise dot products of the basis vectors $\mathbf{b}_1$, $\mathbf{b}_2$, and $\mathbf{b}_3$, i.e.
$$M = \begin{bmatrix}
\mathbf{b}_1\cdot\mathbf{b}_1 & \mathbf{b}_1\cdot\mathbf{b}_2 & \mathbf{b}_1\cdot\mathbf{b}_3\\
\mathbf{b}_2\cdot\mathbf{b}_1 & \mathbf{b}_2\cdot\mathbf{b}_2 & \mathbf{b}_2\cdot\mathbf{b}_3\\
\mathbf{b}_3\cdot\mathbf{b}_1 & \mathbf{b}_3\cdot\mathbf{b}_2 & \mathbf{b}_3\cdot\mathbf{b}_3
\end{bmatrix}.$$
As elegant as the equation $\mathbf{u}\cdot\mathbf{v} = U^TMV$ may be, its tensor analogue is every bit as aesthetically pleasing. In fact, with the introduction of index juggling in Chapter 11, it will become even more so.
The entries of the matrix $M$ are naturally enumerated by a pair of subscripts, i.e.
$$M_{ij} = \mathbf{b}_i\cdot\mathbf{b}_j.$$
Once again, the choice of subscripts is dictated strictly by the rules of the tensor notation rather than some a priori insight into the nature of the system $M_{ij}$ or knowledge of how it transforms under a change of basis. The tensor notation stipulates that all terms in an identity must have matching indicial signatures. Thus, the fact that the expression $\mathbf{b}_i\cdot\mathbf{b}_j$ has two subscripts means that we have no choice but to enumerate the entries of $M$ by subscripts. The fact that this choice accurately predicts the manner in which $M_{ij}$ transforms under a change of basis, which will be confirmed in Chapter 14, is simply yet another illustration of the great predictive ability of the tensor notation.
Next, following the convention adopted earlier, which was also dictated by the rules of the tensor notation, enumerate the components $u^i$ and $v^i$ of $\mathbf{u}$ and $\mathbf{v}$ by superscripts, i.e.
$$\mathbf{u} = u^i\mathbf{b}_i \quad\text{and}\quad \mathbf{v} = v^i\mathbf{b}_i.$$
We will now show that in terms of $M_{ij}$, $u^i$, and $v^j$, the expression for $\mathbf{u}\cdot\mathbf{v}$ reads
$$\mathbf{u}\cdot\mathbf{v} = M_{ij}u^iv^j.$$
First, let us convince ourselves that this formula is correct by fully unpacking the contractions on the right that represent a sum of nine terms, i.e.
$$\begin{aligned}
\mathbf{u}\cdot\mathbf{v} ={}& M_{11}u^1v^1 + M_{12}u^1v^2 + M_{13}u^1v^3\\
{}+{}& M_{21}u^2v^1 + M_{22}u^2v^2 + M_{23}u^2v^3\\
{}+{}& M_{31}u^3v^1 + M_{32}u^3v^2 + M_{33}u^3v^3.
\end{aligned}$$
This equation is identical to the equation
$$\mathbf{u}\cdot\mathbf{v} = U^TMV$$
derived in Chapter 2, except for the fact that the components of the vectors are now enumerated by superscripts. Thus, the combination $M_{ij}u^iv^j$ indeed represents the dot product $\mathbf{u}\cdot\mathbf{v}$.
Thanks to the summation convention, the expression $M_{ij}u^iv^j$ is every bit as compact as its matrix analogue $U^TMV$. Furthermore, the indicial form offers a few notational advantages over the matrix form. First, it does not require the operation of the transpose. As we discussed in the previous Chapter, this speaks to the extreme economy of operations in Tensor Calculus. Second, as we also discussed in the previous Chapter, the order of the multiplicative terms in the expression $M_{ij}u^iv^j$ is immaterial. Third, the expression $M_{ij}u^iv^j$ gives access to the individual entries of the matrix $M$ as well as the individual components of $\mathbf{u}$ and $\mathbf{v}$. This will prove to be of crucial advantage in numerous applications, including the quadratic form minimization discussed below. Finally, note that with the help of index juggling introduced in Chapter 11, the expression $M_{ij}u^iv^j$ will be supplanted by the remarkably compact equivalent
$$\mathbf{u}\cdot\mathbf{v} = u_iv^i,$$
which, while valid for all bases, exhibits the utmost simplicity of the dot product expressed with respect to an orthonormal basis.
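For readers who like to see contractions executed numerically, here is a brief sketch (Python with NumPy; the particular basis and components are hypothetical) verifying that $M_{ij}u^iv^j$ agrees with the dot product computed directly:

```python
import numpy as np

# Columns of the matrix b form a hypothetical (non-orthonormal) basis of R^3.
b = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

M = b.T @ b  # M_ij = b_i . b_j, the matrix of pairwise dot products

u_comp = np.array([1.0, 2.0, 3.0])   # components u^i
v_comp = np.array([-1.0, 0.5, 2.0])  # components v^j

# The contraction M_ij u^i v^j ...
dot_tensor = np.einsum('ij,i,j->', M, u_comp, v_comp)

# ... should equal the dot product of the vectors u = u^i b_i and v = v^j b_j.
u = b @ u_comp
v = b @ v_comp
assert np.isclose(dot_tensor, u @ v)
print(dot_tensor)
```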
Next, let us re-derive the identity
$$\mathbf{u}\cdot\mathbf{v} = M_{ij}u^iv^j$$
strictly in the tensor notation, as opposed to by matching up with a previously established result. On the one hand, the upcoming calculation is almost too simple to be called a derivation. On the other hand, it does require a careful manipulation of indices and will therefore serve as a worthwhile exercise of the tensor notation.
Start with the decompositions of $\mathbf{u}$ and $\mathbf{v}$ in terms of the basis $\mathbf{b}_i$, i.e.
$$\mathbf{u} = u^i\mathbf{b}_i \quad\text{and}\quad \mathbf{v} = v^i\mathbf{b}_i.$$
Since we are about to combine these expressions in a single product, they cannot both use the index $i$ -- otherwise, we would end up with the invalid combination $u^iv^i\,\mathbf{b}_i\cdot\mathbf{b}_i$. Thus, we will keep $i$ in the expression for $\mathbf{u}$ and switch to $j$ in the expression for $\mathbf{v}$, i.e.
$$\mathbf{u} = u^i\mathbf{b}_i \quad\text{and}\quad \mathbf{v} = v^j\mathbf{b}_j.$$
Dotting the two identities, we find
$$\mathbf{u}\cdot\mathbf{v} = u^i\mathbf{b}_i\cdot v^j\mathbf{b}_j.$$
(Recall from Section 7.8.2 the discussion concerning the subtleties inherent in expressions that feature two or more simultaneous contractions.) Rearrange the terms on the right to bring the two vectors together, i.e.
$$\mathbf{u}\cdot\mathbf{v} = u^iv^j\,\mathbf{b}_i\cdot\mathbf{b}_j.$$
Since
$$\mathbf{b}_i\cdot\mathbf{b}_j = M_{ij},$$
we arrive at the desired result
$$\mathbf{u}\cdot\mathbf{v} = M_{ij}u^iv^j.$$
The foregoing discussion is important for two reasons. First, the dot product is a central operation in Geometry, and therefore in Tensor Calculus, and its component space representation -- later to be referred to as the coordinate space representation -- is of utmost value. Second, the discussion illustrated that the tensor notation is an effective tool for deriving algebraic relationships. Note that in Chapter 2, we essentially guessed the equation
$$\mathbf{u}\cdot\mathbf{v} = U^TMV$$
and subsequently observed its correctness. In this Section, with the help of the tensor notation, we were able to arrive at the equivalent equation
$$\mathbf{u}\cdot\mathbf{v} = M_{ij}u^iv^j$$
by straightforward algebraic manipulation.
8.5 The chain rule in the tensor notation
Many of the fundamental identities in Tensor Calculus are obtained by differentiating identities
involving composite functions. As a result, such analyses rely heavily on the use of the chain
rule. Fortunately, as we are about to demonstrate, the multivariate chain rule lends itself
perfectly to the tensor notation and illustrates another one of its natural applications.
8.5.1 The case of one independent variable
We will begin with a function $h$ of a single variable $t$ given by the composition
$$h(t) = F(x(t), y(t), z(t)),$$
where $F(x,y,z)$ is a function of three variables, and each of $x$, $y$, and $z$ is a function of $t$. According to the chain rule, the derivative $dh/dt$ is given in terms of the partial derivatives of $F$ and the ordinary derivatives of $x$, $y$, and $z$ by the identity
$$\frac{dh}{dt} = \frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt} + \frac{\partial F}{\partial z}\frac{dz}{dt}.$$
Note the convention that we will use throughout our narrative of suppressing the arguments of functions when they are clear from the context. With the full detail of the arguments included, the above equation would read
$$\frac{dh(t)}{dt} = \frac{\partial F(x(t),y(t),z(t))}{\partial x}\frac{dx(t)}{dt} + \frac{\partial F(x(t),y(t),z(t))}{\partial y}\frac{dy(t)}{dt} + \frac{\partial F(x(t),y(t),z(t))}{\partial z}\frac{dz(t)}{dt},$$
which makes it evident why we prefer the abbreviated form.
Our present goal is to express the right side in the tensor notation. Fortunately, being a sum of products, it is ready to be interpreted as a contraction. To this end, denote the arguments of $F$ by $Z^1$, $Z^2$, and $Z^3$, turning $F(x,y,z)$ into $F\left(Z^1,Z^2,Z^3\right)$. Also, denote the functions $x(t)$, $y(t)$, and $z(t)$ by $Z^1(t)$, $Z^2(t)$, and $Z^3(t)$. In terms of the new symbols, $h(t)$ is given by
$$h(t) = F\left(Z^1(t), Z^2(t), Z^3(t)\right),$$
while the expression for its derivative reads
$$\frac{dh}{dt} = \frac{\partial F}{\partial Z^1}\frac{dZ^1}{dt} + \frac{\partial F}{\partial Z^2}\frac{dZ^2}{dt} + \frac{\partial F}{\partial Z^3}\frac{dZ^3}{dt}.$$
Now the expression on the right can be easily captured with the help of the summation convention, i.e.
$$\frac{dh}{dt} = \frac{\partial F}{\partial Z^i}\frac{dZ^i}{dt}.$$
As we discussed at the end of Section 7.3, the index $i$ in the symbol $\frac{\partial F}{\partial Z^i}$ can be thought of as a subscript because it is a superscript in the "denominator" of a "fraction". Therefore, the repeated index appears once as a subscript and once as a superscript, thus properly triggering Einstein's summation convention.
Finally, if you still find it helpful to visualize the matrix form of indicial expressions, note that the above identity can be captured by the equation
$$\frac{dh}{dt} = \begin{bmatrix}\dfrac{\partial F}{\partial Z^1} & \dfrac{\partial F}{\partial Z^2} & \dfrac{\partial F}{\partial Z^3}\end{bmatrix}\begin{bmatrix}\dfrac{dZ^1}{dt}\\[4pt] \dfrac{dZ^2}{dt}\\[4pt] \dfrac{dZ^3}{dt}\end{bmatrix}.$$
We will continue to provide the corresponding matrix forms for each of the chain rule identities in this Section.
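As an illustration, the following sketch (Python with NumPy; the specific functions $F$ and $Z^i$ are hypothetical choices) compares the contraction $\frac{\partial F}{\partial Z^i}\frac{dZ^i}{dt}$ against a finite-difference estimate of $dh/dt$:

```python
import numpy as np

# A hypothetical F of three variables and its partial derivatives dF/dZ^i.
F = lambda Z: Z[0] ** 2 + Z[1] * Z[2]
dF = lambda Z: np.array([2 * Z[0], Z[2], Z[1]])

# Hypothetical functions Z^i(t) and their derivatives dZ^i/dt.
Z = lambda t: np.array([np.sin(t), np.cos(t), t ** 2])
dZ = lambda t: np.array([np.cos(t), -np.sin(t), 2 * t])

t, eps = 0.7, 1e-6

# The chain rule as a contraction: dh/dt = (dF/dZ^i)(dZ^i/dt).
dh_chain = dF(Z(t)) @ dZ(t)

# Finite-difference check of dh/dt for h(t) = F(Z(t)).
dh_fd = (F(Z(t + eps)) - F(Z(t - eps))) / (2 * eps)
print(dh_chain, dh_fd)  # the two values should agree closely
```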
8.5.2 The case of several independent variables
Let us now consider a function $h(u,v)$ of two variables formed by composing the function $F(x,y,z)$ with functions of two variables $x(u,v)$, $y(u,v)$, and $z(u,v)$, i.e.
$$h(u,v) = F(x(u,v), y(u,v), z(u,v)).$$
Even though in this example there are only two independent variables, our analysis will apply to functions with an arbitrary number of arguments.
Applying the chain rule for each independent variable, we find
$$\begin{aligned}
\frac{\partial h}{\partial u} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial u}\\
\frac{\partial h}{\partial v} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial v}.
\end{aligned}$$
Let us convert these identities into a single indicial equation. In addition to denoting the arguments of $F$ by $Z^i$ and the functions $x$, $y$, and $z$ by $Z^i$, denote the independent variables $u$ and $v$ as $U^1$ and $U^2$, i.e.
$$h\left(U^1,U^2\right) = F\left(Z^1\left(U^1,U^2\right), Z^2\left(U^1,U^2\right), Z^3\left(U^1,U^2\right)\right).$$
Collectively, we will refer to $U^1$ and $U^2$ as $U^\alpha$. The superscript $\alpha$ is taken from a different alphabet to highlight the fact that the number of independent variables $U^\alpha$ is different from the number of the arguments of $F$. In terms of the new symbols, the above differential identities read
$$\begin{aligned}
\frac{\partial h}{\partial U^1} &= \frac{\partial F}{\partial Z^1}\frac{\partial Z^1}{\partial U^1} + \frac{\partial F}{\partial Z^2}\frac{\partial Z^2}{\partial U^1} + \frac{\partial F}{\partial Z^3}\frac{\partial Z^3}{\partial U^1}\\
\frac{\partial h}{\partial U^2} &= \frac{\partial F}{\partial Z^1}\frac{\partial Z^1}{\partial U^2} + \frac{\partial F}{\partial Z^2}\frac{\partial Z^2}{\partial U^2} + \frac{\partial F}{\partial Z^3}\frac{\partial Z^3}{\partial U^2}.
\end{aligned}$$
We can now "pack" these identities into the single indicial equation
$$\frac{\partial h}{\partial U^\alpha} = \frac{\partial F}{\partial Z^i}\frac{\partial Z^i}{\partial U^\alpha}.$$
As usual, the repeated index $i$ represents a summation while the free index $\alpha$ enumerates independent equations. Note that the object $\frac{\partial Z^i}{\partial U^\alpha}$ represents six partial derivatives: the derivative of each of the three functions $Z^i$ with respect to each of the two variables $U^\alpha$.
If the Latin index is treated as first and the Greek as second, the above identity can be captured in matrix form by the equation
$$\begin{bmatrix}\dfrac{\partial h}{\partial U^1} & \dfrac{\partial h}{\partial U^2}\end{bmatrix} = \begin{bmatrix}\dfrac{\partial F}{\partial Z^1} & \dfrac{\partial F}{\partial Z^2} & \dfrac{\partial F}{\partial Z^3}\end{bmatrix}\begin{bmatrix}\dfrac{\partial Z^1}{\partial U^1} & \dfrac{\partial Z^1}{\partial U^2}\\[4pt] \dfrac{\partial Z^2}{\partial U^1} & \dfrac{\partial Z^2}{\partial U^2}\\[4pt] \dfrac{\partial Z^3}{\partial U^1} & \dfrac{\partial Z^3}{\partial U^2}\end{bmatrix}.$$
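In code, the packed identity is again a single contraction. A minimal sketch (Python with NumPy; the arrays stand for hypothetical values of $\partial F/\partial Z^i$ and $\partial Z^i/\partial U^\alpha$ at some point):

```python
import numpy as np

dF_dZ = np.array([1.0, -2.0, 0.5])  # dF/dZ^i, shape (3,)
dZ_dU = np.array([[0.3, 1.0],       # dZ^i/dU^alpha, shape (3, 2):
                  [2.0, -1.0],      # row i, column alpha
                  [0.0, 4.0]])

# dh/dU^alpha = (dF/dZ^i)(dZ^i/dU^alpha): contract on i, alpha stays free.
dh_dU = np.einsum('i,ia->a', dF_dZ, dZ_dU)
print(dh_dU)  # two numbers, one per independent variable U^alpha
```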
A note on terminology is in order. The phrase differentiation with respect to $U^\alpha$ refers to the evaluation of the partial derivatives of a function with respect to each of the independent variables $U^1$ and $U^2$. However, from the point of view of the mechanics of differentiation, the derivatives are evaluated as if with respect to a single variable, such as $U^1$ or $U^2$. In other words, thanks to the tensor notation, the simultaneous nature of the operation does not increase the complexity of the analysis compared to the evaluation of a single derivative. As a matter of basic tensor proficiency, you should be able to go fluently from the equation
$$h(U) = F(Z(U))$$
defining $h$ in terms of $F$ and $Z^i$ to the equation
$$\frac{\partial h}{\partial U^\alpha} = \frac{\partial F}{\partial Z^i}\frac{\partial Z^i}{\partial U^\alpha}$$
that gives its derivatives in terms of the derivatives of $F$ and $Z^i$.
We will now turn our attention to the most general case of several functions of several variables.
8.5.3 The case of several functions of several independent variables
Finally, let us consider the general case of several functions $F^a$ of several variables $Z^i$ composed with a matching number of functions $Z^i$ of several variables $U^\alpha$. We have run out of alphabets, so we are going to use a letter, $a$, from a different part of the Latin alphabet to indicate that the number of functions $F^a$ may be different from the number of arguments $Z^i$ in each function and from the number of independent variables $U^\alpha$. For the sake of concreteness, suppose that there are four functions $F^a$, i.e. $F^1$, $F^2$, $F^3$, and $F^4$, and thus there are four composite functions $h^a$, i.e.
$$\begin{aligned}
h^1\left(U^1,U^2\right) &= F^1\left(Z^1\left(U^1,U^2\right), Z^2\left(U^1,U^2\right), Z^3\left(U^1,U^2\right)\right)\\
h^2\left(U^1,U^2\right) &= F^2\left(Z^1\left(U^1,U^2\right), Z^2\left(U^1,U^2\right), Z^3\left(U^1,U^2\right)\right)\\
h^3\left(U^1,U^2\right) &= F^3\left(Z^1\left(U^1,U^2\right), Z^2\left(U^1,U^2\right), Z^3\left(U^1,U^2\right)\right)\\
h^4\left(U^1,U^2\right) &= F^4\left(Z^1\left(U^1,U^2\right), Z^2\left(U^1,U^2\right), Z^3\left(U^1,U^2\right)\right).
\end{aligned}$$
Pack these equations into a single one with the help of a live index $a$, i.e.
$$h^a(U) = F^a(Z(U)).$$
Differentiating the combined equation with respect to $U^1$ and $U^2$ yields
$$\begin{aligned}
\frac{\partial h^a}{\partial U^1} &= \frac{\partial F^a}{\partial Z^1}\frac{\partial Z^1}{\partial U^1} + \frac{\partial F^a}{\partial Z^2}\frac{\partial Z^2}{\partial U^1} + \frac{\partial F^a}{\partial Z^3}\frac{\partial Z^3}{\partial U^1}\\
\frac{\partial h^a}{\partial U^2} &= \frac{\partial F^a}{\partial Z^1}\frac{\partial Z^1}{\partial U^2} + \frac{\partial F^a}{\partial Z^2}\frac{\partial Z^2}{\partial U^2} + \frac{\partial F^a}{\partial Z^3}\frac{\partial Z^3}{\partial U^2}.
\end{aligned}$$
These equations represent a total of eight identities, as each equation represents four identities corresponding to $a = 1$, $2$, $3$, and $4$. Express the contractions on the right by using a dummy index $i$, i.e.
$$\frac{\partial h^a}{\partial U^1} = \frac{\partial F^a}{\partial Z^i}\frac{\partial Z^i}{\partial U^1} \quad\text{and}\quad \frac{\partial h^a}{\partial U^2} = \frac{\partial F^a}{\partial Z^i}\frac{\partial Z^i}{\partial U^2},$$
and, subsequently, combine the two equations into a single one by using a free index $\alpha$, i.e.
$$\frac{\partial h^a}{\partial U^\alpha} = \frac{\partial F^a}{\partial Z^i}\frac{\partial Z^i}{\partial U^\alpha}.$$
This single equation captures the partial derivatives $\frac{\partial h^a}{\partial U^\alpha}$ of the functions $h^a$ in terms of the partial derivatives of $F^a$ and $Z^i$. The application of the chain rule to multivariate composite functions will be one of the most common operations going forward.
Finally, let us give the matrix form of this equation. If the superscript $a$ enumerating the functions is considered first and the subscript enumerating the variables second, then the corresponding equation in matrix form reads
$$\begin{bmatrix}\dfrac{\partial h^1}{\partial U^1} & \dfrac{\partial h^1}{\partial U^2}\\ \vdots & \vdots\\ \dfrac{\partial h^4}{\partial U^1} & \dfrac{\partial h^4}{\partial U^2}\end{bmatrix} = \begin{bmatrix}\dfrac{\partial F^1}{\partial Z^1} & \dfrac{\partial F^1}{\partial Z^2} & \dfrac{\partial F^1}{\partial Z^3}\\ \vdots & \vdots & \vdots\\ \dfrac{\partial F^4}{\partial Z^1} & \dfrac{\partial F^4}{\partial Z^2} & \dfrac{\partial F^4}{\partial Z^3}\end{bmatrix}\begin{bmatrix}\dfrac{\partial Z^1}{\partial U^1} & \dfrac{\partial Z^1}{\partial U^2}\\[4pt] \dfrac{\partial Z^2}{\partial U^1} & \dfrac{\partial Z^2}{\partial U^2}\\[4pt] \dfrac{\partial Z^3}{\partial U^1} & \dfrac{\partial Z^3}{\partial U^2}\end{bmatrix}.$$
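The matrix product above is, once again, a single contraction in code. A sketch (Python with NumPy; the arrays are hypothetical Jacobian values at a point):

```python
import numpy as np

rng = np.random.default_rng(1)
dF_dZ = rng.random((4, 3))  # dF^a/dZ^i: 4 functions of 3 arguments
dZ_dU = rng.random((3, 2))  # dZ^i/dU^alpha: 3 functions of 2 variables

# dh^a/dU^alpha = (dF^a/dZ^i)(dZ^i/dU^alpha): contract on the dummy index i.
dh_dU = np.einsum('ai,ib->ab', dF_dZ, dZ_dU)

# The same contraction is exactly the matrix product of the two Jacobians.
assert np.allclose(dh_dU, dF_dZ @ dZ_dU)
print(dh_dU.shape)  # (4, 2): one derivative per (a, alpha) pair
```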
The natural ability of the tensor notation to handle the chain rule pays immediate dividends in the
analysis of inverse functions, to which we now turn.
8.6 Inverse functions
As we have already mentioned on a number of occasions, the tensor property describes how a
system transforms under a change of coordinates. A change of coordinates is, in turn, specified by
two sets of inverse functions. In this Section, we will derive the relationship between the partial
derivatives of those sets of functions. This exercise will serve the dual purpose of preparing us
for future analyses of coordinate changes as well as increasing our fluency with the tensor
notation.
8.6.1 Coordinate transformations as inverse functions
To describe a change of coordinates in an $n$-dimensional space requires a set of $n$ functions of $n$ variables. Indeed, we must specify how each of the $n$ new coordinates is obtained from the $n$ old coordinates. For example, the functions that describe the transformation from Cartesian coordinates $x, y$ to polar coordinates $r, \theta$ are
$$r(x,y) = \sqrt{x^2+y^2} \quad\text{and}\quad \theta(x,y) = \arctan\frac{y}{x}.$$
Naturally, the inverse coordinate transformation -- from the new coordinates $r, \theta$ back to the old coordinates $x, y$ -- is also described by a set of $n$ functions of $n$ variables. For the same coordinate transformation, the functions that describe the inverse transformation are
$$x(r,\theta) = r\cos\theta \quad\text{and}\quad y(r,\theta) = r\sin\theta.$$
Thus, we have two sets of functions, i.e. those that translate from old coordinates to new and those that translate from new to old. By definition, the two sets are function inverses of each other. In other words, if one set maps the values $x, y$ to $r, \theta$, then the other sends $r, \theta$ back to $x, y$.
Remaining in two dimensions for now, denote the functions that translate the old coordinates to the new coordinates by $F^1$ and $F^2$ or, collectively, $F^i$. Denote the functions that translate in the opposite direction by $G^1$ and $G^2$ or, collectively, $G^i$. For the transformation between Cartesian and polar coordinates, we have
$$\begin{aligned}
F^1(x,y) &= \sqrt{x^2+y^2}, & F^2(x,y) &= \arctan\frac{y}{x},\\
G^1(r,\theta) &= r\cos\theta, & G^2(r,\theta) &= r\sin\theta.
\end{aligned}$$
Let us confirm that these sets of functions are indeed the inverses of each other by evaluating the composite functions
$$F^1\left(G^1(r,\theta), G^2(r,\theta)\right) \quad\text{and}\quad F^2\left(G^1(r,\theta), G^2(r,\theta)\right).$$
We expect $F^1$ to recover $r$ and $F^2$ to recover $\theta$. Substituting $r\cos\theta$ and $r\sin\theta$ for the arguments of $F^1$ and $F^2$, we find
$$F^1\left(G^1(r,\theta), G^2(r,\theta)\right) = \sqrt{(r\cos\theta)^2 + (r\sin\theta)^2} = r.$$
It is left as an exercise to show that
$$\arctan\frac{r\sin\theta}{r\cos\theta} = \theta,$$
and therefore
$$F^2\left(G^1(r,\theta), G^2(r,\theta)\right) = \theta,$$
as we set out to show. It is left as an exercise to show that composing the functions in the opposite way yields the same result, i.e.
$$G^1\left(F^1(x,y), F^2(x,y)\right) = x \quad\text{and}\quad G^2\left(F^1(x,y), F^2(x,y)\right) = y.$$
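These compositions are easy to spot-check numerically. A brief Python sketch (the sample point is an arbitrary choice):

```python
import numpy as np

r, theta = 1.5, 0.8  # an arbitrary point with r > 0

# G: polar -> Cartesian, then F: Cartesian -> polar.
x, y = r * np.cos(theta), r * np.sin(theta)
r_back = np.sqrt(x ** 2 + y ** 2)
theta_back = np.arctan2(y, x)  # arctan(y/x), with the correct quadrant

print(np.isclose(r_back, r), np.isclose(theta_back, theta))  # True True
```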
In $n$ dimensions, the collection of the partial derivatives of the functions $F^i$ with respect to each of their $n$ independent variables is a second-order system with $n^2$ elements. The same is true of the partial derivatives of $G^i$ with respect to each of their independent variables. The two systems are connected by an extraordinarily elegant relationship which we are about to derive. We will discover that the tensor notation, and its particular effectiveness in expressing the chain rule, truly shines in this application.
8.6.2 Functions of one variable
Let us start our discussion with functions of one variable. Consider two ordinary functions $f$ and $g$ that are the inverses of each other, i.e. if $f$ maps the number $x$ to the number $y$, then $g$ maps $y$ back to $x$. In other words, if
$$y = f(x),$$
then
$$x = g(y).$$
Here are a few examples of inverse functions that we may use to confirm the derived relationships:
$$\begin{aligned}
f(x) &= 2x, & g(y) &= \frac{y}{2},\\
f(x) &= e^x, & g(y) &= \ln y,\\
f(x) &= x^3, & g(y) &= y^{1/3}.
\end{aligned}$$
Let us also document their derivatives:
$$\begin{aligned}
f'(x) &= 2, & g'(y) &= \frac{1}{2},\\
f'(x) &= e^x, & g'(y) &= \frac{1}{y},\\
f'(x) &= 3x^2, & g'(y) &= \frac{1}{3}y^{-2/3}.
\end{aligned}$$
While, on the face of it, the derivatives $f'(x)$ and $g'(y)$ do not appear to be related in an obvious way, you may remember from ordinary Calculus that they are, in a certain sense, the algebraic reciprocals of each other. Of course, they are not reciprocals in the sense that
$$g'(x) = \frac{1}{f'(x)},$$
which can be immediately seen from the examples above. Instead, the proper relationship can be described like this: if $y$ is the image of $x$ under $f$, i.e.
$$y = f(x)$$
or, equivalently,
$$x = g(y),$$
then
$$g'(y) = \frac{1}{f'(x)}.$$
In other words, the derivatives are the reciprocals of each other at the appropriate values of their arguments -- namely, $g'$ must be evaluated at $y$, i.e. the image of $x$, rather than $x$ itself. It is left as an exercise for the reader to remind themselves why this relationship makes perfect sense when one considers the relationship between the graphs of the inverse functions $f$ and $g$.
Let us confirm this relationship for one of the examples above. Suppose that $f(x) = x^3$, $x = 2$, and thus $y = 8$. Since $f'(x) = 3x^2$ and $x = 2$, we have
$$f'(2) = 12.$$
Since $g'(y) = \frac{1}{3}y^{-2/3}$, we have
$$g'(8) = \frac{1}{3}\cdot 8^{-2/3} = \frac{1}{3}\cdot\frac{1}{4} = \frac{1}{12},$$
and therefore the relationship
$$g'(y) = \frac{1}{f'(x)}$$
indeed holds since both sides equal $\frac{1}{12}$.
Let us now present a derivation of the relationship
$$g'(y) = \frac{1}{f'(x)},$$
which will serve as a blueprint for deriving the analogous relationship in the multidimensional case. Most differential relationships are derived by forming an identity with respect to the independent variables and subsequently evaluating the derivative of both sides with respect to each variable. In the case of inverse functions of one variable, the identity reads
$$g(f(x)) = x.$$
In words, it states that applying $g$ to $f(x)$ recovers the value of $x$. An application of the chain rule yields
$$g'(f(x))\,f'(x) = 1,$$
from which we have
$$g'(y) = \frac{1}{f'(x)},$$
which is precisely the relationship we set out to derive.
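A two-line numerical check of this identity (a Python sketch; the pair $f(x) = x^3$, $g(y) = y^{1/3}$ follows the example above):

```python
f = lambda x: x ** 3
g = lambda y: y ** (1.0 / 3.0)

x, eps = 2.0, 1e-6
fp = (f(x + eps) - f(x - eps)) / (2 * eps)        # f'(x) by finite differences
gp = (g(f(x) + eps) - g(f(x) - eps)) / (2 * eps)  # g'(y) evaluated at y = f(x)

print(fp * gp)  # should be very close to 1
```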
We will now turn our attention to inverse pairs of functions of two variables and, subsequently, to the general case of inverse sets of $n$ functions of $n$ variables.
8.6.3 Inverse sets of two functions of two variables
Denote one pair of functions by $F^1$ and $F^2$ and the inverse pair by $G^1$ and $G^2$. For the sake of greater clarity, we will use the letters $x$ and $y$ for the arguments of $F^i$, and $u$ and $v$ for the arguments of $G^i$. By definition, the two sets of functions are the inverses of each other if
$$\begin{aligned}
F^1\left(G^1(u,v), G^2(u,v)\right) &= u\\
F^2\left(G^1(u,v), G^2(u,v)\right) &= v.
\end{aligned}$$
In other words, if $G^1$ and $G^2$ send $u$ and $v$ to $x$ and $y$, then $F^1$ and $F^2$ send $x$ and $y$ back to $u$ and $v$. For example, as we showed above, the functions describing the coordinate transformation from Cartesian to polar coordinates -- or between any two coordinate systems, for that matter -- represent such sets of functions.
Differentiate each one of the above identities with respect to $u$ and $v$. Applying the chain rule to the first identity yields
$$\begin{aligned}
\frac{\partial F^1}{\partial x}\frac{\partial G^1}{\partial u} + \frac{\partial F^1}{\partial y}\frac{\partial G^2}{\partial u} &= 1\\
\frac{\partial F^1}{\partial x}\frac{\partial G^1}{\partial v} + \frac{\partial F^1}{\partial y}\frac{\partial G^2}{\partial v} &= 0.
\end{aligned}$$
Doing the same for the second identity yields
$$\begin{aligned}
\frac{\partial F^2}{\partial x}\frac{\partial G^1}{\partial u} + \frac{\partial F^2}{\partial y}\frac{\partial G^2}{\partial u} &= 0\\
\frac{\partial F^2}{\partial x}\frac{\partial G^1}{\partial v} + \frac{\partial F^2}{\partial y}\frac{\partial G^2}{\partial v} &= 1.
\end{aligned}$$
As we did earlier in the Chapter, we omitted the arguments of the functions for the sake of conciseness. A more detailed version of, say, the first identity would read
$$\frac{\partial F^1\left(G^1(u,v), G^2(u,v)\right)}{\partial x}\frac{\partial G^1(u,v)}{\partial u} + \frac{\partial F^1\left(G^1(u,v), G^2(u,v)\right)}{\partial y}\frac{\partial G^2(u,v)}{\partial u} = 1.$$
However, this level of detail obscures the overall structure of the expressions. We will, therefore, continue to use the concise form, but we must remember that the derivatives of $F^1$ and $F^2$ are to be evaluated at $x = G^1(u,v)$ and $y = G^2(u,v)$.
When we organize the partial derivatives into matrices
$$\begin{bmatrix}\dfrac{\partial F^1}{\partial x} & \dfrac{\partial F^1}{\partial y}\\[4pt] \dfrac{\partial F^2}{\partial x} & \dfrac{\partial F^2}{\partial y}\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}\dfrac{\partial G^1}{\partial u} & \dfrac{\partial G^1}{\partial v}\\[4pt] \dfrac{\partial G^2}{\partial u} & \dfrac{\partial G^2}{\partial v}\end{bmatrix},$$
we observe that the four equations above are captured by the identity
$$\begin{bmatrix}\dfrac{\partial F^1}{\partial x} & \dfrac{\partial F^1}{\partial y}\\[4pt] \dfrac{\partial F^2}{\partial x} & \dfrac{\partial F^2}{\partial y}\end{bmatrix}\begin{bmatrix}\dfrac{\partial G^1}{\partial u} & \dfrac{\partial G^1}{\partial v}\\[4pt] \dfrac{\partial G^2}{\partial u} & \dfrac{\partial G^2}{\partial v}\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}.$$
Thus, the two matrices representing the partial derivatives of the functions $F^i$ and $G^i$ are the inverses of each other. This is our central conclusion, and it is a direct generalization of the one-dimensional identity
$$g'(y)\,f'(x) = 1.$$
Let us confirm that the newly discovered relationship holds for the transformation between Cartesian and polar coordinates. Recall that the two coordinate systems are related by the equations
$$r = \sqrt{x^2+y^2},\quad \theta = \arctan\frac{y}{x} \qquad\text{and}\qquad x = r\cos\theta,\quad y = r\sin\theta.$$
Thus, using the letters $u$ and $v$, and $x$ and $y$ for the independent variables, the functions $F^i$ and $G^i$ are given by
$$\begin{aligned}
F^1(x,y) &= \sqrt{x^2+y^2}, & F^2(x,y) &= \arctan\frac{y}{x},\\
G^1(u,v) &= u\cos v, & G^2(u,v) &= u\sin v.
\end{aligned}$$
Evaluating their derivatives, we find
$$\frac{\partial F^1}{\partial x} = \frac{x}{\sqrt{x^2+y^2}},\quad \frac{\partial F^1}{\partial y} = \frac{y}{\sqrt{x^2+y^2}},\quad \frac{\partial F^2}{\partial x} = -\frac{y}{x^2+y^2},\quad \frac{\partial F^2}{\partial y} = \frac{x}{x^2+y^2}$$
and
$$\frac{\partial G^1}{\partial u} = \cos v,\quad \frac{\partial G^1}{\partial v} = -u\sin v,\quad \frac{\partial G^2}{\partial u} = \sin v,\quad \frac{\partial G^2}{\partial v} = u\cos v.$$
Once again, keep in mind that the partial derivatives of $F^i$ are to be evaluated at
$$x = u\cos v \quad\text{and}\quad y = u\sin v.$$
Performing this substitution, we find
$$\frac{\partial F^1}{\partial x} = \cos v,\quad \frac{\partial F^1}{\partial y} = \sin v,\quad \frac{\partial F^2}{\partial x} = -\frac{\sin v}{u},\quad \frac{\partial F^2}{\partial y} = \frac{\cos v}{u}.$$
Then, multiplying the two matrices yields
$$\begin{bmatrix}\cos v & \sin v\\[4pt] -\dfrac{\sin v}{u} & \dfrac{\cos v}{u}\end{bmatrix}\begin{bmatrix}\cos v & -u\sin v\\ \sin v & u\cos v\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix},$$
which confirms the general identity.
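The same confirmation can be carried out numerically. A sketch (Python with NumPy) at an arbitrarily chosen point:

```python
import numpy as np

u, v = 2.0, 0.6  # an arbitrary point (r, theta) with u != 0

# Jacobian of F = (r(x, y), theta(x, y)) evaluated at x = u cos v, y = u sin v.
dF = np.array([[np.cos(v),      np.sin(v)],
               [-np.sin(v) / u, np.cos(v) / u]])

# Jacobian of G = (x(u, v), y(u, v)) = (u cos v, u sin v).
dG = np.array([[np.cos(v), -u * np.sin(v)],
               [np.sin(v),  u * np.cos(v)]])

print(np.allclose(dF @ dG, np.eye(2)))  # True: the Jacobians are inverses
```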
8.6.4 Inverse sets of $n$ functions of $n$ variables
Finally, let us analyze the general $n$-dimensional case by using the tensor notation from start to finish. Suppose that the sets of functions $F^i$ and $G^i$ are the inverses of each other. In the tensor notation, this relationship is captured by the single identity
$$F^i(G(U)) = U^i.$$
Differentiate this identity with respect to $U^j$. In effect, we are simultaneously differentiating each of the $n$ equations represented by the above identity with respect to each of the $n$ independent variables. We find
$$\frac{\partial F^i}{\partial Z^k}\frac{\partial G^k}{\partial U^j} = \frac{\partial U^i}{\partial U^j}.$$
As we described in Section 7.3, the expression on the right is captured by the Kronecker delta symbol $\delta^i_j$, i.e.
$$\frac{\partial U^i}{\partial U^j} = \delta^i_j.$$
Thus, an application of the chain rule yields
$$\frac{\partial F^i}{\partial Z^k}\frac{\partial G^k}{\partial U^j} = \delta^i_j,$$
where we once again dropped all of the functional arguments for the sake of conciseness. That is all there is to it -- this is the central identity that we set out to establish. The fact that we achieved it in one effortless step is a tribute to the effectiveness of the tensor notation.
As we have done previously, arrange the partial derivatives into the matrices
$$J = \begin{bmatrix}\dfrac{\partial F^1}{\partial Z^1} & \cdots & \dfrac{\partial F^1}{\partial Z^n}\\ \vdots & \ddots & \vdots\\ \dfrac{\partial F^n}{\partial Z^1} & \cdots & \dfrac{\partial F^n}{\partial Z^n}\end{bmatrix} \quad\text{and}\quad J' = \begin{bmatrix}\dfrac{\partial G^1}{\partial U^1} & \cdots & \dfrac{\partial G^1}{\partial U^n}\\ \vdots & \ddots & \vdots\\ \dfrac{\partial G^n}{\partial U^1} & \cdots & \dfrac{\partial G^n}{\partial U^n}\end{bmatrix}.$$
In Chapter 13, we will refer to these matrices as the Jacobians $J$ and $J'$ of the coordinate transformation. In terms of these matrices, the tensor identity reads
$$JJ' = I.$$
Thus, we have proven the general result that the matrices of partial derivatives of inverse sets of functions are the inverses of each other. As we have already mentioned, this fundamental fact plays a critical role in the construction of the tensor framework. More importantly for our present purposes, the calculation demonstrated the great effectiveness of the tensor notation.
8.7 Quadratic form minimization
Quadratic form minimization is not particularly relevant to the goals of this book. However, it is
an essential problem in Applied Mathematics and, additionally, our discussion will serve as an
excellent illustration of one important feature of the tensor notation: its ability to access the
individual elements of systems.
Suppose that $x$ is a vector in the sense of $\mathbb{R}^n$ described in Section 2.7. Denote the entries of $x$ by $x^i$. Quadratic form minimization is the task of finding the minimum of the function
$$f(x) = \frac{1}{2}x^TMx - x^Tb,$$
where $M$ is a symmetric positive definite matrix and $b$ is an arbitrary vector.
Finding the extremal values of a function is a classical problem in ordinary Calculus. If you recall, the extremal values occur at those points where all partial derivatives of $f$ are equal to zero. However, the above form is not conducive to differentiation since the latter requires access to the individual entries of $x$. Thus, the tensor notation is far better suited for this calculation.
Let us write the expression for $f$ in the tensor form
$$f = \frac{1}{2}M_{ij}x^ix^j - b_ix^i.$$
Since the indices $i$ and $j$ are being used for the contractions, we are unable to differentiate the above identity with respect to $x^i$ or $x^j$. Indeed, the expressions
$$\frac{\partial}{\partial x^i}\left(\frac{1}{2}M_{ij}x^ix^j - b_ix^i\right) \quad\text{and}\quad \frac{\partial}{\partial x^j}\left(\frac{1}{2}M_{ij}x^ix^j - b_ix^i\right)$$
are invalid. Thus, instead, we will differentiate the above identity with respect to $x^k$, i.e.
$$\frac{\partial f}{\partial x^k} = \frac{\partial}{\partial x^k}\left(\frac{1}{2}M_{ij}x^ix^j - b_ix^i\right).$$
As we discussed in Section 7.8.3, the product rule applies to contractions as if they were simple products. Therefore, we have
$$\frac{\partial f}{\partial x^k} = \frac{1}{2}M_{ij}\frac{\partial x^i}{\partial x^k}x^j + \frac{1}{2}M_{ij}x^i\frac{\partial x^j}{\partial x^k} - b_i\frac{\partial x^i}{\partial x^k}.$$
As we discussed in Section 7.3, the derivatives $\frac{\partial x^i}{\partial x^k}$ and $\frac{\partial x^j}{\partial x^k}$ are perfectly captured by the Kronecker delta symbol, i.e.
$$\frac{\partial x^i}{\partial x^k} = \delta^i_k \quad\text{and}\quad \frac{\partial x^j}{\partial x^k} = \delta^j_k.$$
Thus,
$$\frac{\partial f}{\partial x^k} = \frac{1}{2}M_{ij}\delta^i_kx^j + \frac{1}{2}M_{ij}x^i\delta^j_k - b_i\delta^i_k.$$
Since $M_{ij}\delta^i_k = M_{kj}$, $M_{ij}\delta^j_k = M_{ik}$, and $b_i\delta^i_k = b_k$, we have
$$\frac{\partial f}{\partial x^k} = \frac{1}{2}M_{kj}x^j + \frac{1}{2}M_{ik}x^i - b_k.$$
In order to collect like terms, the independent variables must be enumerated by the same index. This can be achieved by renaming the repeated index $j$ into $i$ in the first term, i.e.
$$\frac{\partial f}{\partial x^k} = \frac{1}{2}M_{ki}x^i + \frac{1}{2}M_{ik}x^i - b_k.$$
Next, factor out $x^i$ and switch the order of the terms inside the parentheses, i.e.
$$\frac{\partial f}{\partial x^k} = \frac{1}{2}\left(M_{ik} + M_{ki}\right)x^i - b_k.$$
Equating the partial derivative $\frac{\partial f}{\partial x^k}$ to zero, we arrive at the linear system that determines the critical values of the independent variables $x^i$:
$$\frac{1}{2}\left(M_{ik} + M_{ki}\right)x^i = b_k.$$
If you prefer to use the indices $i$ and $j$ in the final equation, you may rewrite this equation as
$$\frac{1}{2}\left(M_{ij} + M_{ji}\right)x^j = b_i.$$
For a symmetric system $M_{ij}$, i.e.
$$M_{ij} = M_{ji},$$
the above equation reads
$$M_{ij}x^j = b_i.$$
This is the classical conclusion for the problem of quadratic form minimization.
For, perhaps, one final time, let us unpack the identity
$$\frac{1}{2}\left(M_{ij} + M_{ji}\right)x^j = b_i$$
in order to show the elementary equations that it represents. Taking $n = 3$, begin by unpacking the free index $i$ to reveal the three individual equations
$$\begin{aligned}
\tfrac{1}{2}\left(M_{1j} + M_{j1}\right)x^j &= b_1\\
\tfrac{1}{2}\left(M_{2j} + M_{j2}\right)x^j &= b_2\\
\tfrac{1}{2}\left(M_{3j} + M_{j3}\right)x^j &= b_3.
\end{aligned}$$
Next, unpack the contraction in each of the equations, i.e.
$$\begin{aligned}
M_{11}x^1 + \tfrac{1}{2}\left(M_{12} + M_{21}\right)x^2 + \tfrac{1}{2}\left(M_{13} + M_{31}\right)x^3 &= b_1\\
\tfrac{1}{2}\left(M_{21} + M_{12}\right)x^1 + M_{22}x^2 + \tfrac{1}{2}\left(M_{23} + M_{32}\right)x^3 &= b_2\\
\tfrac{1}{2}\left(M_{31} + M_{13}\right)x^1 + \tfrac{1}{2}\left(M_{32} + M_{23}\right)x^2 + M_{33}x^3 &= b_3.
\end{aligned}$$
For a symmetric $M_{ij}$, the unpacked version of the corresponding equation $M_{ij}x^j = b_i$ is
$$\begin{aligned}
M_{11}x^1 + M_{12}x^2 + M_{13}x^3 &= b_1\\
M_{21}x^1 + M_{22}x^2 + M_{23}x^3 &= b_2\\
M_{31}x^1 + M_{32}x^2 + M_{33}x^3 &= b_3.
\end{aligned}$$
Finally, let us rewrite the equation
$$\frac{1}{2}\left(M_{ij} + M_{ji}\right)x^j = b_i$$
in the matrix form. If $M_{ij}$ corresponds to the matrix $M$, then $M_{ji}$ corresponds to $M^T$ and, therefore, the above equation reads
$$\frac{1}{2}\left(M + M^T\right)x = b.$$
For a symmetric matrix $M$, i.e. $M = M^T$, this equation assumes the classical form
$$Mx = b.$$
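As a numerical illustration (a sketch in Python with NumPy; the particular $M$ and $b$ are hypothetical), we can solve $Mx = b$ and confirm that the solution is indeed a minimum of $f(x) = \frac{1}{2}x^TMx - x^Tb$ by checking that small perturbations never decrease $f$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((3, 3))
M = A @ A.T + 3 * np.eye(3)  # a hypothetical symmetric positive definite M
b = rng.random(3)

f = lambda x: 0.5 * x @ M @ x - x @ b

x_star = np.linalg.solve(M, b)  # the critical point: M x = b

# For positive definite M, f(x* + dx) - f(x*) = (1/2) dx^T M dx >= 0.
for _ in range(5):
    dx = 1e-3 * rng.standard_normal(3)
    assert f(x_star + dx) >= f(x_star)
print(x_star)
```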
8.8 Exercises
Exercise 8.1 For
$$h(t) = F\left(Z^1(t), Z^2(t), Z^3(t)\right),$$
show that the second derivative $\frac{d^2h}{dt^2}$ is given by
$$\frac{d^2h}{dt^2} = \frac{\partial^2F}{\partial Z^i\partial Z^j}\frac{dZ^i}{dt}\frac{dZ^j}{dt} + \frac{\partial F}{\partial Z^i}\frac{d^2Z^i}{dt^2}.$$
Exercise 8.2 For
$$h(U) = F(Z(U)),$$
show that the collection of second derivatives $\frac{\partial^2h}{\partial U^\alpha\partial U^\beta}$ is given by
$$\frac{\partial^2h}{\partial U^\alpha\partial U^\beta} = \frac{\partial^2F}{\partial Z^i\partial Z^j}\frac{\partial Z^i}{\partial U^\alpha}\frac{\partial Z^j}{\partial U^\beta} + \frac{\partial F}{\partial Z^i}\frac{\partial^2Z^i}{\partial U^\alpha\partial U^\beta}.$$
Exercise 8.3 For
$$h^a(U) = F^a(Z(U)),$$
show that the collection of second derivatives $\frac{\partial^2h^a}{\partial U^\alpha\partial U^\beta}$ is given by
$$\frac{\partial^2h^a}{\partial U^\alpha\partial U^\beta} = \frac{\partial^2F^a}{\partial Z^i\partial Z^j}\frac{\partial Z^i}{\partial U^\alpha}\frac{\partial Z^j}{\partial U^\beta} + \frac{\partial F^a}{\partial Z^i}\frac{\partial^2Z^i}{\partial U^\alpha\partial U^\beta}.$$
Exercise 8.4 Show that
Exercise 8.5 For
confirm that
Exercise 8.6 Explain why the relationship
$$g'(y) = \frac{1}{f'(x)}$$
makes sense by describing the relationship between the graphs of the inverse functions $f$ and $g$.
Exercise 8.7 Confirm the relationship
$$g'(y) = \frac{1}{f'(x)}$$
for the examples of inverse functions given in Section 8.6.2, as well as a handful of other pairs of inverse functions of your choice.
Exercise 8.8 For inverse functions $f$ and $g$, derive the second-order equation
$$g''(y) = -\frac{f''(x)}{f'(x)^3}$$
and verify this identity for the examples of inverse functions given in Section 8.6.2.
Exercise 8.9 Derive the third-order equation analogous to the one in the previous exercise and test it against the same set of functions.
Exercise 8.10 Show that the location of the minimum of the ordinary function
$$f(x) = \frac{1}{2}mx^2 - bx$$
is given by the equation
$$mx = b.$$
Note the complete analogy with the multivariate function
$$f(x) = \frac{1}{2}M_{ij}x^ix^j - b_ix^i,$$
whose minimum is given by the equation
$$M_{ij}x^j = b_i.$$