The opening sentence in A.J. McConnell's classic Applications of Tensor Analysis perfectly
captures the central role of the tensor notation in our subject: "The notation of the absolute
differential calculus or, as it is frequently called, the tensor calculus, is so much an integral
part of the calculus that once the student has become accustomed to its peculiarities he will have
gone a long way towards solving the difficulties of the theory itself."
To elaborate on McConnell's statement, the tensor notation is capable not only of expressing
the ideas of Tensor Calculus but, in fact, of guiding them, and the degree to which this
influence is observed is unusual, if not unique, in Mathematics. The rules of the tensor notation
make working with expressions in some ways akin to putting together a jigsaw puzzle: one
immediately knows when pieces do not go together and what other pieces to look for. A vast majority
of combinations that would otherwise deserve consideration are eliminated due to
incompatibilities exposed by the notation. Meanwhile, almost any operation that is valid is
worthwhile and is virtually guaranteed to play an important role in some analysis.
Ideally, the presentation of the various elements of the tensor notation would take place at the
same time as the fundamental ideas of the subject itself. However, new concepts in Tensor Calculus
tend to emerge with such rapidity and rely on the tensor notation to such an extent, that it is of
great advantage to be already familiar with the key elements of the notation when studying the
concepts. And even though some elements of the notation may arise out of context and therefore
appear arbitrary, introducing the notation in advance of the core ideas that it is designed to
express is worth it. Also, keep in mind that a number of elements of the tensor notation, as well
as a number of core concepts, will not be fully justified until we introduce the tensor
property in Chapter 14.
7.1 The use of indices
In the coordinate space, geometric objects are represented by what is best described, in
programming jargon, as multi-dimensional arrays. The elements of the array may be either
numbers or geometric vectors. The terms tensor notation and indicial notation are
used interchangeably to describe the use of indices to enumerate such arrays. We will refer
to indexed multi-dimensional arrays as systems, particularly when the focus is solely on the
data and the manner in which it is organized, rather than its relationship to the underlying
geometry. In the future, when the focus shifts to the relationship between the data and the
geometric objects that it represents, we will switch to the term variant to highlight the
dependency of the coordinate representation on the choice of coordinates.
Systems will be denoted by letters followed by a collection of indices -- for instance, $a_i$,
$b^i$, $c_{ij}$, $c^{ij}$, $d^i_{\cdot j}$, $T^{ij}_k$, and $R^i_{\cdot jkl}$. A
distinctive feature of the tensor notation is the use of superscripts alongside
subscripts. We first encountered this feature in the previous Chapter where a superscript
was used to enumerate coordinates. The use of superscripts may be unexpected to anyone encountering
it for the first time since superscripts are usually associated with exponentiation. Be that as it
may, the use of superscripts is the foundation of the tensor notation and is an important
ingredient of its elegance and power. Much of Tensor Calculus is about the interplay between two
opposing tendencies -- captured by the two different types of indices -- working together, in
balance, to preserve the underlying geometry in coordinate space representations. Thus, the
simultaneous use of subscripts and superscripts reflects the very nature of the subject itself.
Whether an index is a superscript or a subscript is referred to as its placement or
flavor. A superscript is also known as a contravariant index and a
subscript as a covariant index. The collection of indices associated with a system is
known as its indicial signature. Systems that feature both superscripts and subscripts are
called mixed. The role of the dot in the indicial signature of the symbol $d^i_{\cdot j}$
will be explained in Section 7.2.
As we already mentioned in Chapter 6 and will study
in great detail later, the placement of indices is a means of indicating the manner in which the
system transforms under a change of coordinates. For example, as we noted in Section 6.6, affine coordinates and the associated coordinate basis
transform by opposite rules. This example
foreshadows the central concept of tensors, i.e. systems that transform according to a set
of two opposite rules known as covariant and contravariant. The covariant rule
describes systems that change in the same manner as the coordinate basis. The contravariant
rule describes systems that change in the opposite manner. Thus, a subscript signals the covariant
nature of the system while a superscript signals the contravariant nature. Some systems, such as
$d^i_{\cdot j}$, transform according to a combination of these rules. Although the precise definitions of
covariant, contravariant, and tensor will not be given until Chapter 14, we will use these terms in describing various
objects in anticipation of a future clarification.
It is a remarkable aspect of the tensor notation that the placement of indices not only
reflects the manner in which a system transforms under a change of coordinates, but can also
predict it. On frequent occasions, we will choose a particular placement for an index not
because we know how the corresponding system transforms under a change of coordinates, but because
the rules of the tensor notation dictate that particular placement. Without exception, every
placement chosen in that fashion will prove to be correct in the sense of accurately predicting the
transformation properties of the system.
The total number of indices associated with a system is known as its order. For example,
$a_i$ is a first-order system with scalar elements, $\mathbf{a}_i$ and $\mathbf{b}^i$ are
first-order systems with vector elements, $c_{ij}$ is a second-order system with scalar elements,
$T^{ij}_k$ is a third-order system with scalar elements, and $R^i_{\cdot jkl}$ is a fourth-order
system with scalar elements. A system of order zero is a single number or vector but, despite its
utter simplicity, systems of order zero are the ultimate goal of every analysis since, as we will
soon discover, only systems of order zero can correspond to meaningful geometric quantities.
An indexed symbol, say $v^i$,
may refer either to the system as a whole or to one of its individual elements. For example, we may
say that $R^i_{\cdot jkl}$ describes curvature, referring to the overall system. Or we may say that
$v^2$ is the second component of the vector $\mathbf{v}$, in which case we are clearly referring to
the individual element. Finally, $v^i$ may refer to the individual elements collectively, in which
case we will use the plural form, e.g. $v^i$ are the components of the vector $\mathbf{v}$ with
respect to the basis $\mathbf{b}_i$.
By convention, the values of Latin indices range from $1$ to the dimension of the space. For
example, in the three-dimensional space, the symbol $v^i$ represents the numbers $v^1$, $v^2$, and
$v^3$. Meanwhile, the symbol $c_{ij}$ represents the nine numbers $c_{11}$, $c_{12}$, $c_{13}$,
$c_{21}$, $c_{22}$, $c_{23}$, $c_{31}$, $c_{32}$, and $c_{33}$. In general, a system of order $p$
represents $3^p$ elements.
The elements of a first-order system, such as $v^i$, can
be naturally arranged into a column or a row matrix. Meanwhile, the elements of a second-order
system, such as $c_{ij}$,
can be naturally arranged into a matrix. From a certain point of view, the correspondence between
second-order systems and matrices is entirely straightforward. On the other hand, the many
differences between the expressive capabilities of the tensor notation and the language of matrices
lead to numerous subtleties that will be addressed throughout our narrative, as well as in Chapter
19 which deals with the interplay between Linear
Algebra, matrices, and Tensor Calculus. Note, however, that Tensor Calculus does not need
the matrix perspective. The entire subject can be presented strictly in the tensor notation without
making a single reference to matrices or matrix algebra. Nevertheless, whenever the aforementioned
subtleties can be either avoided or easily clarified, we will freely rely on the correspondence
between first- and second-order systems and matrices if it helps to elucidate a point or to
facilitate a calculation.
By convention, the value of the first index of a second-order system indicates the
row of the element in the matrix while the second index indicates the column.
Thus, the usual way to arrange the elements of $c_{ij}$ into a matrix is
$$\begin{bmatrix} c_{11} & c_{12} & c_{13}\\ c_{21} & c_{22} & c_{23}\\ c_{31} & c_{32} & c_{33} \end{bmatrix}.$$
As we have already stated, such a correspondence between a second-order system and a matrix is
completely natural. However, in order to stimulate our tensorial way of thinking, we will
always wish to stress the distinction between systems and matrices. We will therefore avoid using
the equals sign between systems and matrices and will instead prefer to write
$$c_{ij} \to \begin{bmatrix} c_{11} & c_{12} & c_{13}\\ c_{21} & c_{22} & c_{23}\\ c_{31} & c_{32} & c_{33} \end{bmatrix}.$$
The elements of a first-order system, such as $v^i$, can
be arranged either into a column or a row matrix. There is no universal convention as to which
arrangement is preferred. However, for the sake of specificity, we will always use the column
arrangement, i.e.
$$v^i \to \begin{bmatrix} v^1\\ v^2\\ v^3 \end{bmatrix}.$$
Finally, there are no set conventions for displaying higher-order systems in a matrix-like manner.
For example, a system $T_{ijk}$, which consists of $27$ numbers, can be imagined either as a
three-dimensional $3\times 3\times 3$ array of numbers, or as three $3\times 3$ matrices, or as a
$3\times 3$ table of column matrices. None of these arrangements are particularly useful and
perhaps it is best to think of $T_{ijk}$ simply as a triple-indexed set of numbers not arranged in
any particular way.
Greek letters are used alongside Latin letters when two spaces of different dimensions are
simultaneously analyzed. This occurs most commonly in the study of curves and surfaces embedded in
a higher-dimensional space. In those situations, Latin indices typically correspond to the higher
dimension of the surrounding space while Greek indices correspond to the dimension of the embedded
object. A single system can have both Latin and Greek indices. The shift tensor $Z^i_\alpha$,
which will feature in a future book, is an example of such a system. For a two-dimensional surface
embedded in a three-dimensional space, the symbol $Z^i_\alpha$ represents the six numbers $Z^1_1$,
$Z^1_2$, $Z^2_1$, $Z^2_2$, $Z^3_1$, and $Z^3_2$. For the same dimensions, a third-order system
$T^i_{\alpha\beta}$ consists of $12$ numbers.
As we have already discussed, we will also consider systems whose elements are geometric vectors.
For example, in the three-dimensional space, the symbol $\mathbf{Z}_i$ represents the three vectors
$\mathbf{Z}_1$, $\mathbf{Z}_2$, and $\mathbf{Z}_3$. Meanwhile, for a two-dimensional surface
embedded in a three-dimensional space, the symbol $\mathbf{S}_\alpha$ represents the pair of
vectors $\mathbf{S}_1$, $\mathbf{S}_2$. In the same scenario, the symbol $\mathbf{T}^i_\alpha$ with
one Latin and one Greek index represents six vectors: $\mathbf{T}^1_1$, $\mathbf{T}^1_2$,
$\mathbf{T}^2_1$, $\mathbf{T}^2_2$, $\mathbf{T}^3_1$, and $\mathbf{T}^3_2$.
The use of indices enables us to distinguish symbols by their indicial signature. As a
result, we can reuse the same letter for closely related objects and rely on the indicial signature
to help us tell them apart. Ultimately, this practice leads to greater clarity and economy of
letters. The symbols $a_i$, $a^i$, $a_{ij}$, and $a^{ij}$ all denote distinct objects with no
possibility of confusion. Furthermore, two different systems, one consisting of scalars and the
other of geometric vectors, can be denoted by the same letter with the same indicial signature,
such as $a_i$ and $\mathbf{a}_i$, since they are still distinguishable by the weight of the font.
The reader should embrace the indicial nature of the tensor notation. Indices make tensor
expressions intuitive, dynamic, and beautiful. You may one day come across a frequently
misinterpreted quote by Élie Cartan, the inventor of Differential Forms, in which he advised to
"as far as possible avoid very formal computations where an orgy of tensor indices hides a
geometric picture which is often very simple". Cartan was referring to a particular application
related to Differential Forms where an alternative non-indicial framework offers a number of
advantages. Had Cartan known that his statement would be used as an argument against the tensor
notation, he would have certainly rephrased it. Indeed, Cartan's masterpiece Riemannian Geometry
in an Orthogonal Frame, from which the quote is taken, is filled with beautiful tensor
calculations in the indicial notation.
7.2 The order of the indices
For systems of order greater than one, the order of the indices needs to be clear. In other words,
we must indicate which index is first, second, and so on. In a system denoted by the symbol
$c_{ij}$, the order of the indices is clear: $i$ is first and $j$ is second. However, in a mixed
system, such as $c^i_j$, it is unclear whether $i$ is first and $j$ is second or the other way
around. For higher-order systems, the ambiguity is even greater. For a system such as
$A^{ij}_{kl}$, it is clear that $i$ comes before $j$ and $k$ comes before $l$, but that still
leaves six possible orderings.
Agreeing on the order of the indices is important for two reasons. First, it is clearly necessary
when representing systems by matrices. The more compelling reason, however, has to do with the
essential practice of index juggling which will be introduced in Chapter 11. Without prematurely getting into the details, we
note that under index juggling, any index can change its flavor: a subscript can become a
superscript and vice versa. As a result, the symbol $c^i_j$
becomes ambiguous, as it can either denote the "original" system or the new one where the
superscript became a subscript and the subscript became a superscript. Fortunately, this ambiguity
can be avoided by clearly establishing the order of the indices which can be done with the help of
notation.
In order to indicate the order of the indices, each index is allocated its own vertical space. If
the index is a subscript, the superscript slot will remain open, and if the index is a superscript,
the subscript slot will remain open. For example, if the intended order of the indices in
$T^{jl}_{ik}$ is $i$, $j$, $k$, $l$, then the symbol may be written as
$$T_{i\phantom{j}k\phantom{l}}^{\phantom{i}j\phantom{k}l}.$$
The only shortcoming of this notation is that blank spaces do not have reliable widths, especially
in print. It is therefore better to let a placeholder, such as a dot "$\cdot$", occupy the empty
spot in the space assigned to a particular index. With the help of such a placeholder, the above
symbol will appear in the form
$$T_{i\cdot k\cdot}^{\cdot j\cdot l}.$$
Then, if, say, the subscript $k$ were to be converted into a superscript by index juggling, the new
symbol would become
$$T_{i\cdot\cdot\cdot}^{\cdot jkl},$$
where the order of the indices remains clear. Incidentally, note that the last dot -- e.g. the
third dot on the bottom in the indicial signature of $T_{i\cdot\cdot\cdot}^{\cdot jkl}$ -- can
always be safely omitted, yielding the symbol $T_{i\cdot\cdot}^{\cdot jkl}$, since it remains clear
that $l$ is the fourth index. Similarly, in the symbol $T_{\cdot\cdot kl}^{ij\cdot\cdot}$, the last
two dots in the top row of the indicial signature can be omitted, yielding
$T_{\cdot\cdot kl}^{ij}$. The most prominent symbol that commonly features the dot placeholder is
the Riemann-Christoffel tensor $R^i_{\cdot jkl}$, which will first appear in Chapter 15 and
subsequently take center stage in Chapter 20.
7.3 The Kronecker delta
We will now introduce the ubiquitous Kronecker delta symbol $\delta^i_j$ -- or the Kronecker delta,
for short. The Kronecker delta is denoted by the symbol $\delta^i_j$ with one contravariant and one
covariant index. Ultimately, this placement of indices will not only make sense, but will seem
inevitable. In particular, the symbol $\delta_{ij}$ will never be used.
By definition, the elements of the Kronecker delta equal $1$ when the two indices have the same
value and $0$ otherwise, i.e.
$$\delta^i_j = \begin{cases} 1, & \text{if } i = j,\\ 0, & \text{if } i \neq j. \end{cases}$$
When the elements of $\delta^i_j$ are organized into a matrix, the result is, of course, the
identity matrix. For example, in three dimensions,
$$\delta^i_j \to \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}.$$
The Kronecker delta is a mixed system, i.e. it features both subscripts and superscripts.
Therefore, as we discussed in the previous Section, we ought to be precise with regard to the order
of its indices. In the case of the Kronecker delta, however, the resulting matrix is symmetric and
thus the order of the indices does not matter.
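As an aside not found in the original text, here is a minimal numerical sketch in Python with NumPy (the library choice and the zero-based indices are assumptions of the illustration):
```python
import numpy as np

# The Kronecker delta in three dimensions: delta[i, j] = 1 if i == j else 0.
# (NumPy indices run 0..2, while the text's indices run 1..3.)
delta = np.eye(3, dtype=int)
print(delta)

# The symmetry of the resulting matrix is why the order of the
# Kronecker delta's indices does not matter.
assert (delta == delta.T).all()
```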
The Kronecker delta has many uses in addition to those that arise from its interpretation as the
identity matrix. We will now describe one such application that appears frequently in various
analyses, including the quadratic form minimization discussed in the next Chapter. It centers
around the simple equation
$$\frac{\partial Z^i}{\partial Z^j} = \delta^i_j,$$
whose meaning we are about to describe.
Consider functions $F$, $G$, and $H$ of three variables $Z^1$, $Z^2$, and $Z^3$. While we could
consider quite complicated functions, we will instead consider functions of utmost simplicity --
namely,
$$F(Z^1,Z^2,Z^3) = Z^1,\qquad G(Z^1,Z^2,Z^3) = Z^2,\qquad H(Z^1,Z^2,Z^3) = Z^3.$$
For the sake of greater organization, we will switch from the letters $F$, $G$, and $H$ to the
symbols $F^1$, $F^2$, and $F^3$, i.e.
$$F^i(Z^1,Z^2,Z^3) = Z^i.$$
Let us evaluate the partial derivatives of each of these functions with respect to each of the
independent variables. We find
$$\frac{\partial F^1}{\partial Z^1} = 1,\quad \frac{\partial F^1}{\partial Z^2} = 0,\quad \frac{\partial F^1}{\partial Z^3} = 0,\quad \frac{\partial F^2}{\partial Z^1} = 0,\quad \frac{\partial F^2}{\partial Z^2} = 1,\quad \frac{\partial F^2}{\partial Z^3} = 0,\quad \frac{\partial F^3}{\partial Z^1} = 0,\quad \frac{\partial F^3}{\partial Z^2} = 0,\quad \frac{\partial F^3}{\partial Z^3} = 1.$$
With the help of the Kronecker delta, these nine equations can be captured by a single identity,
i.e.
$$\frac{\partial F^i}{\partial Z^j} = \delta^i_j.$$
Take a moment to interpret this identity for various values of $i$ and $j$ to confirm that each of
the nine preceding equations is perfectly represented.
When the symbol $F^i$ is replaced with the actual function that it represents, i.e. $Z^i$, we
arrive at the identity that we set out to derive, i.e.
$$\frac{\partial Z^i}{\partial Z^j} = \delta^i_j.$$
In summary, this equation makes sense if the symbol $Z^i$ in the "numerator" is interpreted as a
function (of the three variables $Z^1$, $Z^2$, and $Z^3$) while the symbol $Z^j$ in the
"denominator" is interpreted as one of the independent variables.
Observe the correspondence in placement between the indices on both sides of the equation
$$\frac{\partial Z^i}{\partial Z^j} = \delta^i_j.$$
The index $i$ is a superscript on both sides. Meanwhile, the index $j$ is a superscript on the left
but a subscript on the right. However, on the left, it appears in the "denominator" and, therefore,
can be thought of as being a lower index, i.e. a subscript. This is obviously not a rule
based in logic, but rather a notational device for remembering how to place indices that emerge as
a result of differentiation. This choice of placement will be vindicated in Chapter 14 where we
will show that objects that emerge as a result of partial differentiation actually transform by the
rule predicted by the placement of the index as a subscript.
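For readers who wish to confirm the nine derivatives mechanically, a small symbolic sketch in Python with SymPy (an illustrative assumption; the symbols Z1, Z2, Z3 stand in for the coordinates $Z^1$, $Z^2$, $Z^3$):
```python
import sympy as sp

# Independent variables standing in for the coordinates Z^1, Z^2, Z^3.
Z = sp.symbols('Z1 Z2 Z3')

# The table of partial derivatives dZ^i/dZ^j is the identity matrix,
# i.e. the Kronecker delta.
table = sp.Matrix(3, 3, lambda i, j: sp.diff(Z[i], Z[j]))
assert table == sp.eye(3)
```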
7.4 The technique of unpacking
The role of indices is simple and straightforward: to enumerate objects. Therefore, which specific
letter (from an appropriate alphabet) represents the index is unimportant and is purely a matter of
preference. For example, consider the symbol $a_i$ which represents the numbers $a_1$, $a_2$, and
$a_3$. The very same set of numbers could also be represented by the symbols $a_j$, $a_k$, $a_p$,
$a_q$, and so on. Thus, in order to master the tensor notation, one must desensitize oneself to the
particular letter choice in an expression and learn to see the actual object that is being
represented. For example, for the symbols $a_i$, $a_j$, $a_k$, $a_p$, $a_q$, and so on, one must
imagine the three numbers $a_1$, $a_2$, and $a_3$ that each of these symbols represents. This
practice can be described as unpacking the expression. If two indicial expressions are
equivalent in the unpacked form, then they are also equivalent in the indicial form.
The freedom to choose any combination of letters for indices allows a great deal of flexibility in
the use of indicial expressions. For example, an object may be introduced as $a_i$ in one line and
then appear as $a_j$ in the very next. A second-order system $c_{ij}$ may appear as $c_{ji}$ in
another expression and the fact that the two indices are switched does not signal that the system
has been transposed in some matrix sense (unless $c_{ij}$ and $c_{ji}$ appear in the same
expression -- more on that in a moment). By the same token, the equations
$$a_i = b_i \tag{7.23}$$
and
$$a_j = b_j \tag{7.23}$$
are completely equivalent. As a result of this equivalence, the two equations share the same
equation number (7.23). Throughout our narrative, we will adhere to the practice of assigning the
same equation number to equivalent, albeit differently indexed, equations -- that is, unless the
indices were specifically renamed for the purpose of the analysis.
A moment ago, we mentioned that the symbols $c_{ij}$ and $c_{ji}$ represent the same system rather
than two different systems related by the matrix transpose. This observation is at the root of the
many subtleties that we have alluded to earlier related to the correspondence between systems and
matrices. In order to illustrate the equivalence of the symbols $c_{ij}$ and $c_{ji}$, let us, once
again, resort to unpacking. Suppose $c_{ij}$ corresponds to the matrix
$$\begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix}.$$
For each symbol, $c_{ij}$ and $c_{ji}$, consider, for example, the element that corresponds to the
first index value of $1$ and the second index value of $2$. In both cases, we obtain the same value
of $2$. Of course, it was key that we referred to the indices not by their names but by their order
in the symbol. This is the proper interpretation intended by the tensor notation. And, according to
this intended interpretation, we conclude that the symbols $c_{ij}$ and $c_{ji}$ represent the same
system.
Had we instead considered the element that corresponds to $i = 1$ and $j = 2$, we would have
recovered the value of $2$ for $c_{ij}$ and the value of $4$ for $c_{ji}$. As a result, we would
have incorrectly concluded that $c_{ij}$ and $c_{ji}$ represent distinct systems (related by the
matrix transpose). The error, of course, lies in referring to indices by their letter names which
implies a linkage between the symbols $c_{ij}$ and $c_{ji}$ that does not exist -- unless both
symbols are engaged in a single equation.
To illustrate that possibility, consider the identity
$$c_{ij} = c_{ji}.$$
In this situation, the values of the indices cannot be assigned independently for $c_{ij}$ and
$c_{ji}$. In coordinating the values of indices within an equation, we must refer to indices by
their names. If, for example, we let $i = 1$ and $j = 2$, then these values must be applied to all
symbols in the identity consistently. For these particular values, the identity tells us that
$$c_{12} = c_{21}.$$
Thus, the interplay between like-named indices in the identity tells us that the system represented
by $c_{ij}$ is symmetric in the matrix sense. Such a system may, for example, correspond to the
symmetric matrix
$$\begin{bmatrix} 1 & 4 & 5\\ 4 & 2 & 6\\ 5 & 6 & 3 \end{bmatrix}.$$
Similarly, the identity
$$b_{ij} = a_{ji},$$
relating two second-order systems $a_{ij}$ and $b_{ij}$, implies that the matrices representing
these systems are the transposes of each other. For example, if $a_{ij}$ corresponds to the matrix
$$\begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix},$$
then $b_{ij}$ corresponds to
$$\begin{bmatrix} 1 & 4 & 7\\ 2 & 5 & 8\\ 3 & 6 & 9 \end{bmatrix}.$$
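The same unpacking argument can be checked numerically; a minimal sketch in Python with NumPy, where the particular matrix is made up for the illustration:
```python
import numpy as np

# A made-up matrix for the system a_ij.
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# The identity b_ij = a_ji means that the matrix of b is the transpose
# of the matrix of a.
b = a.T
assert all(b[i, j] == a[j, i] for i in range(3) for j in range(3))
```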
Sophisticated indicial expressions can sometimes overwhelm our algebraic intuition. Thus, it is
very important to take one's time in the early stages of learning the tensor notation and to make
sure that every relationship is thoroughly understood before moving on. The technique of unpacking
described in this Section can go a long way in assisting the reader in this endeavor.
7.5 Linear combinations of systems
One of the strengths of the indicial notation is its highly restricted set of operations. One of
those operations is the addition of systems with identical sets of like-named contravariant and
covariant indices. For example, the sum of two first-order systems $a_i$ and $b_i$ is given by
$$c_i = a_i + b_i.$$
When unpacked, we find that this identity represents three equations, i.e.
$$c_1 = a_1 + b_1,\qquad c_2 = a_2 + b_2,\qquad c_3 = a_3 + b_3.$$
In other words, as expected, systems are added in element-wise fashion.
The same is true for higher-order systems. For example, two second-order systems $a_{ij}$ and
$b_{ij}$ can be added to produce another second-order system, i.e.
$$c_{ij} = a_{ij} + b_{ij}.$$
The indices need not appear in the same order in each term. We could also evaluate the sum
$$c_{ij} = a_{ij} + b_{ji},$$
where the order of the subscripts in the second system is switched. If we associate $a_{ij}$,
$b_{ij}$, and $c_{ij}$ with matrices $A$, $B$, and $C$, then we may interpret the first
equation as $C = A + B$ and the second as $C = A + B^T$.
Such correspondence between indicial and matrix expressions will be discussed in detail in Chapter
19 but, once again, keep in mind that for the purposes of Tensor Calculus, it is not necessary to
translate indicial expressions into matrix terms.
Along with addition, systems are also subject to multiplication by numbers and vectors. Using
second-order systems as an example, a system $a_{ij}$ with scalar elements can be multiplied by a
number $k$, resulting in the system $k\,a_{ij}$, where each element of $a_{ij}$ is multiplied by
$k$. Similarly, $a_{ij}$ can be multiplied by a vector $\mathbf{U}$, resulting in
$\mathbf{U}a_{ij}$, where each element of $a_{ij}$ is multiplied by $\mathbf{U}$. Similarly, a
system $\mathbf{a}_{ij}$ with vector elements can be multiplied by a scalar $k$ or dot-multiplied
by a vector $\mathbf{U}$, resulting in the systems $k\,\mathbf{a}_{ij}$ and
$\mathbf{U}\cdot\mathbf{a}_{ij}$.
The term linear combination refers to expressions that combine multiplication by numbers and
addition, e.g.
$$k\,a_{ij} + l\,b_{ij}$$
or
$$k\,\mathbf{a}_{ij} + l\,\mathbf{b}_{ij}.$$
In summary, the operations of addition and multiplication by individual numbers and vectors are
defined exactly as we would expect by analogy with matrices. We will now turn our attention to the
more interesting tensor multiplication of systems.
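A brief numerical sketch of these element-wise operations, assuming NumPy and arbitrary illustrative values:
```python
import numpy as np

a = np.arange(9.0).reshape(3, 3)      # a_ij
b = np.arange(9.0).reshape(3, 3)**2   # b_ij

c1 = a + b       # c_ij = a_ij + b_ij: element-wise addition
c2 = a + b.T     # c_ij = a_ij + b_ji: second term "transposed"
c3 = 2*a - 3*b   # a linear combination of the two systems
```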
7.6 Multiplication of systems
The result of multiplying two systems is precisely what the notation suggests. For example,
$$a^i_j b^k_l,$$
i.e. the result of multiplying $a^i_j$ and $b^k_l$, is a fourth-order system that represents all
possible pairwise products of elements from $a^i_j$ and $b^k_l$. This operation is known as
tensor multiplication or the tensor product.
The indicial signature of a tensor product is the combination of the signatures of each factor. For
example, the above product could be denoted by $c^{ik}_{jl}$, i.e.
$$c^{ik}_{jl} = a^i_j b^k_l.$$
Once again stepping into the realm of the subtleties of the tensor notation, the same product could
also be denoted by the symbol $c^{ki}_{lj}$ featuring a different order of indices. The choice of
the order of indices in the resulting system is entirely up to the analyst.
Tensor multiplication is commutative, i.e.
$$a^i_j b^k_l = b^k_l a^i_j.$$
This is obviously so since the symbols $a^i_j$ and $b^k_l$ represent the individual elements
of each system. Being plain numbers, the individual elements commute according to the rules of
elementary arithmetic. This trivial fact is worth pointing out since we will soon discover that
tensor multiplication, in combination with (the about-to-be-defined) contraction, is capable
of expressing matrix multiplication which is famously noncommutative.
Finally, we note that tensor multiplication rarely takes place without actually being followed up
by contraction, so let us now turn our attention to that operation.
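NumPy's einsum function happens to mirror the indicial notation closely and makes a convenient playground; a hedged sketch of the tensor product (shapes and values are illustrative assumptions):
```python
import numpy as np

a = np.random.rand(3, 3)  # a^i_j
b = np.random.rand(3, 3)  # b^k_l

# The tensor product c^{ik}_{jl} = a^i_j b^k_l: all pairwise products.
c = np.einsum('ij,kl->ikjl', a, b)
assert c.shape == (3, 3, 3, 3)

# Commutativity: the factors may be listed in either order.
assert np.allclose(c, np.einsum('kl,ij->ikjl', b, a))
```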
7.7 Contraction
For a system of order greater than or equal to $2$, contraction refers to a summation over a
pair of indices which leads to the reduction of the order of the system by $2$. For example,
contracting the third-order system $T^{ij}_k$ on $j$ and $k$ yields a first-order system $u^i$
given by
$$u^i = T^{i1}_1 + T^{i2}_2 + T^{i3}_3.$$
Thus, contraction can take place only on indices that change over identical ranges. In other words,
a system $Z^i_\alpha$, where $i$ changes from $1$ to $3$ and $\alpha$ changes from $1$ to $2$,
cannot be contracted.
Furthermore, it is stipulated that the contracted indices must be of opposite flavors, i.e.
one of the indices must be a superscript and the other a subscript. This requirement is the
hallmark of a valid contraction and is the key to the guiding ability of the tensor notation. In
fact, it will guide us extensively over the next few chapters in introducing new objects and
establishing key relationships. However, the logical justification for this requirement based on
the concept of a tensor will not be given until Chapter 14. Fortunately, we can wait this long precisely because the rules of the
tensor notation assure the correctness of one's analysis even when one is not aware of their
logical underpinning.
Before we give several generic examples of valid contractions, let us review a few instances of
summations that, although they violate the opposite-flavors requirement, nevertheless demonstrate
that summations over pairs of indices play an omnipresent role in investigations related to
Geometry. The reason why these examples violate the opposite-flavors requirement lies simply in the
fact that we have not yet adjusted the placements of the indices so as to reflect the true nature
of the corresponding objects. Once that adjustment is made, these summations will become valid
contractions.
For the first example, note that the classical decomposition of a vector with respect to a basis,
i.e.
$$\mathbf{v} = v_1\mathbf{b}_1 + v_2\mathbf{b}_2 + v_3\mathbf{b}_3 = \sum_{i=1}^3 v_i\mathbf{b}_i,$$
is, in fact, a summation of the product
$$v_i\mathbf{b}_i$$
over the pair of the available indices resulting in a system of order zero $\mathbf{v}$. Secondly,
the dot product of two vectors represented by the double sum
$$\mathbf{u}\cdot\mathbf{v} = \sum_{i=1}^3\sum_{j=1}^3 u_i v_j\,\mathbf{b}_i\cdot\mathbf{b}_j$$
is a series of two summations of the same kind. It can also be referred to as a double summation
since the two summations can be performed in any order and, in fact, simultaneously, which is why
the form
$$\mathbf{u}\cdot\mathbf{v} = \sum_{i,j=1}^3 u_i v_j\,\mathbf{b}_i\cdot\mathbf{b}_j$$
with a single summation sign is often used. We may describe this expression as a double (albeit,
invalid) contraction of the fourth-order system $u_i v_j\,\mathbf{b}_k\cdot\mathbf{b}_l$ on $i$ and
$k$ as well as $j$ and $l$. When the basis $\mathbf{b}_i$ is orthonormal, the double contraction
becomes a single (still invalid) contraction of the system $u_i v_j$ on $i$ and $j$, i.e.
$$\mathbf{u}\cdot\mathbf{v} = \sum_{i=1}^3 u_i v_i.$$
Finally, the trace of a matrix, i.e. the sum of its diagonal elements, is another example of a
summation over a pair of indices. For example, the trace of the matrix corresponding to the system
$a_{ij}$ is the number
$$\sum_{i=1}^3 a_{ii}.$$
As you can see from these examples, summations over pairs of indices reduce entire sets of
elements to a single value.
Let us now give a number of generic examples of valid contractions. Let us stay in three
dimensions where each index assumes the values $1$, $2$, and $3$. Then, for a mixed second-order
system $a^i_j$, the only possible contraction is the sum of the elements that correspond to
matching values of its two indices, i.e.
$$a^1_1 + a^2_2 + a^3_3.$$
Of course, if $a^i_j$ is thought of as a matrix, then the contraction corresponds to its trace.
For a higher-order system, a contraction takes place over a particular superscript-subscript pair
and is performed for every combination of the remaining indices. For example, contracting a
third-order system $T^{ij}_k$ on $j$ and $k$ leads to the sum
$$T^{i1}_1 + T^{i2}_2 + T^{i3}_3$$
for each value of $i$. The resulting values form a first-order system $u^i$, i.e.
$$u^i = T^{i1}_1 + T^{i2}_2 + T^{i3}_3.$$
For a fourth-order example, contracting a system $A^{ij}_{kl}$ on $j$ and $k$ yields the sums
$$A^{i1}_{1l} + A^{i2}_{2l} + A^{i3}_{3l},$$
which can be organized into a second-order system $B^i_l$, i.e.
$$B^i_l = A^{i1}_{1l} + A^{i2}_{2l} + A^{i3}_{3l}.$$
Each of these examples illustrates that contractions indeed reduce the order of the system by $2$.
Although it will be shortly supplanted by Einstein's famous summation convention, the
summation sign can be effectively used to represent contractions. For example, the equation
$$u^i = T^{i1}_1 + T^{i2}_2 + T^{i3}_3$$
can be written more compactly in the form
$$u^i = \sum_{j=1}^n T^{ij}_j,$$
where we used $n$ as the upper limit for the sake of greater generality.
Also, the choice to use the letter $j$ for both indices is arbitrary. We could just as well use
the letter $k$, i.e.
$$u^i = \sum_{k=1}^n T^{ik}_k,$$
or any other letter for that matter -- other than $i$, of course. Regardless, a contraction
always involves one letter appearing twice, once as a superscript and once as a subscript. The
letter that represents the contracted indices and therefore appears twice is known as the
repeated index.
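The same contractions can be spelled out with einsum, whose subscript strings follow the repeated-index idea directly; a minimal sketch with made-up values:
```python
import numpy as np

a = np.random.rand(3, 3)     # a^i_j
T = np.random.rand(3, 3, 3)  # T^{ij}_k, axes in the order i, j, k

# Contracting a^i_j on its two indices: the trace of the matrix.
assert np.isclose(np.einsum('ii->', a), np.trace(a))

# Contracting T^{ij}_k on j and k yields the first-order system u^i.
u = np.einsum('ijj->i', T)
assert np.allclose(u, T[:, 0, 0] + T[:, 1, 1] + T[:, 2, 2])
```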
7.7.1 Contraction as a guide towards geometrically meaningful objects
The opposite-flavors requirement for a valid contraction is the cornerstone of the tensor
notation. The fundamental reason for this requirement, which has to do with the tensor property,
will have to wait until Chapter 14. On the other
hand, the profound impact of this feature on our analytical framework will become apparent very
quickly.
Recall that systems of order zero represent the ultimate goal of every analysis. Therefore,
contraction, being the only operation that reduces the order of a system, must play a
crucial role in producing systems of order zero. Furthermore, the opposite-flavors rule implies
that every analysis must perfectly balance the number of covariant and contravariant
indices. Imagine, for example, that we encountered the combination $a_{ij}$ in
the course of our analysis. Then, in order to eventually arrive at a geometrically meaningful
result, the analysis must also include an additional ingredient -- such as $b^{ij}$ -- with
superscripts. This
insight demonstrates that the opposite-flavors restriction greatly limits the totality of feasible
combinations. Like all constructive constraints, it serves to sharpen our logic and to guide our
explorations. While in other creative arenas, artists impose constraints upon themselves, Tensor
Calculus and its notational system provide us with one from the very get-go.
We would also be quite remiss not to point out one more remarkable aspect of the guiding nature of
the tensor notation. Experience shows that not only does the tensor notation limit the totality of
feasible combinations, but also every combination that is feasible is meaningful. We
will see this phenomenon time and again throughout our narrative. For example, consider the
Riemann-Christoffel tensor $R^i_{\cdot jkl}$, which plays an important role in Riemannian spaces.
It has a mismatched number of superscripts and subscripts and therefore cannot validly produce a
system of order zero. In order to balance the indices, let us use the aforementioned operation of
index juggling to convert the first subscript into a superscript in order to produce the new system
$R^{ij}_{\cdot\cdot kl}$. Finally, let us contract the resulting system on $i$ and $k$, i.e.
calculate the sum
$$R^{1j}_{\cdot\cdot 1l} + R^{2j}_{\cdot\cdot 2l} + R^{3j}_{\cdot\cdot 3l}$$
for each $j$ and $l$, and subsequently contract the result on $j$ and $l$ to produce a system of
order zero $R$, i.e.
$$R = \sum_{i}\sum_{j} R^{ij}_{\cdot\cdot ij}.$$
We have never stated our rationale for these operations and perhaps we performed them simply
because we could. Well, it turns out that the resulting quantity $R$, known as the scalar
curvature, plays an important role in General Relativity.
If the foregoing discussion makes it seem that the tensor notation does all the work for us, rest
assured that this is not the case. Tensor Calculus helps us express our ideas, and occasionally to
guide them, but does not tell us which ideas to explore. Albert Einstein, an avid practitioner of
Tensor Calculus, wrote to a friend while developing General Relativity: "I have been laboring
inhumanly. I am quite overworked." Perhaps in the absence of Tensor Calculus, he would have
given up.
7.7.2 Repeated indices
As we have already mentioned, a contraction expressed with the help of the summation symbol,
e.g.
$$\sum_{j=1}^n T^{ij}_j,$$
always features a pair of indices denoted by the same letter. It is called the repeated
index, where the word index refers to the repeated letter.
The alternative terms are dummy index and, less commonly, summation index or
contraction index. The word dummy makes sense because the repeated letter can
be replaced with any other (that is not used for some other purpose) without altering the meaning
of the expression. For example, when the letter $j$ is replaced, say, with $k$, the summation
becomes
$$\sum_{k=1}^n T^{ik}_k.$$
The equivalence of the two expressions can be seen by unpacking, i.e.
$$\sum_{j=1}^n T^{ij}_j = T^{i1}_1 + T^{i2}_2 + \cdots + T^{in}_n = \sum_{k=1}^n T^{ik}_k.$$
7.7.3 Free indices
The remaining indices, i.e. those not participating in a contraction, are called free or
live indices. In a tensor expression, all free indices must be in perfect correspondence
among all constituent parts of the expression. For example, consider the identity
$$T^i_{jk} = \sum_{m=1}^3 a^i_m b^m_{jk}.$$
The indices $i$, $j$, and $k$ are free and must therefore be present, with a consistent placement,
in each term.
Free indices enumerate the elements of the resulting system. For example, $T^i_{jk}$ is a
third-order system enumerated by $i$, $j$, and $k$. In a tensor equation, free indices enumerate
the individual identities. For example, the equation above represents $27$ independent identities.
One of those identities, i.e. the one corresponding to $i = 1$, $j = 2$, and $k = 3$, reads
$$T^1_{23} = \sum_{m=1}^3 a^1_m b^m_{23}.$$
When fully unpacked, this identity becomes
$$T^1_{23} = a^1_1 b^1_{23} + a^1_2 b^2_{23} + a^1_3 b^3_{23}.$$
This example illustrates the difference between how repeated and free indices are unpacked. When a
repeated index is unpacked, a contraction is expanded into a literal sum. When a
free index is unpacked, a single expression (or equation) is replaced with a collection of
expressions (or equations).
Much like dummy indices, free indices can be renamed as long as the name change is consistent
across all terms. For example, in the equation
$$T^i_{jk} = \sum_{m=1}^3 a^i_m b^m_{jk},$$
the index $i$ can be replaced with $r$, i.e.
$$T^r_{jk} = \sum_{m=1}^3 a^r_m b^m_{jk}.$$
In a more subtle maneuver, the letters $j$ and $k$ can trade places, i.e.
$$T^i_{kj} = \sum_{m=1}^3 a^i_m b^m_{kj}.$$
This, too, leaves the meaning of the equation unchanged, as can be confirmed by unpacking.
Note that since the last three equations are equivalent, we would ordinarily assign to them the
same equation number. However, the purpose of these particular equations was specifically to call
attention to the different combinations of letters used for indices and thus assigning them
distinct equation numbers makes more sense.
7.7.4 Contraction in combination with multiplication
Contraction often occurs in combination with multiplication. In The Absolute Differential
Calculus, Levi-Civita refers to this combined operation as composition or inner
multiplication of tensors.
As a matter of fact, multiplication is almost always followed by contraction. For example, consider
two first-order systems $a^i$ and $b_i$. Their product
$$a^i b_j$$
is a mixed second-order system and can therefore be contracted to produce the number
$$\sum_{i=1}^n a^i b_i.$$
You can probably already guess that this operation is in some way related to the dot product. This
is indeed so and the dot product is, in fact, one of the underlying reasons for the existence of
the operation of contraction.
For another important example, consider two second-order systems $a^i_j$ and $b^i_j$. Their product
$$a^i_j b^k_l$$
is a fourth-order system. Let us contract this product on the covariant index of $a^i_j$ and the
contravariant index of $b^k_l$, i.e.
$$\sum_{j=1}^n a^i_j b^j_l.$$
(This is the first example of the flexible renaming of indices we mentioned earlier: we introduced
a system as $b^i_j$ and immediately switched to $b^k_l$ in the very next line. Despite the distinct
indicial signatures, $b^i_j$ and $b^k_l$ refer to the very same system.) This contraction is, of
course, related to matrix multiplication. If the elements of $a^i_j$ are organized into a matrix
$A$ and those of $b^j_l$ are organized into a matrix $B$, then the above contraction may be
interpreted as the product $AB$, provided that for both systems the contravariant index is
considered first and the covariant second.
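A quick numerical confirmation of this correspondence, sketched with NumPy (illustrative values only):
```python
import numpy as np

A = np.random.rand(3, 3)  # matrix of a^i_j: superscript first, subscript second
B = np.random.rand(3, 3)  # matrix of b^j_l

# The contraction a^i_j b^j_l corresponds to the matrix product AB.
C = np.einsum('ij,jl->il', A, B)
assert np.allclose(C, A @ B)
```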
7.7.5 The Kronecker delta in a contraction
Recall that, in matrix terms, the Kronecker delta $\delta^i_j$
corresponds to the identity matrix. Therefore, we might expect that the effect of contracting it
with another system is to leave that system unchanged. This is indeed the case.
Consider a first-order system $u^i$ and contract it with the Kronecker delta $\delta^j_i$.
The result is the expression
$$\sum_{i=1}^3 \delta^j_i u^i$$
with a dummy index $i$ and a free index $j$. Let us analyze this expression by unpacking it, i.e.
$$\sum_{i=1}^3 \delta^j_i u^i = \delta^j_1 u^1 + \delta^j_2 u^2 + \delta^j_3 u^3.$$
Observe that for each value of $j$, precisely one of the three terms on the right survives thanks
to the special values of the Kronecker delta. For $j = 1$ the result is $u^1$, for $j = 2$ the
result is $u^2$, and for $j = 3$ the result is $u^3$. In other words, for each $j$ the result is
$u^j$, i.e.
$$\sum_{i=1}^3 \delta^j_i u^i = u^j.$$
This identity confirms that contraction with the Kronecker delta leaves the system unchanged, since
$u^i$ and $u^j$ represent the same object. In terms of index manipulation, the Kronecker delta may
be thought of as an index renamer: when contracted with $u^i$, the Kronecker delta
$\delta^j_i$ simply replaces the letter $i$ with $j$.
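A small numerical sketch of the index-renaming effect, again assuming NumPy and illustrative values:
```python
import numpy as np

delta = np.eye(3)
u = np.array([10.0, 20.0, 30.0])  # u^i

# delta^j_i u^i leaves the system unchanged, merely renaming i to j.
v = np.einsum('ji,i->j', delta, u)
assert np.allclose(v, u)
```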
7.8 The summation convention
7.8.1 A description of the convention
The summation convention, also known as Einstein's summation convention or
Einstein's notation, was introduced by Albert Einstein in his celebrated 1916 work The
Foundation of the Theory of General Relativity. Einstein observed that in virtually all
scenarios, a repeated index emerges only in the context of a valid contraction. He therefore
proposed to use a repeated index to signal a contraction, thus eliminating the need for the
summation sign. For example,
$$a^i b_i \text{ stands for } \sum_{i=1}^n a^i b_i$$
and
$$a^i_j b^j_l \text{ stands for } \sum_{j=1}^n a^i_j b^j_l.$$
In Einstein's own words: "A glance at the equations of this paragraph shows that there is always
a summation with respect to the indices which occur twice under a summation sign [...], and only
with respect to indices which occur twice. It is therefore possible, without loss of clarity, to
omit the sign of summation. In its place we introduce the convention: If an index occurs twice in
one term of an expression, it is always to be summed unless the contrary is expressly stated."
Being essentially an abbreviation, the summation convention may seem like a minor notational
convenience. The truth, however, is that it is surprisingly deep and its remarkable psychological
and aesthetic effect on Tensor Calculus cannot be overstated. Consider the expression
$$\sum_{i=1}^3\sum_{j=1}^3\sum_{k=1}^3 \varepsilon_{ijk}\,u^i v^j w^k,$$
which, as we will later discover, represents the signed volume of a parallelepiped formed by three
vectors. This example shows how, in the absence of the summation convention, an indicial expression
-- or, even more so, a string of indicial identities -- can become dominated by summation signs to
the point that the essential algebraic details are obscured. With the help of the summation
convention, the same expression reads
$$\varepsilon_{ijk}\,u^i v^j w^k.$$
Thus, what is in actuality a sum of $27$ products, i.e.
$$\varepsilon_{111}u^1v^1w^1 + \varepsilon_{112}u^1v^1w^2 + \cdots + \varepsilon_{333}u^3v^3w^3,$$
appears as a single product.
Thus, at first, the summation convention may seem like an oversimplification that is likely to lead
to errors. However, as we will discover shortly, it is algebraically sound to treat sums of
products as simple products in a wide range of circumstances. Thus, the summation convention is not
an oversimplification but, instead, a valid, albeit dramatic, simplification that serves to
bring out the essence of algebraic ideas.
7.8.2 Simultaneous contractions
The summation convention introduces potential ambiguities with respect to the order of contractions
when more than one contraction is present in a single expression. Fortunately, all ambiguities
prove to be immaterial as we are about to demonstrate.
Let us begin with the expression
$$a^i_i + b^i_i,$$
featuring what appears to be a sum of two contractions, i.e.
$$\sum_{i=1}^3 a^i_i + \sum_{i=1}^3 b^i_i.$$
According to this interpretation, $a^i_i + b^i_i$ translates to the sum
$$\left(a^1_1 + a^2_2 + a^3_3\right) + \left(b^1_1 + b^2_2 + b^3_3\right),$$
where the parentheses were strictly for visual grouping. However, an alternative interpretation of
$a^i_i + b^i_i$ is as a single contraction, i.e.
$$\sum_{i=1}^3 \left(a^i_i + b^i_i\right),$$
which equals the sum
$$\left(a^1_1 + b^1_1\right) + \left(a^2_2 + b^2_2\right) + \left(a^3_3 + b^3_3\right),$$
where, once again, the parentheses are present only for convenience. Since the two resulting sums
produce identical results, the two interpretations are equivalent. Note that we will always prefer
the former, more natural, interpretation.
For a second example, consider the expression
$$a^{ij} b_{ij}.$$
This expression can also be interpreted in two different ways depending on the order of summations,
i.e.
$$\sum_{i=1}^3 \left( \sum_{j=1}^3 a^{ij} b_{ij} \right) \quad\text{or}\quad \sum_{j=1}^3 \left( \sum_{i=1}^3 a^{ij} b_{ij} \right).$$
The fact that these expressions are equivalent can be easily confirmed by unpacking, as both
expressions yield the sum of the nine terms $a^{11}b_{11}$, $a^{12}b_{12}$, $a^{13}b_{13}$,
$a^{21}b_{21}$, $a^{22}b_{22}$, $a^{23}b_{23}$, $a^{31}b_{31}$, $a^{32}b_{32}$, and
$a^{33}b_{33}$.
Far more subtle is the deceptively simple expression
$$a^i_i b^j_j.$$
The two contractions in this expression are completely independent of each other and it clearly
does not matter which contraction is performed first. Thus, this expression represents a different
kind of ambiguity, as it is unclear whether the contractions are performed before or after the
multiplication. If the contractions are carried out first, the resulting unpacked expression is
$$\left(a^1_1 + a^2_2 + a^3_3\right)\left(b^1_1 + b^2_2 + b^3_3\right).$$
On the other hand, if the tensor product $a^i_k b^j_l$ is carried out first and the intermediate
expression (a fourth-order system with $81$ elements) is subsequently contracted on $i$ and $k$ as
well as $j$ and $l$, the result is the sum of nine terms, i.e.
$$a^1_1 b^1_1 + a^1_1 b^2_2 + a^1_1 b^3_3 + a^2_2 b^1_1 + a^2_2 b^2_2 + a^2_2 b^3_3 + a^3_3 b^1_1 + a^3_3 b^2_2 + a^3_3 b^3_3.$$
As we can see, the two interpretations are equivalent, thanks to (a non-trivial application of) the
distributive law.
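The equivalence of the two interpretations can be confirmed numerically; a minimal NumPy sketch with made-up values:
```python
import numpy as np

a = np.random.rand(3, 3)  # a^i_j
b = np.random.rand(3, 3)  # b^i_j

# Contract first, then multiply...
first = np.trace(a) * np.trace(b)

# ...or form the fourth-order product and contract afterwards.
second = np.einsum('ii,jj->', a, b)

# The distributive law guarantees that the two agree.
assert np.isclose(first, second)
```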
I hope that these examples give insight into the multitude of issues that Einstein had to give
careful consideration to before proposing his convention. In general, experience shows that
ambiguities that arise as a result of the summation convention almost never lead to contradictions.
We will now discuss some of the minor and easy-to-avoid exceptions to this general rule.
7.8.3 Additional nuances of the summation convention
As we described above, the summation convention makes sums of products appear simply as
products. Case in point is the classical contraction
$$a^i b_i,$$
which looks like a product but is, in fact, a sum of three products, i.e.
$$a^1 b_1 + a^2 b_2 + a^3 b_3.$$
As a result, we must be cognizant of the unpacked form of the expression especially when it appears
as an argument to a nonlinear operator.
For example, a formal application of the power of a product rule to the expression
$\left(a^i b_i\right)^2$, i.e. the square of the quantity $a^i b_i$, yields
$$\left(a^i b_i\right)^2 = a^i a^i b_i b_i,$$
which is clearly false. To see this, simply unpack both sides, i.e.
$$\left(a^1 b_1 + a^2 b_2 + a^3 b_3\right)^2 \quad\text{versus}\quad a^1 a^1 b_1 b_1 + a^2 a^2 b_2 b_2 + a^3 a^3 b_3 b_3,$$
to observe the falsehood. Similarly, the identity
$$\sqrt{a^i b_i} = \sqrt{a^i}\sqrt{b_i}$$
is false.
On the other hand, when a contraction is subject to a linear operation, the formal treatment
of a sum of products as if it were a simple product is valid. For example, imagine that the systems
$a^i$ and $b_i$ are functions of a parameter $t$ and let
$$u(t) = a^i(t)\, b_i(t).$$
Then a formal application of the product rule yields the identity
$$\frac{du}{dt} = \frac{da^i}{dt}\, b_i + a^i\, \frac{db_i}{dt}.$$
Note that this identity is correct, as can be easily verified by unpacking the right side. This is
left as an exercise where you will observe that the key to the correctness of the above identity is
the linear property of the derivative, i.e.
$$\frac{d(f + g)}{dt} = \frac{df}{dt} + \frac{dg}{dt}.$$
We will find that in the overwhelming majority of situations that arise in Tensor Calculus, the
summation convention, which treats sums of products as single products, is safe. When in doubt,
however, we can always use the technique of unpacking in order to preclude mistakes.
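A short numerical sketch of the pitfall, assuming NumPy and illustrative values:
```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])  # a^i
b = np.array([4.0, 5.0, 6.0])  # b_i

# (a^i b_i)^2 squares the entire sum...
lhs = np.einsum('i,i->', a, b)**2

# ...which differs from the term-by-term expression sum_i (a^i)^2 (b_i)^2
# produced by a formal application of the power of a product rule.
rhs = np.sum(a**2 * b**2)

assert not np.isclose(lhs, rhs)
```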
7.8.4 The identical meanings of contract with and multiply by
The phrases contract with and multiply by may be both used to describe the operation
of multiplication followed by contraction. For example, consider the expression
$$a^{ij} x_j.$$
When we say contract $a^{ij}$ with $x_j$, we imply multiplying $a^{ij}$ by $x_k$, i.e.
$$a^{ij} x_k,$$
and subsequently contracting on $j$ and $k$, i.e.
$$a^{ij} x_j.$$
Alternatively, we may say multiply $a^{ij}$ by $x_j$ to describe the exact same operation
since, in the presence of the summation convention, the repeated index signals the subsequent
contraction.
Since the order of multiplicative terms in a tensor product is irrelevant, we need not pay
attention to the order of the terms when an expression is contracted with or multiplied
by another system. For example, the result of contracting the identity
$$a_i = b_i$$
with $u^i$ can also be written as
$$u^i a_i = u^i b_i.$$
In the next Section, we will say a few more words about the order of multiplicative terms and the
apparent conflict with the fact that matrix multiplication is noncommutative.
7.9 The order of the multiplicative terms in a contraction is immaterial
Consider two second-order systems $a^i_{\cdot j}$ and $b^i_{\cdot j}$,
where we have used the dot placeholder to indicate that in both systems the superscript is the
first index and the subscript is the second. As we have noted above, the contraction
$$a^i_{\cdot j} b^j_{\cdot k}$$
corresponds to the product of matrices associated with the two systems. Specifically, if the result
of the above contraction is denoted by $c^i_{\cdot k}$, i.e.
$$c^i_{\cdot k} = a^i_{\cdot j} b^j_{\cdot k},$$
and $A$, $B$, and $C$ are the matrices associated with $a^i_{\cdot j}$, $b^i_{\cdot j}$, and
$c^i_{\cdot k}$, then
$$C = AB.$$
It is assumed that the reader is sufficiently proficient in matrix multiplication to recognize the
clear correspondence between the last two equations. However, despite how clear this correspondence
may be, it raises an interesting question.
As we well know, matrix multiplication is generally not commutative, i.e.
$$AB \neq BA,$$
and, therefore,
$$C \neq BA.$$
Meanwhile, the indicial expressions
$$a^i_{\cdot j} b^j_{\cdot k} \quad\text{and}\quad b^j_{\cdot k} a^i_{\cdot j}$$
are completely equivalent. After all, the symbols $a^i_{\cdot j}$ and $b^j_{\cdot k}$ denote the
individual elements of the respective systems and, being elementary numbers, clearly commute. Thus,
$$a^i_{\cdot j} b^j_{\cdot k} = b^j_{\cdot k} a^i_{\cdot j}.$$
Note that the fact that multiplication is followed by contraction does not change anything. If you
still have any doubt, then unpacking the contractions should remove all doubt. Indeed,
$$a^i_{\cdot 1} b^1_{\cdot k} + a^i_{\cdot 2} b^2_{\cdot k} + a^i_{\cdot 3} b^3_{\cdot k} = b^1_{\cdot k} a^i_{\cdot 1} + b^2_{\cdot k} a^i_{\cdot 2} + b^3_{\cdot k} a^i_{\cdot 3}.$$
Therefore, the identity
$$c^i_{\cdot k} = a^i_{\cdot j} b^j_{\cdot k}$$
can be equivalently rewritten with the multiplicative terms on the right in the opposite order,
i.e.
$$c^i_{\cdot k} = b^j_{\cdot k} a^i_{\cdot j}.$$
So, is there a contradiction between the fact that
$$AB \neq BA$$
and the fact that
$$a^i_{\cdot j} b^j_{\cdot k} = b^j_{\cdot k} a^i_{\cdot j}?$$
Naturally, there is not. The key lies in how the underlying elementary arithmetic operations are
encoded in each notational system. In a matrix product $AB$, the matrices $A$ and $B$ play unequal
roles: the columns of $AB$ are the linear combinations of the columns of $A$ with the coefficients
supplied by the columns of $B$. In the product $a^i_{\cdot j} b^j_{\cdot k}$, on the other hand,
the objects $a^i_{\cdot j}$ and $b^j_{\cdot k}$ represent individual numbers and may therefore
appear to have equal roles. However, as systems, $a^i_{\cdot j}$ and $b^j_{\cdot k}$ do not
participate equally in the contraction since the summation takes place on the second index of
$a^i_{\cdot j}$ and the first index of $b^j_{\cdot k}$. In other words, the underlying elementary
arithmetic operations encoded by the order of terms in a matrix product are achieved by the
interplay among the indices, rather than the order of the multiplicative terms, in an indicial
expression.
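A minimal NumPy sketch of this point (values are illustrative):
```python
import numpy as np

A = np.random.rand(3, 3)
B = np.random.rand(3, 3)

# In the indicial expression, the order of the factors is immaterial:
C1 = np.einsum('ij,jk->ik', A, B)
C2 = np.einsum('jk,ij->ik', B, A)
assert np.allclose(C1, C2)

# The matrix products AB and BA, by contrast, generally differ.
assert not np.allclose(A @ B, B @ A)
```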
Interestingly, the tensor notation proves to be more economical than the language of Matrix Algebra
in terms of the number of primary operations and rules. As we just observed, one need not pay
attention to the order of the multiplicative terms in a product. Furthermore, we will discover that
the concept of the transpose can also be captured by the interplay among the indices in a
contraction. As we have already noted, the trace can also be represented by a contraction.
As a result, the operator set in the tensor notation is surprisingly small as it consists only of
addition, multiplication, and contraction. This aspect of the tensor notation is appealing, but it
would be a mistake to conclude that the tensor notation is "better" than the language of Matrix
Algebra in some absolute sense. Each system has areas where it offers advantages over the other.
Generally speaking, the tensor notation is superior when access to individual elements of systems
is required. The language of Matrix Algebra is superior when the focus is on the operators as a
whole and their algebraic properties. These issues will be explored in greater detail in Chapter 19.
7.10 Invalid tensor expressions
There are several reasons why an indicial expression may be considered invalid. First, it may
include incompatible indicial signatures, as in the sum
$$a_{ij} + b_i.$$
Second, it may be ambiguous, such as the "contraction"
$$a^i b_i c^i d_i.$$
Such an expression may arise as a result of multiplying the expressions
$$a^i b_i \quad\text{and}\quad c^i d_i,$$
while forgetting to change the name of the repeated index in one of them. The proper way to express
the result of multiplying $a^i b_i$ by $c^i d_i$ is, of course,
$$a^i b_i c^j d_j.$$
Similarly, in order to express the expression $a^i b_i$ as a product of $a^i b_i$ with itself, a
new index needs to be introduced, i.e.
$$a^i b_i a^j b_j.$$
Having to rename some of the indices in preparation for combining expressions in a product is very
common in Tensor Calculus.
Third, a combination may be invalid simply because it can never arise in the course of a legitimate
analysis, such as the sum
$$a^i + b^j.$$
On a technical level, this expression can be interpreted as a second-order system consisting of
pairwise sums of the individual elements of $a^i$ and $b^j$.
Nevertheless, this does not change the fact that we will simply never encounter such an expression.
The final reason, which may appear arbitrary at this point but is actually at the very heart of
Tensor Calculus, is that some operations do not preserve the tensor property. Expressions that
violate this rule include the addition of systems with mismatched indicial signatures, such as
$$a^i + b_i.$$
Another example is a forced summation on two like-flavored indices, such as
$$\sum_{i=1}^3 a_{ii}$$
or
$$\sum_{i=1}^3 u_i v_i.$$
The invalidity of the last two expressions may be surprising to some readers since the former
corresponds to the trace of a matrix while the latter corresponds to the dot product with respect
to an orthonormal basis -- two very common and meaningful operations. Nevertheless, these
expressions are indeed invalid in the context of Tensor Calculus which holds expressions to the
higher standard of geometric meaningfulness. As we will discover, operations that do not preserve
the tensor property do not produce geometrically meaningful objects and therefore do not meet this
standard.
This completes our discussion of the basic elements of the tensor notation. However, before we
return to the study of differential objects in Euclidean spaces, it will behoove us to dwell a
little longer on the notation itself. To this end, the next Chapter will describe a few elementary
applications that illustrate the power of the tensor notation. In doing so, the next Chapter will
not only serve to further your familiarity with the notation but will also introduce a number of
important techniques directly relevant to the discussions that follow.
7.11 A brief historical note
The evolution of the tensor notation continued long after the main ideas of the subject had already
been formulated. For example, the use of superscripts to indicate the manner in which objects
transform under a change of coordinates is already present in Gregorio Ricci and Tullio
Levi-Civita's original 1901 paper Méthodes de calcul différentiel absolu et leurs
applications. However, the use of superscripts was somewhat tentative as they were always found in
parentheses -- e.g. in equation (7) on page 151 of the Méthodes -- so as to not confuse them
with exponentiation. Furthermore, the coordinates themselves are notated with subscripts. Even in
1925, in his The Absolute Differential Calculus, Levi-Civita writes "The indices of
contravariance are generally written above, those of covariance below; an exception is however made
for the variables $x$, which are as usual denoted by $x_1$, $x_2$, $\ldots$, $x_n$, with the
indices below..." Albert Einstein, in The Foundation of the Theory of General Relativity,
also enumerates coordinates with subscripts although his use of superscripts is otherwise the same
as ours.
This is only to say that scientific ideas and the languages that describe them tend to be developed
in an interdependent manner where, at times, the ideas determine the language and, at other times,
the language inspires new ideas.
7.12 Exercises
Exercise 7.1 List the twelve elements of the system represented by the symbol $T^i_{\alpha\beta}$.
Exercise 7.2 Capture the sum
in the tensor notation.
Exercise 7.3 For a fourth-order system $A^{ij}_{kl}$, do the contractions $A^{ij}_{ij}$ and
$A^{ij}_{ji}$ yield the same result? In order to answer this question, fully unpack each expression
in a three-dimensional space.
Exercise 7.4 Show by unpacking that the identity $a_{ij} = -a_{ji}$ implies that $a_{ij}$
corresponds to a skew-symmetric matrix.
Exercise 7.5 In an $n$-dimensional space, how many relationships does the equation
represent? Show that from this equation it follows that all elements of the system are zero.
Exercise 7.6 Show by a manipulation of indices that the two expressions are equivalent.
Exercise 7.7 For a system $a_{ij}$ that corresponds to the matrix
calculate the quantities
Exercise 7.8 In a three-dimensional space, i.e. $n = 3$, fully unpack the equation
Exercise 7.9 In a three-dimensional space, fully unpack the expression. For example, give the element corresponding to particular values of the live indices.
Exercise 7.10 For two functions $a^i(t)$ and $b_i(t)$, demonstrate the identity
$$\frac{d\left(a^i b_i\right)}{dt} = \frac{da^i}{dt}\, b_i + a^i\, \frac{db_i}{dt}$$
from Section 7.8.3. Recall that this identity shows that the product rule can be applied to a
contraction of a product as if it were a simple product.
Exercise 7.11 Show that in an $n$-dimensional space, $\delta^i_i = n$.
Exercise 7.12 Show that
Exercise 7.13 Show that
Exercise 7.14 Show that
Exercise 7.15 Show that
Exercise 7.16 Show that in the $n$-dimensional space,
Exercise 7.17 Show that in the $n$-dimensional space,
Exercise 7.18 Show that
Exercise 7.19 Show that the system
$$b_{ij} = a_{ij} + a_{ji}$$
is symmetric, i.e. $b_{ij} = b_{ji}$.
Exercise 7.20 Show that the system
$$c_{ij} = a_{ij} - a_{ji}$$
is skew-symmetric, i.e. $c_{ij} = -c_{ji}$.
Exercise 7.21 If $a_{ij}$ is skew-symmetric, i.e.
$$a_{ij} = -a_{ji},$$
show that
$$a_{ij}\, x^i x^j = 0.$$
Exercise 7.22 Conversely, show that if the latter identity holds for any $x^i$, then $a_{ij}$ is skew-symmetric.
Problem 7.1 Show that a symmetric system $a_{ij}$, i.e.
$$a_{ij} = a_{ji},$$
in an $n$-dimensional space has at most $n(n+1)/2$ distinct elements. The maximal number of distinct elements in a system is known as the number of degrees of freedom. Although it is not central to our narrative, let us generalize this calculation to systems of arbitrary order. Suppose that $a_{ijk}$ is a symmetric third-order system. In other words, any two elements of $a_{ijk}$ related by a permutation of indices are equal. For example, $a_{123} = a_{231}$. Show that $a_{ijk}$ has
$$\frac{n(n+1)(n+2)}{6}$$
degrees of freedom. In general, show that a symmetric system of order $p$ has
$$\binom{n+p-1}{p}$$
degrees of freedom.