The opening sentence in A.J. McConnell's classic Applications of Tensor Analysis perfectly
captures the central role of the tensor notation in our subject: "The notation of the absolute
differential calculus or, as it is frequently called, the tensor calculus, is so much an integral
part of the calculus that once the student has become accustomed to its peculiarities he will have
gone a long way towards solving the difficulties of the theory itself."
To elaborate on McConnell's statement, the tensor notation is capable not only of expressing
the ideas of Tensor Calculus but, in fact, of guiding them, and the degree to which this
influence is observed is unusual, if not unique, in Mathematics. The rules of the tensor notation
make working with expressions in some ways akin to putting together a jigsaw puzzle: one
immediately knows when pieces do not go together and what other pieces to look for. A vast majority
of combinations that would otherwise deserve consideration are eliminated due to
incompatibilities exposed by the notation. Meanwhile, almost any operation that is valid is
worthwhile and is virtually guaranteed to play an important role in some analysis.
Ideally, the presentation of the various elements of the tensor notation would take place at the
same time as the fundamental ideas of the subject itself. However, new concepts in Tensor Calculus
tend to emerge with such rapidity and rely on the tensor notation to such an extent, that it is of
great advantage to be already familiar with the key elements of the notation when studying the
concepts. And even though some elements of the notation may arise out of context and therefore
appear arbitrary, introducing the notation in advance of the core ideas that it is designed to
express is worth it. Also, keep in mind that a number of elements of the tensor notation, as well
as a number of core concepts, will not be fully justified until we introduce the tensor
property in Chapter 14.
7.1 The use of indices
In the coordinate space, geometric objects are represented by what is best described, in
programming jargon, as multi-dimensional arrays. The elements of the array may be either
numbers or geometric vectors. The terms tensor notation and indicial notation are
used interchangeably to describe the use of indices to enumerate such arrays. We will refer
to indexed multi-dimensional arrays as systems, particularly when the focus is solely on the
data and the manner in which it is organized, rather than its relationship to the underlying
geometry. In the future, when the focus shifts to the relationship between the data and the
geometric objects that it represents, we will switch to the term variant to highlight the
dependency of the coordinate representation on the choice of coordinates.
Systems will be denoted by letters followed by a collection of indices -- for instance, $a_i$,
$b^i$, $c_{ij}$, $c^{ij}$, $d^i_{\cdot j}$, $T^{ij}_k$, and $R^i_{\cdot jkl}$. A
distinctive feature of the tensor notation is the use of superscripts alongside
subscripts. We first encountered this feature in the previous Chapter where a superscript
was used to enumerate coordinates. The use of superscripts may be unexpected to anyone encountering
it for the first time since superscripts are usually associated with exponentiation. Be that as it
may, the use of superscripts is the foundation of the tensor notation and is an important
ingredient of its elegance and power. Much of Tensor Calculus is about the interplay between two
opposing tendencies -- captured by the two different types of indices -- working together, in
balance, to preserve the underlying geometry in coordinate space representations. Thus, the
simultaneous use of subscripts and superscripts reflects the very nature of the subject itself.
Whether an index is a superscript or a subscript is referred to as its placement or
flavor. A superscript is also known as a contravariant index and a
subscript as a covariant index. The collection of indices associated with a system is
known as its indicial signature. Systems that feature both superscripts and subscripts are
called mixed. The role of the dot in the indicial signature of the symbol $d^i_{\cdot j}$
will be explained in Section 7.2.
As we already mentioned in Chapter 6 and will study
in great detail later, the placement of indices is a means of indicating the manner in which the
system transforms under a change of coordinates. For example, as we noted in Section 6.6, affine coordinates and the associated coordinate basis
transform by opposite rules. This example
foreshadows the central concept of tensors, i.e. systems that transform according to a set
of two opposite rules known as covariant and contravariant. The covariant rule
describes systems that change in the same manner as the coordinate basis. The contravariant
rule describes systems that change in the opposite manner. Thus, a subscript signals the covariant
nature of the system while a superscript signals the contravariant nature. Some systems, such as
$d^i_{\cdot j}$, transform according to a combination of these rules. Although the precise definitions of
covariant, contravariant, and tensor will not be given until Chapter 14, we will use these terms in describing various
objects in anticipation of a future clarification.
It is a remarkable aspect of the tensor notation that the placement of indices not only
reflects the manner in which a system transforms under a change of coordinates, but can also
predict it. On frequent occasions, we will choose a particular placement for an index not
because we know how the corresponding system transforms under a change of coordinates, but because
the rules of the tensor notation dictate that particular placement. Without exception, every
placement chosen in that fashion will prove to be correct in the sense of accurately predicting the
transformation properties of the system.
The total number of indices associated with a system is known as its order. For example,
$a_i$ is a first-order system with scalar elements, $\mathbf{a}_i$ and $\mathbf{b}^i$ are
first-order systems with vector elements, $c_{ij}$ is a second-order system with scalar elements,
$T^{ij}_k$ is a third-order system with scalar elements, and $R^i_{\cdot jkl}$ is a fourth-order
system with scalar elements. A system of order zero is a single number or vector but, despite its
utter simplicity, systems of order zero are the ultimate goal of every analysis since, as we will
soon discover, only systems of order zero can correspond to meaningful geometric quantities.
An indexed symbol, say $v^i$,
may refer either to the system as a whole or to one of its individual elements. For example, we may
say that $R^i_{\cdot jkl}$ describes curvature, referring to the overall system. Or we may say that
$v^2$ is the second component of the vector $\mathbf{v}$, in which case we are clearly referring to
the individual element. Finally, $v^i$ may refer to the individual elements collectively, in which
case we will use the plural form, e.g. $v^i$ are the components of the vector $\mathbf{v}$ with
respect to the basis $\mathbf{b}_i$.
By convention, the values of Latin indices range from $1$ to the dimension of the space. For
example, in the three-dimensional space, the symbol $v^i$ represents the numbers $v^1$, $v^2$, and
$v^3$. Meanwhile, the symbol $c_{ij}$ represents the nine numbers $c_{11}$, $c_{12}$, $c_{13}$,
$c_{21}$, $c_{22}$, $c_{23}$, $c_{31}$, $c_{32}$, and $c_{33}$. In general, a system of order $p$
represents $3^p$ elements.
The elements of a first-order system, such as $v^i$, can
be naturally arranged into a column or a row matrix. Meanwhile, the elements of a second-order
system, such as $c_{ij}$,
can be naturally arranged into a matrix. From a certain point of view, the correspondence between
second-order systems and matrices is entirely straightforward. On the other hand, the many
differences between the expressive capabilities of the tensor notation and the language of matrices
lead to numerous subtleties that will be addressed throughout our narrative, as well as in Chapter
19 which deals with the interplay between Linear
Algebra, matrices, and Tensor Calculus. Note, however, that Tensor Calculus does not need
the matrix perspective. The entire subject can be presented strictly in the tensor notation without
making a single reference to matrices or matrix algebra. Nevertheless, whenever the aforementioned
subtleties can be either avoided or easily clarified, we will freely rely on the correspondence
between first- and second-order systems and matrices if it helps to elucidate a point or to
facilitate a calculation.
By convention, the value of the first index of a second-order system indicates the
row of the element in the matrix while the second index indicates the column.
Thus, the usual way to arrange the elements of $c_{ij}$ into a matrix is
$$\begin{bmatrix} c_{11} & c_{12} & c_{13}\\ c_{21} & c_{22} & c_{23}\\ c_{31} & c_{32} & c_{33} \end{bmatrix}.$$
As we have already stated, such a correspondence between a second-order system and a matrix is
completely natural. However, in order to stimulate our tensorial way of thinking, we will
always wish to stress the distinction between systems and matrices. We will therefore avoid using
the equals sign between systems and matrices and will instead prefer to write
$$c_{ij} \to \begin{bmatrix} c_{11} & c_{12} & c_{13}\\ c_{21} & c_{22} & c_{23}\\ c_{31} & c_{32} & c_{33} \end{bmatrix}.$$
The elements of a first-order system, such as $v^i$, can
be arranged either into a column or a row matrix. There is no universal convention as to which
arrangement is preferred. However, for the sake of specificity, we will always use the column
arrangement, i.e.
$$v^i \to \begin{bmatrix} v^1\\ v^2\\ v^3 \end{bmatrix}.$$
Finally, there are no set conventions for displaying higher-order systems in a matrix-like manner.
For example, a system $T_{ijk}$, which consists of $27$ numbers, can be imagined either as a
three-dimensional $3\times 3\times 3$ array of numbers, or as three $3\times 3$ matrices, or as a
$3\times 3$ table of column matrices. None of these arrangements are particularly useful and
perhaps it is best to think of $T_{ijk}$ simply as a triple-indexed set of numbers not arranged in
any particular way.
Greek letters are used alongside Latin letters when two spaces of different dimensions are
simultaneously analyzed. This occurs most commonly in the study of curves and surfaces embedded in
a higher-dimensional space. In those situations, Latin indices typically correspond to the higher
dimension of the surrounding space while Greek indices correspond to the dimension of the embedded
object. A single system can have both Latin and Greek indices. The shift tensor $Z^i_\alpha$,
which will feature in a future book, is an example of such a system. For a two-dimensional surface
embedded in a three-dimensional space, the symbol $Z^i_\alpha$ represents the six numbers $Z^1_1$,
$Z^1_2$, $Z^2_1$, $Z^2_2$, $Z^3_1$, and $Z^3_2$. For the same dimensions, a third-order system
$T^i_{\alpha\beta}$ consists of $12$ numbers.
As we have already discussed, we will also consider systems whose elements are geometric vectors.
For example, in the three-dimensional space, the symbol $\mathbf{Z}_i$ represents the three vectors
$\mathbf{Z}_1$, $\mathbf{Z}_2$, and $\mathbf{Z}_3$. Meanwhile, for a two-dimensional surface
embedded in a three-dimensional space, the symbol $\mathbf{S}_\alpha$ represents the pair of
vectors $\mathbf{S}_1$, $\mathbf{S}_2$. In the same scenario, the symbol $\mathbf{T}^i_\alpha$ with
one Latin and one Greek index represents six vectors: $\mathbf{T}^1_1$, $\mathbf{T}^1_2$,
$\mathbf{T}^2_1$, $\mathbf{T}^2_2$, $\mathbf{T}^3_1$, and $\mathbf{T}^3_2$.
The use of indices enables us to distinguish symbols by their indicial signature. As a
result, we can reuse the same letter for closely related objects and rely on the indicial signature
to help us tell them apart. Ultimately, this practice leads to greater clarity and economy of
letters. The symbols $a_i$, $a^i$, $a_{ij}$, and $a^{ij}$ all denote distinct objects with no
possibility of confusion. Furthermore, two different systems, one consisting of scalars and the
other of geometric vectors, can be denoted by the same letter with the same indicial signature,
such as $a_i$ and $\mathbf{a}_i$, since they are still distinguishable by the weight of the font.
The reader should embrace the indicial nature of the tensor notation. Indices make tensor
expressions intuitive, dynamic, and beautiful. You may one day come across a frequently
misinterpreted quote by Élie Cartan, the inventor of Differential Forms, in which he advised to
"as far as possible avoid very formal computations where an orgy of tensor indices hides a
geometric picture which is often very simple". Cartan was referring to a particular application
related to Differential Forms where an alternative non-indicial framework offers a number of
advantages. Had Cartan known that his statement would be used as an argument against the tensor
notation, he would have certainly rephrased it. Indeed, Cartan's masterpiece Riemannian Geometry
in an Orthogonal Frame, from which the quote is taken, is filled with beautiful tensor
calculations in the indicial notation.
7.2 The order of the indices
For systems of order greater than one, the order of the indices needs to be clear. In other words,
we must indicate which index is first, second, and so on. In a system denoted by the symbol
$c_{ij}$, the order of the indices is clear: $i$ is first and $j$ is second. However, in a mixed
system, such as $c^i_j$, it is unclear whether $i$ is first and $j$ is second or the other way
around. For higher-order systems, the ambiguity is even greater. For a system such as
$A^{ij}_{kl}$, it is clear that $i$ comes before $j$ and $k$ comes before $l$, but that still
leaves six possible orderings.
Agreeing on the order of the indices is important for two reasons. First, it is clearly necessary
when representing systems by matrices. The more compelling reason, however, has to do with the
essential practice of index juggling which will be introduced in Chapter 11. Without prematurely getting into the details, we
note that under index juggling, any index can change its flavor: a subscript can become a
superscript and vice versa. As a result, the symbol $c^i_j$
becomes ambiguous, as it can either denote the "original" system or the new one where the
superscript became a subscript and the subscript became a superscript. Fortunately, this ambiguity
can be avoided by clearly establishing the order of the indices which can be done with the help of
notation.
In order to indicate the order of the indices, each index is allocated its own vertical space. If
the index is a subscript, the superscript slot will remain open, and if the index is a superscript,
the subscript slot will remain open. For example, if the intended order of the indices in
$T^{jl}_{ik}$ is $i$, $j$, $k$, $l$, then the symbol may be written as
$$T_{i\phantom{j}k\phantom{l}}^{\phantom{i}j\phantom{k}l}.$$
The only shortcoming of this notation is that blank spaces do not have reliable widths, especially
in print. It is therefore better to let a placeholder, such as a dot "$\cdot$", occupy the empty
spot in the space assigned to a particular index. With the help of such a placeholder, the above
symbol will appear in the form
$$T_{i\cdot k\cdot}^{\cdot j\cdot l}.$$
Then, if, say, the subscript $k$ were to be converted into a superscript by index juggling, the new
symbol would become
$$T_{i\cdot\cdot\cdot}^{\cdot jkl},$$
where the order of the indices remains clear. Incidentally, note that the last dot -- e.g. the
third dot on the bottom in the indicial signature of $T_{i\cdot\cdot\cdot}^{\cdot jkl}$ -- can
always be safely omitted, yielding the symbol $T_{i\cdot\cdot}^{\cdot jkl}$, since it remains clear
that $l$ is the fourth index. Similarly, in the symbol $T_{\cdot\cdot kl}^{ij\cdot\cdot}$, the last
two dots in the top row of the indicial signature can be omitted, yielding
$T_{\cdot\cdot kl}^{ij}$. The most prominent symbol that commonly features the dot placeholder is
the Riemann-Christoffel tensor $R^i_{\cdot jkl}$, which will first appear in Chapter 15 and
subsequently take center stage in Chapter 20.
7.3 The Kronecker delta
We will now introduce the ubiquitous Kronecker delta symbol $\delta^i_j$ -- or the Kronecker delta,
for short. The Kronecker delta is denoted by the symbol $\delta^i_j$ with one contravariant and one
covariant index. Ultimately, this placement of indices will not only make sense, but will seem
inevitable. In particular, the symbol $\delta_{ij}$ will never be used.
By definition, the elements of the Kronecker delta equal $1$ when the two indices have the same
value and $0$ otherwise, i.e.
$$\delta^i_j = \begin{cases} 1, & \text{if } i = j,\\ 0, & \text{if } i \neq j. \end{cases}$$
When the elements of $\delta^i_j$ are organized into a matrix, the result is, of course, the
identity matrix. For example, in three dimensions,
$$\delta^i_j \to \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}.$$
The Kronecker delta is a mixed system, i.e. it features both subscripts and superscripts.
Therefore, as we discussed in the previous Section, we ought to be precise with regard to the order
of its indices. In the case of the Kronecker delta, however, the resulting matrix is symmetric and
thus the order of the indices does not matter.
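As an aside not found in the original text, here is a minimal numerical sketch in Python with NumPy (the library choice and the zero-based indices are assumptions of the illustration):
```python
import numpy as np

# The Kronecker delta in three dimensions: delta[i, j] = 1 if i == j else 0.
# (NumPy indices run 0..2, while the text's indices run 1..3.)
delta = np.eye(3, dtype=int)
print(delta)

# The symmetry of the resulting matrix is why the order of the
# Kronecker delta's indices does not matter.
assert (delta == delta.T).all()
```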
The Kronecker delta has many uses in addition to those that arise from its interpretation as the
identity matrix. We will now describe one such application that appears frequently in various
analyses, including the quadratic form minimization discussed in the next Chapter. It centers
around the simple equation
$$\frac{\partial Z^i}{\partial Z^j} = \delta^i_j,$$
whose meaning we are about to describe.
Consider functions $F$, $G$, and $H$ of three variables $Z^1$, $Z^2$, and $Z^3$. While we could
consider quite complicated functions, we will instead consider functions of utmost simplicity --
namely,
$$F(Z^1,Z^2,Z^3) = Z^1,\qquad G(Z^1,Z^2,Z^3) = Z^2,\qquad H(Z^1,Z^2,Z^3) = Z^3.$$
For the sake of greater organization, we will switch from the letters $F$, $G$, and $H$ to the
symbols $F^1$, $F^2$, and $F^3$, i.e.
$$F^i(Z^1,Z^2,Z^3) = Z^i.$$
Let us evaluate the partial derivatives of each of these functions with respect to each of the
independent variables. We find
$$\frac{\partial F^1}{\partial Z^1} = 1,\quad \frac{\partial F^1}{\partial Z^2} = 0,\quad \frac{\partial F^1}{\partial Z^3} = 0,\quad \frac{\partial F^2}{\partial Z^1} = 0,\quad \frac{\partial F^2}{\partial Z^2} = 1,\quad \frac{\partial F^2}{\partial Z^3} = 0,\quad \frac{\partial F^3}{\partial Z^1} = 0,\quad \frac{\partial F^3}{\partial Z^2} = 0,\quad \frac{\partial F^3}{\partial Z^3} = 1.$$
With the help of the Kronecker delta, these nine equations can be captured by a single identity,
i.e.
$$\frac{\partial F^i}{\partial Z^j} = \delta^i_j.$$
Take a moment to interpret this identity for various values of $i$ and $j$ to confirm that each of
the nine preceding equations is perfectly represented.
When the symbol $F^i$ is replaced with the actual function that it represents, i.e. $Z^i$, we
arrive at the identity that we set out to derive, i.e.
$$\frac{\partial Z^i}{\partial Z^j} = \delta^i_j.$$
In summary, this equation makes sense if the symbol $Z^i$ in the "numerator" is interpreted as a
function (of the three variables $Z^1$, $Z^2$, and $Z^3$) while the symbol $Z^j$ in the
"denominator" is interpreted as one of the independent variables.
Observe the correspondence in placement between the indices on both sides of the equation
$$\frac{\partial Z^i}{\partial Z^j} = \delta^i_j.$$
The index $i$ is a superscript on both sides. Meanwhile, the index $j$ is a superscript on the left
but a subscript on the right. However, on the left, it appears in the "denominator" and, therefore,
can be thought of as being a lower index, i.e. a subscript. This is obviously not a rule
based in logic, but rather a notational device for remembering how to place indices that emerge as
a result of differentiation. This choice of placement will be vindicated in Chapter 14 where we
will show that objects that emerge as a result of partial differentiation actually transform by the
rule predicted by the placement of the index as a subscript.
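For readers who wish to confirm the nine derivatives mechanically, a small symbolic sketch in Python with SymPy (an illustrative assumption; the symbols Z1, Z2, Z3 stand in for the coordinates $Z^1$, $Z^2$, $Z^3$):
```python
import sympy as sp

# Independent variables standing in for the coordinates Z^1, Z^2, Z^3.
Z = sp.symbols('Z1 Z2 Z3')

# The table of partial derivatives dZ^i/dZ^j is the identity matrix,
# i.e. the Kronecker delta.
table = sp.Matrix(3, 3, lambda i, j: sp.diff(Z[i], Z[j]))
assert table == sp.eye(3)
```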
7.4 The technique of unpacking
The role of indices is simple and straightforward: to enumerate objects. Therefore, which specific
letter (from an appropriate alphabet) represents the index is unimportant and is purely a matter of
preference. For example, consider the symbol $a_i$ which represents the numbers $a_1$, $a_2$, and
$a_3$. The very same set of numbers could also be represented by the symbols $a_j$, $a_k$, $a_p$,
$a_q$, and so on. Thus, in order to master the tensor notation, one must desensitize oneself to the
particular letter choice in an expression and learn to see the actual object that is being
represented. For example, for the symbols $a_i$, $a_j$, $a_k$, $a_p$, $a_q$, and so on, one must
imagine the three numbers $a_1$, $a_2$, and $a_3$ that each of these symbols represents. This
practice can be described as unpacking the expression. If two indicial expressions are
equivalent in the unpacked form, then they are also equivalent in the indicial form.
The freedom to choose any combination of letters for indices allows a great deal of flexibility in
the use of indicial expressions. For example, an object may be introduced as $a_i$ in one line and
then appear as $a_j$ in the very next. A second-order system $c_{ij}$ may appear as $c_{ji}$ in
another expression and the fact that the two indices are switched does not signal that the system
has been transposed in some matrix sense (unless $c_{ij}$ and $c_{ji}$ appear in the same
expression -- more on that in a moment). By the same token, the equations
$$a_i = b_i \tag{7.23}$$
and
$$a_j = b_j \tag{7.23}$$
are completely equivalent. As a result of this equivalence, the two equations share the same
equation number (7.23). Throughout our narrative, we will adhere to the practice of assigning the
same equation number to equivalent, albeit differently indexed, equations -- that is, unless the
indices were specifically renamed for the purpose of the analysis.
A moment ago, we mentioned that the symbols $c_{ij}$ and $c_{ji}$ represent the same system rather
than two different systems related by the matrix transpose. This observation is at the root of the
many subtleties that we have alluded to earlier related to the correspondence between systems and
matrices. In order to illustrate the equivalence of the symbols $c_{ij}$ and $c_{ji}$, let us, once
again, resort to unpacking. Suppose $c_{ij}$ corresponds to the matrix
$$\begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix}.$$
For each symbol, $c_{ij}$ and $c_{ji}$, consider, for example, the element that corresponds to the
first index value of $1$ and the second index value of $2$. In both cases, we obtain the same value
of $2$. Of course, it was key that we referred to the indices not by their names but by their order
in the symbol. This is the proper interpretation intended by the tensor notation. And, according to
this intended interpretation, we conclude that the symbols $c_{ij}$ and $c_{ji}$ represent the same
system.
Had we instead considered the element that corresponds to $i = 1$ and $j = 2$, we would have
recovered the value of $2$ for $c_{ij}$ and the value of $4$ for $c_{ji}$. As a result, we would
have incorrectly concluded that $c_{ij}$ and $c_{ji}$ represent distinct systems (related by the
matrix transpose). The error, of course, lies in referring to indices by their letter names which
implies a linkage between the symbols $c_{ij}$ and $c_{ji}$ that does not exist -- unless both
symbols are engaged in a single equation.
To illustrate that possibility, consider the identity
$$c_{ij} = c_{ji}.$$
In this situation, the values of the indices cannot be assigned independently for $c_{ij}$ and
$c_{ji}$. In coordinating the values of indices within an equation, we must refer to indices by
their names. If, for example, we let $i = 1$ and $j = 2$, then these values must be applied to all
symbols in the identity consistently. For these particular values, the identity tells us that
$$c_{12} = c_{21}.$$
Thus, the interplay between like-named indices in the identity tells us that the system represented
by $c_{ij}$ is symmetric in the matrix sense. Such a system may, for example, correspond to the
symmetric matrix
$$\begin{bmatrix} 1 & 4 & 5\\ 4 & 2 & 6\\ 5 & 6 & 3 \end{bmatrix}.$$
Similarly, the identity
$$b_{ij} = a_{ji},$$
relating two second-order systems $a_{ij}$ and $b_{ij}$, implies that the matrices representing
these systems are the transposes of each other. For example, if $a_{ij}$ corresponds to the matrix
$$\begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{bmatrix},$$
then $b_{ij}$ corresponds to
$$\begin{bmatrix} 1 & 4 & 7\\ 2 & 5 & 8\\ 3 & 6 & 9 \end{bmatrix}.$$
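The same unpacking argument can be checked numerically; a minimal sketch in Python with NumPy, where the particular matrix is made up for the illustration:
```python
import numpy as np

# A made-up matrix for the system a_ij.
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# The identity b_ij = a_ji means that the matrix of b is the transpose
# of the matrix of a.
b = a.T
assert all(b[i, j] == a[j, i] for i in range(3) for j in range(3))
```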
Sophisticated indicial expressions can sometimes overwhelm our algebraic intuition. Thus, it is
very important to take one's time in the early stages of learning the tensor notation and to make
sure that every relationship is thoroughly understood before moving on. The technique of unpacking
described in this Section can go a long way in assisting the reader in this endeavor.
7.5 Linear combinations of systems
One of the strengths of the indicial notation is its highly restricted set of operations. One of
those operations is the addition of systems with identical sets of like-named contravariant and
covariant indices. For example, the sum of two first-order systems $a_i$ and $b_i$ is given by
$$c_i = a_i + b_i.$$
When unpacked, we find that this identity represents three equations, i.e.
$$c_1 = a_1 + b_1,\qquad c_2 = a_2 + b_2,\qquad c_3 = a_3 + b_3.$$
In other words, as expected, systems are added in element-wise fashion.
The same is true for higher-order systems. For example, two second-order systems $a_{ij}$ and
$b_{ij}$ can be added to produce another second-order system, i.e.
$$c_{ij} = a_{ij} + b_{ij}.$$
The indices need not appear in the same order in each term. We could also evaluate the sum
$$c_{ij} = a_{ij} + b_{ji},$$
where the order of the subscripts in the second system is switched. If we associate $a_{ij}$,
$b_{ij}$, and $c_{ij}$ with matrices $A$, $B$, and $C$, then we may interpret the first
equation as $C = A + B$ and the second as $C = A + B^T$.
Such correspondence between indicial and matrix expressions will be discussed in detail in Chapter
19 but, once again, keep in mind that for the purposes of Tensor Calculus, it is not necessary to
translate indicial expressions into matrix terms.
Along with addition, systems are also subject to multiplication by numbers and vectors. Using
second-order systems as an example, a system $a_{ij}$ with scalar elements can be multiplied by a
number $k$, resulting in the system $k\,a_{ij}$, where each element of $a_{ij}$ is multiplied by
$k$. Similarly, $a_{ij}$ can be multiplied by a vector $\mathbf{U}$, resulting in
$\mathbf{U}a_{ij}$, where each element of $a_{ij}$ is multiplied by $\mathbf{U}$. Similarly, a
system $\mathbf{a}_{ij}$ with vector elements can be multiplied by a scalar $k$ or dot-multiplied
by a vector $\mathbf{U}$, resulting in the systems $k\,\mathbf{a}_{ij}$ and
$\mathbf{U}\cdot\mathbf{a}_{ij}$.
The term linear combination refers to expressions that combine multiplication by numbers and
addition, e.g.
$$k\,a_{ij} + l\,b_{ij}$$
or
$$k\,\mathbf{a}_{ij} + l\,\mathbf{b}_{ij}.$$
In summary, the operations of addition and multiplication by individual numbers and vectors are
defined exactly as we would expect by analogy with matrices. We will now turn our attention to the
more interesting tensor multiplication of systems.
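A brief numerical sketch of these element-wise operations, assuming NumPy and arbitrary illustrative values:
```python
import numpy as np

a = np.arange(9.0).reshape(3, 3)      # a_ij
b = np.arange(9.0).reshape(3, 3)**2   # b_ij

c1 = a + b       # c_ij = a_ij + b_ij: element-wise addition
c2 = a + b.T     # c_ij = a_ij + b_ji: second term "transposed"
c3 = 2*a - 3*b   # a linear combination of the two systems
```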
7.6 Multiplication of systems
The result of multiplying two systems is precisely what the notation suggests. For example,
$$a^i_j b^k_l,$$
i.e. the result of multiplying $a^i_j$ and $b^k_l$, is a fourth-order system that represents all
possible pairwise products of elements from $a^i_j$ and $b^k_l$. This operation is known as
tensor multiplication or the tensor product.
The indicial signature of a tensor product is the combination of the signatures of each factor. For
example, the above product could be denoted by $c^{ik}_{jl}$, i.e.
$$c^{ik}_{jl} = a^i_j b^k_l.$$
Once again stepping into the realm of the subtleties of the tensor notation, the same product could
also be denoted by the symbol $c^{ki}_{lj}$ featuring a different order of indices. The choice of
the order of indices in the resulting system is entirely up to the analyst.
Tensor multiplication is commutative, i.e.
$$a^i_j b^k_l = b^k_l a^i_j.$$
This is obviously so since the symbols $a^i_j$ and $b^k_l$ represent the individual elements
of each system. Being plain numbers, the individual elements commute according to the rules of
elementary arithmetic. This trivial fact is worth pointing out since we will soon discover that
tensor multiplication, in combination with (the about-to-be-defined) contraction, is capable
of expressing matrix multiplication which is famously noncommutative.
Finally, we note that tensor multiplication rarely takes place without actually being followed up
by contraction, so let us now turn our attention to that operation.
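NumPy's einsum function happens to mirror the indicial notation closely and makes a convenient playground; a hedged sketch of the tensor product (shapes and values are illustrative assumptions):
```python
import numpy as np

a = np.random.rand(3, 3)  # a^i_j
b = np.random.rand(3, 3)  # b^k_l

# The tensor product c^{ik}_{jl} = a^i_j b^k_l: all pairwise products.
c = np.einsum('ij,kl->ikjl', a, b)
assert c.shape == (3, 3, 3, 3)

# Commutativity: the factors may be listed in either order.
assert np.allclose(c, np.einsum('kl,ij->ikjl', b, a))
```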
7.7 Contraction
For a system of order greater than or equal to $2$, contraction refers to a summation over a
pair of indices which leads to the reduction of the order of the system by $2$. For example,
contracting the third-order system $T^{ij}_k$ on $j$ and $k$ yields a first-order system $u^i$
given by
$$u^i = T^{i1}_1 + T^{i2}_2 + T^{i3}_3.$$
Thus, contraction can take place only on indices that change over identical ranges. In other words,
a system $Z^i_\alpha$, where $i$ changes from $1$ to $3$ and $\alpha$ changes from $1$ to $2$,
cannot be contracted.
Furthermore, it is stipulated that the contracted indices must be of opposite flavors, i.e.
one of the indices must be a superscript and the other a subscript. This requirement is the
hallmark of a valid contraction and is the key to the guiding ability of the tensor notation. In
fact, it will guide us extensively over the next few chapters in introducing new objects and
establishing key relationships. However, the logical justification for this requirement based on
the concept of a tensor will not be given until Chapter 14. Fortunately, we can wait this long precisely because the rules of the
tensor notation assure the correctness of one's analysis even when one is not aware of their
logical underpinning.
Before we give several generic examples of valid contractions, let us review a few instances of
summations that, although they violate the opposite-flavors requirement, nevertheless demonstrate
that summations over pairs of indices play an omnipresent role in investigations related to
Geometry. The reason why these examples violate the opposite-flavors requirement lies simply in the
fact that we have not yet adjusted the placements of the indices so as to reflect the true nature
of the corresponding objects. Once that adjustment is made, these summations will become valid
contractions.
For the first example, note that the classical decomposition of a vector with respect to a basis,
i.e.
$$\mathbf{v} = v_1\mathbf{b}_1 + v_2\mathbf{b}_2 + v_3\mathbf{b}_3 = \sum_{i=1}^3 v_i\mathbf{b}_i,$$
is, in fact, a summation of the product
$$v_i\mathbf{b}_i$$
over the pair of the available indices resulting in a system of order zero $\mathbf{v}$. Secondly,
the dot product of two vectors represented by the double sum
$$\mathbf{u}\cdot\mathbf{v} = \sum_{i=1}^3\sum_{j=1}^3 u_i v_j\,\mathbf{b}_i\cdot\mathbf{b}_j$$
is a series of two summations of the same kind. It can also be referred to as a double summation
since the two summations can be performed in any order and, in fact, simultaneously, which is why
the form
$$\mathbf{u}\cdot\mathbf{v} = \sum_{i,j=1}^3 u_i v_j\,\mathbf{b}_i\cdot\mathbf{b}_j$$
with a single summation sign is often used. We may describe this expression as a double (albeit,
invalid) contraction of the fourth-order system $u_i v_j\,\mathbf{b}_k\cdot\mathbf{b}_l$ on $i$ and
$k$ as well as $j$ and $l$. When the basis $\mathbf{b}_i$ is orthonormal, the double contraction
becomes a single (still invalid) contraction of the system $u_i v_j$ on $i$ and $j$, i.e.
$$\mathbf{u}\cdot\mathbf{v} = \sum_{i=1}^3 u_i v_i.$$
Finally, the trace of a matrix, i.e. the sum of its diagonal elements, is another example of a
summation over a pair of indices. For example, the trace of the matrix corresponding to the system
$a_{ij}$ is the number
$$\sum_{i=1}^3 a_{ii}.$$
As you can see from these examples, summations over pairs of indices reduce entire sets of
elements to a single value.
Let us now give a number of generic examples of valid contractions. Let us stay in three
dimensions where each index assumes the values $1$, $2$, and $3$. Then, for a mixed second-order
system $a^i_j$, the only possible contraction is the sum of the elements that correspond to
matching values of its two indices, i.e.
$$a^1_1 + a^2_2 + a^3_3.$$
Of course, if $a^i_j$ is thought of as a matrix, then the contraction corresponds to its trace.
For a higher-order system, a contraction takes place over a particular superscript-subscript pair
and is performed for every combination of the remaining indices. For example, contracting a
third-order system $T^{ij}_k$ on $j$ and $k$ leads to the sum
$$T^{i1}_1 + T^{i2}_2 + T^{i3}_3$$
for each value of $i$. The resulting values form a first-order system $u^i$, i.e.
$$u^i = T^{i1}_1 + T^{i2}_2 + T^{i3}_3.$$
For a fourth-order example, contracting a system $A^{ij}_{kl}$ on $j$ and $k$ yields the sums
$$A^{i1}_{1l} + A^{i2}_{2l} + A^{i3}_{3l},$$
which can be organized into a second-order system $B^i_l$, i.e.
$$B^i_l = A^{i1}_{1l} + A^{i2}_{2l} + A^{i3}_{3l}.$$
Each of these examples illustrates that contractions indeed reduce the order of the system by $2$.
Although it will be shortly supplanted by Einstein's famous summation convention, the
summation sign can be effectively used to represent contractions. For example, the equation
$$u^i = T^{i1}_1 + T^{i2}_2 + T^{i3}_3$$
can be written more compactly in the form
$$u^i = \sum_{j=1}^n T^{ij}_j,$$
where we used $n$ as the upper limit for the sake of greater generality.
Also, the choice to use the letter $j$ for both indices is arbitrary. We could just as well use
the letter $k$, i.e.
$$u^i = \sum_{k=1}^n T^{ik}_k,$$
or any other letter for that matter -- other than $i$, of course. Regardless, a contraction
always involves one letter appearing twice, once as a superscript and once as a subscript. The
letter that represents the contracted indices and therefore appears twice is known as the
repeated index.
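The same contractions can be spelled out with einsum, whose subscript strings follow the repeated-index idea directly; a minimal sketch with made-up values:
```python
import numpy as np

a = np.random.rand(3, 3)     # a^i_j
T = np.random.rand(3, 3, 3)  # T^{ij}_k, axes in the order i, j, k

# Contracting a^i_j on its two indices: the trace of the matrix.
assert np.isclose(np.einsum('ii->', a), np.trace(a))

# Contracting T^{ij}_k on j and k yields the first-order system u^i.
u = np.einsum('ijj->i', T)
assert np.allclose(u, T[:, 0, 0] + T[:, 1, 1] + T[:, 2, 2])
```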
7.7.1 Contraction as a guide towards geometrically meaningful objects
The opposite-flavors requirement for a valid contraction is the cornerstone of the tensor
notation. The fundamental reason for this requirement, which has to do with the tensor property,
will have to wait until Chapter 14. On the other
hand, the profound impact of this feature on our analytical framework will become apparent very
quickly.
Recall that systems of order zero represent the ultimate goal of every analysis. Therefore,
contraction, being the only operation that reduces the order of a system, must play a
crucial role in producing systems of order zero. Furthermore, the opposite-flavors rule implies
that every analysis must perfectly balance the number of covariant and contravariant
indices. Imagine, for example, that we encountered the combination $a_{ij}$ in
the course of our analysis. Then, in order to eventually arrive at a geometrically meaningful
result, the analysis must also include an additional ingredient -- such as $b^{ij}$ -- with
superscripts. This
insight demonstrates that the opposite-flavors restriction greatly limits the totality of feasible
combinations. Like all constructive constraints, it serves to sharpen our logic and to guide our
explorations. While in other creative arenas, artists impose constraints upon themselves, Tensor
Calculus and its notational system provide us with one from the very get-go.
We would also be quite remiss not to point out one more remarkable aspect of the guiding nature of
the tensor notation. Experience shows that not only does the tensor notation limit the totality of
feasible combinations, but also every combination that is feasible is meaningful. We
will see this phenomenon time and again throughout our narrative. For example, consider the
Riemann-Christoffel tensor $R^i_{\cdot jkl}$, which plays an important role in Riemannian spaces.
It has a mismatched number of superscripts and subscripts and therefore cannot validly produce a
system of order zero. In order to balance the indices, let us use the aforementioned operation of
index juggling to convert the first subscript into a superscript in order to produce the new system
$R^{ij}_{\cdot\cdot kl}$. Finally, let us contract the resulting system on $i$ and $k$, i.e.
calculate the sum
$$R^{1j}_{\cdot\cdot 1l} + R^{2j}_{\cdot\cdot 2l} + R^{3j}_{\cdot\cdot 3l}$$
for each $j$ and $l$, and subsequently contract the result on $j$ and $l$ to produce a system of
order zero $R$, i.e.
$$R = \sum_{i}\sum_{j} R^{ij}_{\cdot\cdot ij}.$$
We have never stated our rationale for these operations and perhaps we performed them simply
because we could. Well, it turns out that the resulting quantity $R$, known as the scalar
curvature, plays an important role in General Relativity.
If the foregoing discussion makes it seem that the tensor notation does all the work for us, rest
assured that this is not the case. Tensor Calculus helps us express our ideas, and occasionally to
guide them, but does not tell us which ideas to explore. Albert Einstein, an avid practitioner of
Tensor Calculus, wrote to a friend while developing General Relativity: "I have been laboring
inhumanly. I am quite overworked." Perhaps in the absence of Tensor Calculus, he would have
given up.
7.7.2 Repeated indices
As we have already mentioned, a contraction expressed with the help of the summation symbol,
e.g.
$$\sum_{j=1}^n T^{ij}_j,$$
always features a pair of indices denoted by the same letter. It is called the repeated
index, where the word index refers to the repeated letter.
The alternative terms are dummy index and, less commonly, summation index or
contraction index. The word dummy makes sense because the repeated letter can
be replaced with any other (that is not used for some other purpose) without altering the meaning
of the expression. For example, when the letter $j$ is replaced, say, with $k$, the summation
becomes
$$\sum_{k=1}^n T^{ik}_k.$$
The equivalence of the two expressions can be seen by unpacking, i.e.
$$\sum_{j=1}^n T^{ij}_j = T^{i1}_1 + T^{i2}_2 + \cdots + T^{in}_n = \sum_{k=1}^n T^{ik}_k.$$
7.7.3 Free indices
The remaining indices, i.e. those not participating in a contraction, are called free or
live indices. In a tensor expression, all free indices must be in perfect correspondence
among all constituent parts of the expression. For example, consider the identity
$$T^i_{jk} = \sum_{m=1}^3 a^i_m b^m_{jk}.$$
The indices $i$, $j$, and $k$ are free and must therefore be present, with a consistent placement,
in each term.
Free indices enumerate the elements of the resulting system. For example, $T^i_{jk}$ is a
third-order system enumerated by $i$, $j$, and $k$. In a tensor equation, free indices enumerate
the individual identities. For example, the equation above represents $27$ independent identities.
One of those identities, i.e. the one corresponding to $i = 1$, $j = 2$, and $k = 3$, reads
$$T^1_{23} = \sum_{m=1}^3 a^1_m b^m_{23}.$$
When fully unpacked, this identity becomes
$$T^1_{23} = a^1_1 b^1_{23} + a^1_2 b^2_{23} + a^1_3 b^3_{23}.$$
This example illustrates the difference between how repeated and free indices are unpacked. When a
repeated index is unpacked, a contraction is expanded into a literal sum. When a
free index is unpacked, a single expression (or equation) is replaced with a collection of
expressions (or equations).
Much like dummy indices, free indices can be renamed as long as the name change is consistent
across all terms. For example, in the equation
$$T^i_{jk} = \sum_{m=1}^3 a^i_m b^m_{jk},$$
the index $i$ can be replaced with $r$, i.e.
$$T^r_{jk} = \sum_{m=1}^3 a^r_m b^m_{jk}.$$
In a more subtle maneuver, the letters $j$ and $k$ can trade places, i.e.
$$T^i_{kj} = \sum_{m=1}^3 a^i_m b^m_{kj}.$$
This, too, leaves the meaning of the equation unchanged, as can be confirmed by unpacking.
Note that since the last three equations are equivalent, we would ordinarily assign to them the
same equation number. However, the purpose of these particular equations was specifically to call
attention to the different combinations of letters used for indices and thus assigning them
distinct equation numbers makes more sense.
7.7.4 Contraction in combination with multiplication
Contraction often occurs in combination with multiplication. In The Absolute Differential
Calculus, Levi-Civita refers to this combined operation as composition or inner
multiplication of tensors.
As a matter of fact, multiplication is almost always followed by contraction. For example, consider
two first-order systems $a^i$ and $b_i$. Their product
$$a^i b_j$$
is a mixed second-order system and can therefore be contracted to produce the number
$$\sum_{i=1}^n a^i b_i.$$
You can probably already guess that this operation is in some way related to the dot product. This
is indeed so and the dot product is, in fact, one of the underlying reasons for the existence of
the operation of contraction.
For another important example, consider two second-order systems $a^i_j$ and $b^i_j$. Their product
$$a^i_j b^k_l$$
is a fourth-order system. Let us contract this product on the covariant index of $a^i_j$ and the
contravariant index of $b^k_l$, i.e.
$$\sum_{j=1}^n a^i_j b^j_l.$$
(This is the first example of the flexible renaming of indices we mentioned earlier: we introduced
a system as $b^i_j$ and immediately switched to $b^k_l$ in the very next line. Despite the distinct
indicial signatures, $b^i_j$ and $b^k_l$ refer to the very same system.) This contraction is, of
course, related to matrix multiplication. If the elements of $a^i_j$ are organized into a matrix
$A$ and those of $b^j_l$ are organized into a matrix $B$, then the above contraction may be
interpreted as the product $AB$, provided that for both systems the contravariant index is
considered first and the covariant second.
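A quick numerical confirmation of this correspondence, sketched with NumPy (illustrative values only):
```python
import numpy as np

A = np.random.rand(3, 3)  # matrix of a^i_j: superscript first, subscript second
B = np.random.rand(3, 3)  # matrix of b^j_l

# The contraction a^i_j b^j_l corresponds to the matrix product AB.
C = np.einsum('ij,jl->il', A, B)
assert np.allclose(C, A @ B)
```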
7.7.5 The Kronecker delta in a contraction
Recall that, in matrix terms, the Kronecker delta $\delta^i_j$
corresponds to the identity matrix. Therefore, we might expect that the effect of contracting it
with another system is to leave that system unchanged. This is indeed the case.
Consider a first-order system $u^i$ and contract it with the Kronecker delta $\delta^j_i$.
The result is the expression
$$\sum_{i=1}^3 \delta^j_i u^i$$
with a dummy index $i$ and a free index $j$. Let us analyze this expression by unpacking it, i.e.
$$\sum_{i=1}^3 \delta^j_i u^i = \delta^j_1 u^1 + \delta^j_2 u^2 + \delta^j_3 u^3.$$
Observe that for each value of $j$, precisely one of the three terms on the right survives thanks
to the special values of the Kronecker delta. For $j = 1$ the result is $u^1$, for $j = 2$ the
result is $u^2$, and for $j = 3$ the result is $u^3$. In other words, for each $j$ the result is
$u^j$, i.e.
$$\sum_{i=1}^3 \delta^j_i u^i = u^j.$$
This identity confirms that contraction with the Kronecker delta leaves the system unchanged, since
$u^i$ and $u^j$ represent the same object. In terms of index manipulation, the Kronecker delta may
be thought of as an index renamer: when contracted with $u^i$, the Kronecker delta
$\delta^j_i$ simply replaces the letter $i$ with $j$.
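A small numerical sketch of the index-renaming effect, again assuming NumPy and illustrative values:
```python
import numpy as np

delta = np.eye(3)
u = np.array([10.0, 20.0, 30.0])  # u^i

# delta^j_i u^i leaves the system unchanged, merely renaming i to j.
v = np.einsum('ji,i->j', delta, u)
assert np.allclose(v, u)
```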
7.8 The summation convention
7.8.1 A description of the convention
The summation convention, also known as Einstein's summation convention or
Einstein's notation, was introduced by Albert Einstein in his celebrated 1916 work The
Foundation of the Theory of General Relativity. Einstein observed that in virtually all
scenarios, a repeated index emerges only in the context of a valid contraction. He therefore
proposed to use a repeated index to signal a contraction, thus eliminating the need for the
summation sign. For example,
$$a^i b_i \text{ stands for } \sum_{i=1}^n a^i b_i$$
and
$$a^i_j b^j_l \text{ stands for } \sum_{j=1}^n a^i_j b^j_l.$$
In Einstein's own words: "A glance at the equations of this paragraph shows that there is always
a summation with respect to the indices which occur twice under a summation sign [...], and only
with respect to indices which occur twice. It is therefore possible, without loss of clarity, to
omit the sign of summation. In its place we introduce the convention: If an index occurs twice in
one term of an expression, it is always to be summed unless the contrary is expressly stated."
Being essentially an abbreviation, the summation convention may seem like a minor notational
convenience. The truth, however, is that it is surprisingly deep and its remarkable psychological
and aesthetic effect on Tensor Calculus cannot be overstated. Consider the expression
$$\sum_{i=1}^3\sum_{j=1}^3\sum_{k=1}^3 \varepsilon_{ijk}\,u^i v^j w^k,$$
which, as we will later discover, represents the signed volume of a parallelepiped formed by three
vectors. This example shows how, in the absence of the summation convention, an indicial expression
-- or, even more so, a string of indicial identities -- can become dominated by summation signs to
the point that the essential algebraic details are obscured. With the help of the summation
convention, the same expression reads
$$\varepsilon_{ijk}\,u^i v^j w^k.$$
Thus, what is in actuality a sum of $27$ products, i.e.
$$\varepsilon_{111}u^1v^1w^1 + \varepsilon_{112}u^1v^1w^2 + \cdots + \varepsilon_{333}u^3v^3w^3,$$
appears as a single product.
Thus, at first, the summation convention may seem like an oversimplification that is likely to lead
to errors. However, as we will discover shortly, it is algebraically sound to treat sums of
products as simple products in a wide range of circumstances. Thus, the summation convention is not
an oversimplification but, instead, a valid, albeit dramatic, simplification that serves to
bring out the essence of algebraic ideas.
7.8.2 Simultaneous contractions
The summation convention introduces potential ambiguities with respect to the order of contractions
when more than one contraction is present in a single expression. Fortunately, all ambiguities
prove to be immaterial as we are about to demonstrate.
Let us begin with the expression
$$a^i_i + b^i_i,$$
featuring what appears to be a sum of two contractions, i.e.
$$\sum_{i=1}^3 a^i_i + \sum_{i=1}^3 b^i_i.$$
According to this interpretation, $a^i_i + b^i_i$ translates to the sum
$$\left(a^1_1 + a^2_2 + a^3_3\right) + \left(b^1_1 + b^2_2 + b^3_3\right),$$
where the parentheses were strictly for visual grouping. However, an alternative interpretation of
$a^i_i + b^i_i$ is as a single contraction, i.e.
$$\sum_{i=1}^3 \left(a^i_i + b^i_i\right),$$
which equals the sum
$$\left(a^1_1 + b^1_1\right) + \left(a^2_2 + b^2_2\right) + \left(a^3_3 + b^3_3\right),$$
where, once again, the parentheses are present only for convenience. Since the two resulting sums
produce identical results, the two interpretations are equivalent. Note that we will always prefer
the former, more natural, interpretation.
For a second example, consider the expression
$$a^{ij} b_{ij}.$$
This expression can also be interpreted in two different ways depending on the order of summations,
i.e.
$$\sum_{i=1}^3 \left( \sum_{j=1}^3 a^{ij} b_{ij} \right) \quad\text{or}\quad \sum_{j=1}^3 \left( \sum_{i=1}^3 a^{ij} b_{ij} \right).$$
The fact that these expressions are equivalent can be easily confirmed by unpacking, as both
expressions yield the sum of the nine terms $a^{11}b_{11}$, $a^{12}b_{12}$, $a^{13}b_{13}$,
$a^{21}b_{21}$, $a^{22}b_{22}$, $a^{23}b_{23}$, $a^{31}b_{31}$, $a^{32}b_{32}$, and
$a^{33}b_{33}$.
Far more subtle is the deceptively simple expression
$$a^i_i b^j_j.$$
The two contractions in this expression are completely independent of each other and it clearly
does not matter which contraction is performed first. Thus, this expression represents a different
kind of ambiguity, as it is unclear whether the contractions are performed before or after the
multiplication. If the contractions are carried out first, the resulting unpacked expression is
$$\left(a^1_1 + a^2_2 + a^3_3\right)\left(b^1_1 + b^2_2 + b^3_3\right).$$
On the other hand, if the tensor product $a^i_k b^j_l$ is carried out first and the intermediate
expression (a fourth-order system with $81$ elements) is subsequently contracted on $i$ and $k$ as
well as $j$ and $l$, the result is the sum of nine terms, i.e.
$$a^1_1 b^1_1 + a^1_1 b^2_2 + a^1_1 b^3_3 + a^2_2 b^1_1 + a^2_2 b^2_2 + a^2_2 b^3_3 + a^3_3 b^1_1 + a^3_3 b^2_2 + a^3_3 b^3_3.$$
As we can see, the two interpretations are equivalent, thanks to (a non-trivial application of) the
distributive law.
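The equivalence of the two interpretations can be confirmed numerically; a minimal NumPy sketch with made-up values:
```python
import numpy as np

a = np.random.rand(3, 3)  # a^i_j
b = np.random.rand(3, 3)  # b^i_j

# Contract first, then multiply...
first = np.trace(a) * np.trace(b)

# ...or form the fourth-order product and contract afterwards.
second = np.einsum('ii,jj->', a, b)

# The distributive law guarantees that the two agree.
assert np.isclose(first, second)
```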
I hope that these examples give insight into the multitude of issues that Einstein had to give
careful consideration to before proposing his convention. In general, experience shows that
ambiguities that arise as a result of the summation convention almost never lead to contradictions.
We will now discuss some of the minor and easy-to-avoid exceptions to this general rule.
7.8.3 Additional nuances of the summation convention
As we described above, the summation convention makes sums of products appear simply as
products. Case in point is the classical contraction
$$a^i b_i,$$
which looks like a product but is, in fact, a sum of three products, i.e.
$$a^1 b_1 + a^2 b_2 + a^3 b_3.$$
As a result, we must be cognizant of the unpacked form of the expression especially when it appears
as an argument to a nonlinear operator.
For example, a formal application of the power of a product rule to the expression
$\left(a^i b_i\right)^2$, i.e. the square of the quantity $a^i b_i$, yields
$$\left(a^i b_i\right)^2 = a^i a^i b_i b_i,$$
which is clearly false. To see this, simply unpack both sides, i.e.
$$\left(a^1 b_1 + a^2 b_2 + a^3 b_3\right)^2 \quad\text{versus}\quad a^1 a^1 b_1 b_1 + a^2 a^2 b_2 b_2 + a^3 a^3 b_3 b_3,$$
to observe the falsehood. Similarly, the identity
$$\sqrt{a^i b_i} = \sqrt{a^i}\sqrt{b_i}$$
is false.
On the other hand, when a contraction is subject to a linear operation, the formal treatment
of a sum of products as if it were a simple product is valid. For example, imagine that the systems
$a^i$ and $b_i$ are functions of a parameter $t$ and let
$$u(t) = a^i(t)\, b_i(t).$$
Then a formal application of the product rule yields the identity
$$\frac{du}{dt} = \frac{da^i}{dt}\, b_i + a^i\, \frac{db_i}{dt}.$$
Note that this identity is correct, as can be easily verified by unpacking the right side. This is
left as an exercise where you will observe that the key to the correctness of the above identity is
the linear property of the derivative, i.e.
$$\frac{d(f + g)}{dt} = \frac{df}{dt} + \frac{dg}{dt}.$$
We will find that in the overwhelming majority of situations that arise in Tensor Calculus, the
summation convention, which treats sums of products as single products, is safe. When in doubt,
however, we can always use the technique of unpacking in order to preclude mistakes.
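A short numerical sketch of the pitfall, assuming NumPy and illustrative values:
```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])  # a^i
b = np.array([4.0, 5.0, 6.0])  # b_i

# (a^i b_i)^2 squares the entire sum...
lhs = np.einsum('i,i->', a, b)**2

# ...which differs from the term-by-term expression sum_i (a^i)^2 (b_i)^2
# produced by a formal application of the power of a product rule.
rhs = np.sum(a**2 * b**2)

assert not np.isclose(lhs, rhs)
```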
7.8.4 The identical meanings of contract with and multiply by
The phrases contract with and multiply by may be both used to describe the operation
of multiplication followed by contraction. For example, consider the expression
$$a^{ij} x_j.$$
When we say contract $a^{ij}$ with $x_j$, we imply multiplying $a^{ij}$ by $x_k$, i.e.
$$a^{ij} x_k,$$
and subsequently contracting on $j$ and $k$, i.e.
$$a^{ij} x_j.$$
Alternatively, we may say multiply $a^{ij}$ by $x_j$ to describe the exact same operation
since, in the presence of the summation convention, the repeated index signals the subsequent
contraction.
Since the order of multiplicative terms in a tensor product is irrelevant, we need not pay
attention to the order of the terms when an expression is contracted with or multiplied
by another system. For example, the result of contracting the identity
$$a_i = b_i$$
with $u^i$ can also be written as
$$u^i a_i = u^i b_i.$$
In the next Section, we will say a few more words about the order of multiplicative terms and the
apparent conflict with the fact that matrix multiplication is noncommutative.
7.9 The order of the multiplicative terms in a contraction is immaterial
Consider two second-order systems $a^i_{\cdot j}$ and $b^i_{\cdot j}$,
where we have used the dot placeholder to indicate that in both systems the superscript is the
first index and the subscript is the second. As we have noted above, the contraction
$$a^i_{\cdot j} b^j_{\cdot k}$$
corresponds to the product of matrices associated with the two systems. Specifically, if the result
of the above contraction is denoted by $c^i_{\cdot k}$, i.e.
$$c^i_{\cdot k} = a^i_{\cdot j} b^j_{\cdot k},$$
and $A$, $B$, and $C$ are the matrices associated with $a^i_{\cdot j}$, $b^i_{\cdot j}$, and
$c^i_{\cdot k}$, then
$$C = AB.$$
It is assumed that the reader is sufficiently proficient in matrix multiplication to recognize the
clear correspondence between the last two equations. However, despite how clear this correspondence
may be, it raises an interesting question.
As we well know, matrix multiplication is generally not commutative, i.e.
$$AB \neq BA,$$
and, therefore,
$$C \neq BA.$$
Meanwhile, the indicial expressions
$$a^i_{\cdot j} b^j_{\cdot k} \quad\text{and}\quad b^j_{\cdot k} a^i_{\cdot j}$$
are completely equivalent. After all, the symbols $a^i_{\cdot j}$ and $b^j_{\cdot k}$ denote the
individual elements of the respective systems and, being elementary numbers, clearly commute. Thus,
$$a^i_{\cdot j} b^j_{\cdot k} = b^j_{\cdot k} a^i_{\cdot j}.$$
Note that the fact that multiplication is followed by contraction does not change anything. If you
still have any doubt, then unpacking the contractions should remove all doubt. Indeed,
$$a^i_{\cdot 1} b^1_{\cdot k} + a^i_{\cdot 2} b^2_{\cdot k} + a^i_{\cdot 3} b^3_{\cdot k} = b^1_{\cdot k} a^i_{\cdot 1} + b^2_{\cdot k} a^i_{\cdot 2} + b^3_{\cdot k} a^i_{\cdot 3}.$$
Therefore, the identity
$$c^i_{\cdot k} = a^i_{\cdot j} b^j_{\cdot k}$$
can be equivalently rewritten with the multiplicative terms on the right in the opposite order,
i.e.
$$c^i_{\cdot k} = b^j_{\cdot k} a^i_{\cdot j}.$$
So, is there a contradiction between the fact that
$$AB \neq BA$$
and the fact that
$$a^i_{\cdot j} b^j_{\cdot k} = b^j_{\cdot k} a^i_{\cdot j}?$$
Naturally, there is not. The key lies in how the underlying elementary arithmetic operations are
encoded in each notational system. In a matrix product $AB$, the matrices $A$ and $B$ play unequal
roles: the columns of $AB$ are the linear combinations of the columns of $A$ with the coefficients
supplied by the columns of $B$. In the product $a^i_{\cdot j} b^j_{\cdot k}$, on the other hand,
the objects $a^i_{\cdot j}$ and $b^j_{\cdot k}$ represent individual numbers and may therefore
appear to have equal roles. However, as systems, $a^i_{\cdot j}$ and $b^j_{\cdot k}$ do not
participate equally in the contraction since the summation takes place on the second index of
$a^i_{\cdot j}$ and the first index of $b^j_{\cdot k}$. In other words, the underlying elementary
arithmetic operations encoded by the order of terms in a matrix product are achieved by the
interplay among the indices, rather than the order of the multiplicative terms, in an indicial
expression.
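A minimal NumPy sketch of this point (values are illustrative):
```python
import numpy as np

A = np.random.rand(3, 3)
B = np.random.rand(3, 3)

# In the indicial expression, the order of the factors is immaterial:
C1 = np.einsum('ij,jk->ik', A, B)
C2 = np.einsum('jk,ij->ik', B, A)
assert np.allclose(C1, C2)

# The matrix products AB and BA, by contrast, generally differ.
assert not np.allclose(A @ B, B @ A)
```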
Interestingly, the tensor notation proves to be more economical than the language of Matrix Algebra
in terms of the number of primary operations and rules. As we just observed, one need not pay
attention to the order of the multiplicative terms in a product. Furthermore, we will discover that
the concept of the transpose can also be captured by the interplay among the indices in a
contraction. As we have already noted, the trace can also be represented by a contraction.
As a result, the operator set in the tensor notation is surprisingly small as it consists only of
addition, multiplication, and contraction. This aspect of the tensor notation is appealing, but it
would be a mistake to conclude that the tensor notation is "better" than the language of Matrix
Algebra in some absolute sense. Each system has areas where it offers advantages over the other.
Generally speaking, the tensor notation is superior when access to individual elements of systems
is required. The language of Matrix Algebra is superior when the focus is on the operators as a
whole and their algebraic properties. These issues will be explored in greater detail in Chapter 19.
7.10 Invalid tensor expressions
There are several reasons why an indicial expression may be considered invalid. First, it may
include incompatible indicial signatures, as in the sum
$$a_{ij} + b_i.$$
Second, it may be ambiguous, such as the "contraction"
$$a^i b_i c^i d_i.$$
Such an expression may arise as a result of multiplying the expressions
$$a^i b_i \quad\text{and}\quad c^i d_i,$$
while forgetting to change the name of the repeated index in one of them. The proper way to express
the result of multiplying $a^i b_i$ by $c^i d_i$ is, of course,
$$a^i b_i c^j d_j.$$
Similarly, in order to express the expression $a^i b_i$ as a product of $a^i b_i$ with itself, a
new index needs to be introduced, i.e.
$$a^i b_i a^j b_j.$$
Having to rename some of the indices in preparation for combining expressions in a product is very
common in Tensor Calculus.
Third, a combination may be invalid simply because it can never arise in the course of a legitimate
analysis, such as the sum
$$a^i + b^j.$$
On a technical level, this expression can be interpreted as a second-order system consisting of
pairwise sums of the individual elements of $a^i$ and $b^j$.
Nevertheless, this does not change the fact that we will simply never encounter such an expression.
The final reason, which may appear arbitrary at this point but is actually at the very heart of
Tensor Calculus, is that some operations do not preserve the tensor property. Expressions that
violate this rule include the addition of systems with mismatched indicial signatures, such as
$$a^i + b_i.$$
Another example is a forced summation on two like-flavored indices, such as
$$\sum_{i=1}^3 a_{ii}$$
or
$$\sum_{i=1}^3 u_i v_i.$$
The invalidity of the last two expressions may be surprising to some readers since the former
corresponds to the trace of a matrix while the latter corresponds to the dot product with respect
to an orthonormal basis -- two very common and meaningful operations. Nevertheless, these
expressions are indeed invalid in the context of Tensor Calculus which holds expressions to the
higher standard of geometric meaningfulness. As we will discover, operations that do not preserve
the tensor property do not produce geometrically meaningful objects and therefore do not meet this
standard.
This completes our discussion of the basic elements of the tensor notation. However, before we
return to the study of differential objects in Euclidean spaces, it will behoove us to dwell a
little longer on the notation itself. To this end, the next Chapter will describe a few elementary
applications that illustrate the power of the tensor notation. In doing so, the next Chapter will
not only serve to further your familiarity with the notation but will also introduce a number of
important techniques directly relevant to the discussions that follow.
7.11 A brief historical note
The evolution of the tensor notation continued long after the main ideas of the subject had already
been formulated. For example, the use of superscripts to indicate the manner in which objects
transform under a change of coordinates is already present in Gregorio Ricci and Tullio
Levi-Civita's original 1901 paper Méthodes de calcul différentiel absolu et leurs
applications. However, the use of superscripts was somewhat tentative as they were always found in
parentheses -- e.g. in equation (7) on page 151 of the Méthodes -- so as to not confuse them
with exponentiation. Furthermore, the coordinates themselves are notated with subscripts. Even in
1925, in his The Absolute Differential Calculus, Levi-Civita writes "The indices of
contravariance are generally written above, those of covariance below; an exception is however made
for the variables $x$, which are as usual denoted by $x_1$, $x_2$, $\ldots$, $x_n$, with the
indices below..." Albert Einstein, in The Foundation of the Theory of General Relativity,
also enumerates coordinates with subscripts although his use of superscripts is otherwise the same
as ours.
This is only to say that scientific ideas and the languages that describe them tend to be developed
in an interdependent manner where, at times, the ideas determine the language and, at other times,
the language inspires new ideas.
7.12 Exercises
Exercise 7.1 List the twelve elements of the system represented by the symbol $T^i_{\alpha\beta}$.
Exercise 7.2 Capture the sum
in the tensor notation.
Exercise 7.3 For a fourth-order system $A^{ij}_{kl}$, do the contractions $A^{ij}_{ij}$ and
$A^{ij}_{ji}$ yield the same result? In order to answer this question, fully unpack each expression
in a three-dimensional space.
Exercise 7.4 Show by unpacking that the identity $a_{ij} = -a_{ji}$ implies that $a_{ij}$
corresponds to a skew-symmetric matrix.
Exercise 7.5 In an $n$-dimensional space, how many relationships does the equation
represent? Show that from this equation it follows that all elements of the system are zero.
Exercise 7.6 Show by a manipulation of indices that the two expressions are equivalent.
Exercise 7.7 For a system $a_{ij}$ that corresponds to the matrix
calculate the quantities
Exercise 7.8 In a three-dimensional space, i.e. $n = 3$, fully unpack the equation
Exercise 7.9 In a three-dimensional space, fully unpack the expression. For example, give the element corresponding to particular values of the live indices.
Exercise 7.10 For two functions $a^i(t)$ and $b_i(t)$, demonstrate the identity
$$\frac{d\left(a^i b_i\right)}{dt} = \frac{da^i}{dt}\, b_i + a^i\, \frac{db_i}{dt}$$
from Section 7.8.3. Recall that this identity shows that the product rule can be applied to a
contraction of a product as if it were a simple product.
Exercise 7.11 Show that in an $n$-dimensional space, $\delta^i_i = n$.
Exercise 7.12 Show that
Exercise 7.13 Show that
Exercise 7.14 Show that
Exercise 7.15 Show that
Exercise 7.16 Show that in the $n$-dimensional space,
Exercise 7.17 Show that in the $n$-dimensional space,
Exercise 7.18 Show that
Exercise 7.19 Show that the system
$$b_{ij} = a_{ij} + a_{ji}$$
is symmetric, i.e. $b_{ij} = b_{ji}$.
Exercise 7.20 Show that the system
$$c_{ij} = a_{ij} - a_{ji}$$
is skew-symmetric, i.e. $c_{ij} = -c_{ji}$.
Exercise 7.21 If $a_{ij}$ is skew-symmetric, i.e.
$$a_{ij} = -a_{ji},$$
show that
$$a_{ij}\, x^i x^j = 0.$$
Exercise 7.22 Conversely, show that if the latter identity holds for any $x^i$, then $a_{ij}$ is skew-symmetric.
Problem 7.1 Show that a symmetric system $a_{ij}$, i.e.
$$a_{ij} = a_{ji},$$
in an $n$-dimensional space has at most $n(n+1)/2$ distinct elements. The maximal number of distinct elements in a system is known as the number of degrees of freedom. Although it is not central to our narrative, let us generalize this calculation to systems of arbitrary order. Suppose that $a_{ijk}$ is a symmetric third-order system. In other words, any two elements of $a_{ijk}$ related by a permutation of indices are equal. For example, $a_{123} = a_{231}$. Show that $a_{ijk}$ has
$$\frac{n(n+1)(n+2)}{6}$$
degrees of freedom. In general, show that a symmetric system of order $p$ has
$$\binom{n+p-1}{p}$$
degrees of freedom.