19. Linear Algebra, Matrices, and the Tensor Notation

This Chapter connects some of the essential ideas in Linear Algebra with concepts in Tensor Calculus and, in turn, discusses how to express indicial equations involving first- and second-order systems in matrix terms. This Chapter may present a challenging read as it attempts to answer questions that the reader may not have asked. After all, as we will state early in this Chapter, Tensor Calculus does not need matrices. As a matter of fact, Linear Algebra and matrices benefit from the ideas of Tensor Calculus more than the other way around. Nevertheless, I hope that the reader gives this Chapter a thorough read at this time and returns to it later when the questions discussed here arise naturally at some point in the future.

19.1Preliminary remarks

19.1.1Sidestepping matrices

Throughout our narrative, we have tried, as much as possible, to avoid the language of matrices. This was, in part, to show that the structures that present themselves in the course of a tensorial analysis can be handled without the use of matrices. Scalar variants of any order can be thought of simply as indexed lists of numbers that need not be organized into tables. For example, it has been sufficient to summarize the

27

elements of the Christoffel symbol

\Gamma_{jk}^{i}

in cylindrical coordinates by listing its three nonzero elements

\Gamma_{22}^{1}=-r\text{\ \ \ \ and\ \ \ \ }\Gamma_{12}^{2}=\Gamma_{21} ^{2}=\frac{1}{r}. \tag{12.52}

This economic way of capturing

\Gamma_{jk}^{i}

proved quite effective in helping us interpret the formula for the Laplacian

\nabla_{i}\nabla^{i}U=Z^{ij}\left( \frac{\partial^{2}U}{\partial Z^{i}\partial Z^{j}}-\Gamma_{ij}^{m}\frac{\partial U}{\partial Z_{m}}\right) \tag{18.57}

in cylindrical coordinates, where

\nabla_{i}\nabla^{i}U=\frac{1}{r}\frac{\partial}{\partial r}\left( r\frac{\partial U}{\partial r}\right) +\frac{1}{r^{2}}\frac{\partial^{2} U}{\partial\theta^{2}}+\frac{\partial^{2}U}{\partial z^{2}}.\tag{19.1}

That calculation was a microcosm of the general realization that Tensor Calculus does not need Matrix Algebra.

19.1.2The advantages of matrices

Yet, a few of the distinct advantages of Matrix Algebra are impossible to ignore. First, the Matrix Algebra treatment of systems as whole indivisible units helps stimulate our algebraic intuition which is deeply ingrained in our mathematical culture. Consider, for example, the concept of the matrix inverse and the identity for the inverse of the product of two matrices:

\left( AB\right) ^{-1}=B^{-1}A^{-1}.\tag{19.2}

This identity is a concise expression of the fundamental idea that to reverse a combination of two actions, we must combine the individual reverse actions in the opposite order. For example, if our trip to work consists of traveling by train followed by traveling by bus then the trip home will require traveling back by bus followed by traveling back by train. The indicial notation does not have an effective way of capturing this elementary idea.

To prove the formula

\left( AB\right) ^{-1}=B^{-1}A^{-1}, \tag{19.2}

simply multiply

B^{-1}A^{-1}

AB

. We have

B^{-1}A^{-1}AB=B^{-1}\left( A^{-1}A\right) B=B^{-1}IB=B^{-1}B=I,\tag{19.3}

as we set out to show. In this concise derivation, the great utility of treating matrices as indivisible whole units is on full display. Furthermore, the formula

\left( AB\right) ^{-1}=B^{-1}A^{-1} \tag{19.2}

can be used to derive the analogous equation for the inverse

\left( ABC\right) ^{-1}

by the following chain of identities

\left( ABC\right) ^{-1}=\left( \left( AB\right) C\right) ^{-1} =C^{-1}\left( AB\right) ^{-1}=C^{-1}\left( B^{-1}A^{-1}\right) =C^{-1}B^{-1}A^{-1},\tag{19.4}

where the justification of each step is left as an exercise. In summary,

\left( ABC\right) ^{-1}=C^{-1}B^{-1}A^{-1}.\tag{19.5}

From this derivation, it is clear that the inverse of the product of any number of matrices equals the product of the individual inverses in the opposite order. This derivation is not only straightforward, but also highly insightful from the algebraic point of view. Meanwhile, the same set of ideas cannot be effectively expressed in the indicial notation.

We should also call attention to another purely algebraic fact which we have relied upon in our narrative to a significant degree. Namely, it is the fact that

AB=I\text{ \ \ \ implies\ \ \ \ }BA=I.\tag{19.6}

It is because of this fact that we have been able to contract the product of the covariant metric tensor

Z_{ij}

and the contravariant metric tensor

Z^{ij}

on any valid combination of indices to produce the Kronecker delta.

Let us give the classical matrix-based proof of this fact which will once again demonstrate the utility of the algebraic way of thinking afforded to us by matrices. It follows from Linear Algebra considerations that if

A

is a square matrix with linearly independent columns then there exists a unique matrix

R

such that

AR=I.\tag{19.7}

We will refer to

R

as the "right inverse" of

A

. A square matrix with linearly independent columns has linearly independent rows and therefore there exists a unique matrix

L

, referred to as the left inverse of

A

, such that

LA=I.\tag{19.8}

Our goal is to show that the left inverse and the right inverse are the same matrix, i.e.

L=R.\tag{19.9}

Subsequently, if we denote the common value of

L

and

R

B

, then the equivalence of

AR

and

LA

(both being the identity matrix) becomes the equivalence of

AB

and

BA

To show that

L=R, \tag{19.9}

multiply both sides of the identity

LA=I \tag{19.8}

R

on the right, i.e.

LAR=IR.\tag{19.10}

Since

AR=I

LI=L

, and

IR=R

, we find

L=R \tag{19.9}

as we set out to show. Note that we glossed over an application of the associative property of matrix multiplication. Indeed, multiplying

LA

R

yields

\left( LA\right) R

. By the associative property, we find that

\left( LA\right) R=L\left( AR\right)

and the rest of the argument can proceed as described above. Thus, the equivalence of the left and right inverses is a direct consequence of the associative property of multiplication.

Much like the preceding proofs, this proof, too, cannot be effectively carried out in indicial notation. In fact, one could argue that this fundamental property of matrices cannot be discovered in indicial notation. This proof completes our discussion of the significant logical advantages of the matrix notation.

Another advantage of the matrix notation is that there are numerous software packages that implement various matrix operations. Consequently, any calculation formulated in terms of matrices, matrix products, and even more advanced matrix operations, can then be very effectively carried out by computers. Most readers can probably hardly remember the last time they multiplied two matrices by hand.

Finally, we must acknowledge the ubiquity of the language of matrices. Matrices represent our go-to way of visualizing data, economizing notation, and injecting algebraic structure into otherwise unstructured frameworks in a way that enables the leveraging of ideas from Linear Algebra.

Tensor Calculus which, as we stated, does not need matrices, is a case in point. After all, it is practically impossible not to think of first- and second-order systems as matrices. We have consistently associated metric tensors with matrices, even though we preferred the phrase corresponds to to the word equal in order to preserve the logical separation between systems and matrices. For example, we would say that, in spherical coordinates,

Z_{ij}\text{ \textit{corresponds} to }\left[ \begin{array} {ccc} 1 & & \\ & r^{2} & \\ & & r^{2}\sin^{2}\theta \end{array} \right] . \tag{9.45}

Consequently, in spherical coordinates, the dot product

\mathbf{U} \cdot\mathbf{V}

of vectors

\mathbf{U}

and

\mathbf{V}

with components

U^{i}

and

V^{i}

, given by the tensor identity

\mathbf{U}\cdot\mathbf{V}=Z_{ij}U^{i}V^{j}, \tag{10.25}

can be expressed in the language of matrices as follows:

\mathbf{U}\cdot\mathbf{V}= \begin{array} {c} \left[ \begin{array} {ccc} U^{1} & U^{2} & U^{3} \end{array} \right] \\ \\ \end{array} \left[ \begin{array} {ccc} 1 & & \\ & r^{2} & \\ & & r^{2}\sin^{2}\theta \end{array} \right] \left[ \begin{array} {c} V^{1}\\ V^{2}\\ V^{3} \end{array} \right] .\tag{19.11}

This form was frequently mentioned in Chapter 10. To many readers it likely helped clarify the interpretation of the then-novel expression

Z_{ij}U^{i}V^{j}

19.1.3The advantages of the tensor notation

Having just described the indisputable benefits of the language of matrices, we ought to reiterate some of the advantages of the tensor notation.

First, by virtue of treating systems not as whole indivisible units but as collections of individual elements, the tensor notation allows direct access to those elements. We have taken great advantage of this feature on numerous occasions, including the analysis of quadratic form minimization in Section 8.7 and in establishing the derivative of the volume element

\sqrt{Z}

with respect to the coordinate

Z^{i}

in Section 16.12.

Second, the tensor notation continues to work for systems of order greater than two. This is the case for a number of objects that we have already encountered, including the Christoffel symbol

\Gamma_{jk}^{i}

, the Levi-Civita symbols

\varepsilon_{ijk}

and

\varepsilon^{ijk}

, and the Riemann-Christoffel tensor

R_{klij}

. In order for an identity to be beyond the scope of the matrix notation, it need not be as complicated as the celebrated tensor equation

\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}=\left( \frac {\partial\Gamma_{jm}^{k}}{\partial Z^{i}}-\frac{\partial\Gamma_{im}^{k} }{\partial Z^{j}}+\Gamma_{in}^{k}\Gamma_{jm}^{n}-\Gamma_{jn}^{k}\Gamma _{im}^{n}\right) T^{m}. \tag{15.120}

Even an identity involving second-order systems may require the tensor notation. For example, the identity

a_{ir}a_{js}-a_{is}a_{jr}=A\varepsilon_{ij}\varepsilon_{rs} \tag{17.22}

requires the tensor notation since it has four live indices.

Last but not least, the tensor notation has its own unique way of inspiring algebraic intuition. In particular, the natural placement of indices always accurately predicts the way an object transforms under a change of coordinates and suggests the ways in which it can be meaningfully combined with other objects. This feature of the tensor notation will be showcased throughout this Chapter.

19.2Vectors, inner products, bases

In Section 2.7, we contrasted our approach to vectors to that often taken by Linear Algebra. According to our approach, vectors are directed segments, subject to addition according to the tip-to-tail rule and multiplication by numbers. It can be demonstrated that these operations satisfy a number of desirable properties, such as associativity and distributivity. In Linear Algebra, vectors are defined as generic objects subject to abstract operations of addition and multiplication by numbers that, by definition, satisfy the same set of desirable properties. That makes geometric vectors, i.e. directed segments, a special case of the abstract vector as defined by Linear Algebra. To distinguish between the two categories of vectors, we have used bold capital letters, such as

\mathbf{U}

\mathbf{V}

, and

\mathbf{W}

, for geometric vectors, and bold lowercase letters, such as

\mathbf{x}

\mathbf{y}

, and

\mathbf{z}

, for vectors in the sense of Linear Algebra.

Tensor Calculus and Linear Algebra take similarly different approaches to the related concepts of the dot product and inner product. The dot product of two geometric vectors

\mathbf{U}

and

\mathbf{V}

is defined by the equation

\mathbf{U}\cdot\mathbf{V}=\operatorname{len}\mathbf{U}\operatorname{len} \mathbf{V}\cos\gamma. \tag{2.14}

It can be demonstrated that the dot product satisfies commutativity,

\mathbf{U}\cdot\mathbf{V}=\mathbf{V}\cdot\mathbf{U.} \tag{2.15}

and distributivity

\mathbf{U}\cdot\left( \alpha\mathbf{V}+\beta\mathbf{W}\right) =\alpha \mathbf{U}\cdot\mathbf{V}+\beta\mathbf{U}\cdot\mathbf{W.} \tag{2.18}

The distributive property can also be referred to as linearity and is often described by stating that the dot product is linear in each argument. It is also obvious that the dot product of a nonzero vector

\mathbf{U}

with itself is positive, i.e.

\mathbf{U}\cdot\mathbf{U}\gt 0\text{, \ \ if\ \ \ }\mathbf{U}\neq\mathbf{0.}\tag{19.12}

This property, known as positive definiteness, is hardly worthy of mention for geometric vectors but becomes part of the definition for generic vectors.

The Linear Algebra concept of an inner product is an axiomatic adaptation of the classical dot product. An inner product is an operation that takes two vectors and produces a number. The inner product of vectors

\mathbf{x}

and

\mathbf{y}

is denoted by

\left( \mathbf{x} ,\mathbf{y}\right)

and is defined by commutativity

\left( \mathbf{x},\mathbf{y}\right) =\left( \mathbf{y},\mathbf{x}\right) , \tag{2.90}

distributivity, also known as linearity,

\left( \mathbf{x},\alpha\mathbf{y}+\beta\mathbf{z}\right) =\alpha\left( \mathbf{x},\mathbf{y}\right) +\beta\left( \mathbf{x},\mathbf{z}\right) , \tag{2.91}

and positive definiteness

\left( \mathbf{x},\mathbf{x}\right) \gt 0,\text{\ \ \ \ if\ \ \ \ } \mathbf{x}\neq\mathbf{0.} \tag{2.92}

A vector space endowed with an inner product is called a Euclidean space -- so close is the analogy with Euclidean spaces as we introduced them in Chapter 2.

The final difference between our approach to vectors and that of Linear Algebra is the way in which a basis emerges. In Tensor Calculus, a coordinate system is chosen arbitrarily and the covariant basis

\mathbf{Z}_{i}

is constructed by differentiating the position vector function

\mathbf{R}\left( Z\right)

with respect to the coordinate

Z^{i}

, i.e.

\mathbf{Z}_{i}=\frac{\partial\mathbf{R}}{\partial Z^{i}}. \tag{9.9}

Thus, the covariant basis (at a given point) is specific to the chosen coordinate system. The dependence of

\mathbf{Z}_{i}

on the choice of coordinates is underscored by the term variant. In Linear Algebra, a basis

\mathbf{b}_{i}

is arbitrarily selected. Any complete linearly independent set of vectors represents a legitimate basis. Thus, in both approaches we can talk about a change of basis, although in Tensor Calculus, a change of basis is induced by a change of coordinates.

Having contrasted the origins of the key concepts, we must now point out that their uses in the two subjects are almost identical. Tensor Calculus and Linear Algebra share the concept of the component space in which analysis is performed in terms of the components of vectors rather than vectors themselves. For example, with the help of the covariant metric tensor

Z_{ij}

defined by

Z_{ij}=\mathbf{Z}_{i}\cdot\mathbf{Z}_{j}, \tag{14.5}

the component space expression for the dot product

\mathbf{U}\cdot\mathbf{V}

of vectors

\mathbf{U}

and

\mathbf{V}

with components

U^{i}

and

V^{i}

reads

\mathbf{U}\cdot\mathbf{V}=Z_{ij}U^{i}V^{j}. \tag{10.25}

Similarly, the inner product matrix

M

, whose entries

M_{ij}

are

M_{ij}=\left( \mathbf{b}_{i}\mathbf{,b}_{j}\right) ,\tag{19.13}

can be used to express the inner product

\left( \mathbf{x,y}\right)

of vectors

\mathbf{x}

and

\mathbf{y}

with components

x^{i}

and

y^{i}

as follows:

\left( \mathbf{x,y}\right) =M_{ij}x^{i}y^{j}.\tag{19.14}

In the language of matrices, which is one of the topics discussed in this Chapter, the same operation is captured by the equation

\left( \mathbf{x,y}\right) =x^{T}My\tag{19.15}

where

x

and

y

are the column matrices consisting of the elements

x^{i}

and

y^{i}

The covariant metric tensor

Z_{ij}

and matrix

M

are essentially the same object, differing only in notation and the terminology used to describe them. We will therefore use

M

in many of the same ways that we have used

Z_{ij}

, including utilizing the symbol

M^{ij}

to denote the entries of

M^{-1}

, and performing index juggling by contracting with

M_{ij}

and

M^{ij}

19.3Correspondence between systems and matrices

As we have already mentioned, only first- and second-order systems can be effectively represented by matrices. As a matter of convention, let us agree to represent first-order systems by

n\times1

matrices. Thus, a system

x^{i}

in a three-dimensional space will be represented by the column matrix

\left[ \begin{array} {c} x^{1}\\ x^{2}\\ x^{3} \end{array} \right] .\tag{19.16}

We could have also represented

x^{i}

by a row matrix

\left[ x^{1}\ \ x^{2}\ \ x^{3}\right]

. However, limiting ourselves to column matrices only will help reduce the number of possible matrix representations of contractions.

The flavor of the index, whether it indicates a tensor property of the corresponding variant or is used for convenience, has no bearing on the matrix representation of a system. Thus, a system

x_{i}

is also represented by a column matrix, i.e.

\left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right] .\tag{19.17}

Second-order systems correspond to square

n\times n

matrices. Note that in Chapter 8, we encountered systems that correspond to rectangular matrices. As a matter of fact, systems of that sort will begin to arise naturally when we study embedded surfaces in the next volume. Nevertheless, in this Chapter, we will limit our focus to square matrices. However, all points made in this Chapter will remain valid for systems corresponding to rectangular matrices.

As we have done throughout the book, and as the most commonly accepted convention dictates, the first index of the system corresponds to the row the element is in while the second index corresponds to the column. Thus, it is essential to have complete clarity as to the order of indices. For systems with two superscripts or two subscripts, the order is obvious. For mixed systems, i.e. systems with one superscript and one subscript, the order of the indices is usually indicated by the dot placeholder technique introduced in Section 7.2. For example, in the symbols

A_{\cdot j}^{i}\tag{19.18}

and

A_{i}^{\cdot j}\tag{19.19}

the dot makes it clear that

i

is the first index and

j

is second.

Interestingly, we have not seen the dot placeholder that much in our narrative so far. The reason for this is that the only mixed second-order system that we have consistently encountered is the Kronecker delta

\delta_{j}^{i}

which corresponds to the symmetric matrix

\left[ \begin{array} {ccc} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{array} \right] .\tag{19.20}

and, therefore, the order of the indices does not matter. However, as we will describe below in Section 19.5, this exception can be made only for symmetric systems, in the sense of the term symmetric that differs from that for matrices.

Another category of mixed second-order systems for which the use of the placeholder is not required is the Jacobians

J_{i^{\prime}}^{i}

and

J_{i}^{i^{\prime}}

, for which we can simply agree that the superscript is first and the subscript is second. What makes this convention reliable in this case is the fact that the indices of a Jacobian are never juggled. In other words, the symbol

J_{i^{\prime}}^{i}

is never used to represent the combination

J_{j}^{j^{\prime}}Z_{i^{\prime}j^{\prime}}Z^{ij}

. Had index juggling been allowed, the symbol

J_{i^{\prime}}^{i}

would be ambiguous and the use of the dot placeholder would be in order. More generally, the possibility of index juggling is the very reason for not being able to use the flavors of indices as a mechanism for determining their order.

19.4Matrix multiplication

Any combination of contractions involving first- and second-order systems can be represented by matrix multiplication. In this Section, we will review the basic mechanics of matrix multiplication and subsequently show how some of the most common contractions can be expressed by matrix products.

19.4.1The mechanics of matrix multiplication

Consider three matrices

A

B

, and

C

with entries

A_{ij}

B_{ij}

and

C_{ij}

. Suppose that

A

is an

l\times m

matrix,

B

m\times n

, and

C

l\times n

. Then, by definition,

C

is the product of

A

and

B

, i.e.

C=AB,\tag{19.21}

C_{ij}=\sum_{k=1}^{m}A_{ik}B_{kj}.\tag{19.22}

Note that we are following the Linear Algebra tradition of using only subscripts to reference the individual entries of a matrix.

An essential element of matrix multiplication is that the summation takes place over the second index of

A

and the first index of

B

. Since the first index indicates the row of the entry and the second indicates the column, matrix multiplication combines the

i

-th row of

A

with the

j

-th column of

B

to produce

C_{ij}

, i.e. the entry in the

i

-th row and

j

-th column of

C

. These mechanics are illustrated in the following figure:

\begin{array} {r} \begin{array} {r} j\hspace{0.4in}\\ \downarrow\hspace{0.4in} \end{array} \\ \begin{array} {c} \phantom{\square} \\ \phantom{\square} i\rightarrow\\ \phantom{\square} \\ \phantom{\square} \\ \phantom{\square} \end{array} \hspace{-0.1in}\left[ \begin{array} {ccc} \square & \square & \square\\ \square & C_{ij} & \square\\ \square & \square & \square\\ \square & \square & \square\\ \square & \square & \square \end{array} \right] \end{array} =\hspace{-0.1in} \begin{array} {r} \begin{array} {ccc} & & \\ & & \end{array} \\ \begin{array} {c} \phantom{\square} \\ \phantom{\square} i\rightarrow\\ \phantom{\square} \\ \phantom{\square} \\ \phantom{\square} \end{array} \hspace{-0.1in}\left[ \begin{array} {cccc} \square & \square & \square & \square\\ \blacksquare & A_{ik} & \blacksquare & \blacksquare\\ \square & \square & \square & \square\\ \square & \square & \square & \square\\ \square & \square & \square & \square \end{array} \right] \end{array} \hspace{-0.1in} \begin{array} {r} \begin{array} {r} j\hspace{0.42in}\\ \downarrow\hspace{0.42in} \end{array} \\ \left[ \begin{array} {ccc} \square & \blacksquare & \square\\ \square & B_{kj} & \square\\ \square & \blacksquare & \square\\ \square & \blacksquare & \square \end{array} \right] \\ \\ \end{array}\tag{19.23}

Thus, the mechanics of matrix multiplication are very rigid and it is up to us to use the two available "levers" -- the order of the operands and the transpose -- to make sure the products reflect the contractions in a given tensor expression. We will now go through a series of examples converting contractions to matrix products in increasing order of complexity.

19.4.2Contraction of first-order systems

Let us start with the contraction

x_{i}y^{i}\tag{19.24}

that represents the inner product of vectors

\mathbf{x}

and

\mathbf{y}

. Since

x_{i}

corresponds to

x=\left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right]\tag{19.25}

and

y^{i}

corresponds to

y=\left[ \begin{array} {c} y^{1}\\ y^{2}\\ y^{3} \end{array} \right] ,\tag{19.26}

the combination

x_{i}y^{i}

corresponds either to

x^{T}y= \begin{array} {c} \left[ \begin{array} {ccc} x_{1} & x_{2} & x_{3} \end{array} \right] \\ \\ \end{array} \left[ \begin{array} {c} y^{1}\\ y^{2}\\ y^{3} \end{array} \right] \text{ \ \ \ \ or \ \ \ \ }y^{T}x= \begin{array} {c} \left[ \begin{array} {ccc} y^{1} & y^{2} & y^{3} \end{array} \right] \\ \\ \end{array} \left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right] .\tag{19.27}

In either case, the matrix product involves the transpose of one of the matrices. We will find this to be the case in the more complicated situations as well.

Note that the tensor product

x_{i}y^{j}

corresponds to the matrix product

\text{\ }xy^{T}=\left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right] \begin{array} {c} \left[ \begin{array} {ccc} y^{1} & y^{2} & y^{3} \end{array} \right] \\ \\ \end{array}\tag{19.28}

Thus,

x_{i}y^{i}

can also be captured evaluating the trace of the resulting matrix, i.e.

x_{i}y^{i}=\operatorname{trace}\left( \left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right] \begin{array} {c} \left[ \begin{array} {ccc} y^{1} & y^{2} & y^{3} \end{array} \right] \\ \\ \end{array} \right) .\tag{19.29}

19.4.3Contraction of a second-order system with a first-order system

Let us now turn our attention to the contraction

A_{\cdot j}^{i}x^{j}\tag{19.30}

involving a second-order system

A_{\cdot j}^{i}

and a first-order system

x^{j}

. If

A

is the matrix corresponding to

A_{\cdot j}^{i}

and

x

is the matrix corresponding to

x^{i}

, then

A_{\cdot j}^{i}x^{j}\text{ corresponds to }Ax.\tag{19.31}

Note that the convention that the first index indicates the row while the second indicates the column is essential for reaching this conclusion. Also note that the alternative product

x^{T}A^{T}

results in the same values, although arranged into a

1\times n

row matrix. Therefore, by our convention, where first-order systems are represented by column matrices,

x^{T}A^{T}

does not properly represent the resulting system.

As we already mentioned, the flavor of the indices has no bearing on the matrix representation. Therefore, the combinations

A^{ij}x_{j}\text{, }A_{i}^{\cdot j}x_{j}\text{, and }A_{ij}x^{j}\tag{19.32}

which represent variations in the flavors of indices but not their order, are also represented by the product

Ax

The contraction

A_{\cdot j}^{i}x_{i},\tag{19.33}

on the other hand, is principally different in that it is the first index of

A_{\cdot j}^{i}

that is engaged in the contraction. As a result, the only way to represent

A_{\cdot j}^{i}x_{i}

by a matrix product is

A^{T}x.\tag{19.34}

As before, the product

x^{T}A

produces the same values as a row matrix and therefore does not properly represent the result.

Let us now express the transformation rules for first-order tensors in matrix form. Recall from Chapter 17, that we let

J\text{ denote the matrix corresponding to the Jacobian }J_{i^{\prime}}^{i}. \tag{17.9}

Now suppose that

T^{i}

is a contravariant tensor, i.e.

T^{i^{\prime}}=T^{i}J_{i}^{i^{\prime}}.\tag{19.35}

Since first-order tensors correspond to column matrices, we will use lowercase letters to denote those matrices. If the matrix

x

represents

T^{i}

and

x^{\prime}

represents

T^{i^{\prime}}

, then

x^{\prime}=J^{-1}x.\tag{19.36}

Similarly, if

T_{i}

is a covariant tensor, i.e.

T_{i^{\prime}}=T_{i}J_{i^{\prime}}^{i},\tag{19.37}

then its matrix representations

x

and

x^{\prime}

are related by the equation

x^{\prime}=Jx.\tag{19.38}

19.4.4Contraction of second-order systems

A contraction of two second-order systems produces another second-order system. Thus we expect that a tensor identity featuring such a contraction corresponds to one of the many variations of the matrix equation

C=AB\tag{19.39}

that differ by the order of

A

and

B

and their transposition. In fact, we can construct

16

different variations, such as

C=A^{T}B\text{, }C=AB^{T}\text{, }C^{T}=BA\text{, }C^{T}=BA^{T}\tag{19.40}

and so on. The fact that

\left( AB\right) ^{T}=B^{T}A^{T}\tag{19.41}

makes half of the

16

variations equivalent. For example,

C=AB^{T}\text{ \ and \ }C^{T}=BA^{T},\tag{19.42}

are equivalent as can be seen by taking the transpose of both sides of the first equation. This reduces the number of truly distinct equations to

8

and also means that every indicial relationship can be captured in two different, albeit equivalent, ways.

Recall, once again, that the flavors of indices have no bearing on the corresponding matrix representations. Thus, we will choose the placements of indices completely at will.

Consider three second-order systems represented by the matrices

A

B

, and

C

. For our first example, consider the relationship

C_{\cdot j}^{i}=A_{\cdot k}^{i}B_{\cdot j}^{k}.\tag{19.43}

This relationship very clearly corresponds to the matrix identity

C=AB.\tag{19.44}

However, if the order of the indices on

C_{\cdot j}^{i}

is switched,i.e.

C_{j}^{\cdot i}=A_{\cdot k}^{i}B_{\cdot j}^{k},\tag{19.45}

then the corresponding matrix identity is

C^{T}=AB\tag{19.46}

or, equivalently,

C=B^{T}A^{T}.\tag{19.47}

As the above two examples illustrate, the question of representing the contraction

A_{\cdot k}^{i}B_{\cdot j}^{k}

by a matrix product has two legitimate answers:

AB

and

B^{T}A^{T}

, even though these expressions result in two different matrices. The two different interpretations are possible because the combination

A_{\cdot k}^{i}B_{\cdot j}^{k}

, on its own, does not give us an indication of the order of the indices in the resulting system. If, in the result,

i

is the first index and

j

is the second, then

A_{\cdot k}^{i}B_{\cdot j}^{k}

corresponds to

AB

. Otherwise,

A_{\cdot k}^{i}B_{\cdot j}^{k}

corresponds to

B^{T}A^{T}

For another example, consider the equation

C_{\cdot j}^{i}=A_{\cdot k}^{i}B_{j}^{\cdot k}.\tag{19.48}

This contraction takes place on the second index of

A_{\cdot k}^{i}

and the second index of

B_{j}^{\cdot k}

. The entries in the matrices

A

and

B

involved in the contraction for

i=2

and

j=3

are illustrated in the following figure:

\left[ \begin{array} {cccc} \square & \square & \square & \square\\ \square & \square & \blacksquare & \square\\ \square & \square & \square & \square\\ \square & \square & \square & \square \end{array} \right] \ \ \ \ \ \ \ \left[ \begin{array} {cccc} \square & \square & \square & \square\\ \blacksquare & \blacksquare & \blacksquare & \blacksquare\\ \square & \square & \square & \square\\ \square & \square & \square & \square \end{array} \right] \left[ \begin{array} {cccc} \square & \square & \square & \square\\ \square & \square & \square & \square\\ \blacksquare & \blacksquare & \blacksquare & \blacksquare\\ \square & \square & \square & \square \end{array} \right] .\tag{19.49}

Clearly, this arrangement does not correspond to a valid matrix product. In order to invoke matrix multiplication, the matrix

B

needs to be transposed. Thus,

C_{\cdot j}^{i}=A_{\cdot k}^{i}B_{j}^{\cdot k} \tag{19.48}

corresponds to

C=AB^{T}.\tag{19.50}

Finally, consider the identity

C_{j}^{\cdot i}=A_{k}^{\cdot i}B_{\cdot j}^{k}.\tag{19.51}

The summation on the right takes place on the first index of

A_{k}^{\cdot i}

and the first index

B_{\cdot j}^{k}

, which suggests the combination

A^{T}B

. Additionally, as indicated by the symbol

C_{j}^{\cdot i}

, the result of the contraction must be arranged in such a way that

j

is the first index and

i

is the second. Thus, the matrix form of the identity must use

C^{T}

instead of

C

. In summary, the above relationship corresponds to the matrix identity

C^{T}=A^{T}B\tag{19.52}

or, equivalently,

C=B^{T}A.\tag{19.53}

Let us now express the transformation rules for second-order tensors in matrix form. Suppose that

T^{ij}

T_{\cdot j}^{i}

, and

T_{ij}

are tensors of the type indicated by their indicial signatures, i.e.

\begin{aligned}T^{i^{\prime}j^{\prime}} & =T^{ij}J_{i}^{i^{\prime}}J_{j}^{j^{\prime}}\ \ \ \ \ \ \ \ \ \ \left(19.54\right)\\T_{\cdot j^{\prime}}^{i^{\prime}} & =T_{\cdot j}^{i}J_{i}^{i^{\prime} }J_{j^{\prime}}^{j}\ \ \ \ \ \ \ \ \ \ \left(19.55\right)\\T_{i^{\prime}j^{\prime}} & =T_{ij}J_{i^{\prime}}^{i}J_{j^{\prime}}^{j}.\ \ \ \ \ \ \ \ \ \ \left(19.56\right)\end{aligned}

If the symbols

T

and

T^{\prime}

represent the matrices corresponding to the alternative manifestations of each of the three tensors, then the transformation rules in the matrix notation read

\begin{aligned}T^{\prime} & =J^{-1}T\left( J^{-1}\right) ^{T}\ \ \ \ \ \ \ \ \ \ \left(19.57\right)\\T^{\prime} & =J^{-1}TJ\ \ \ \ \ \ \ \ \ \ \left(19.58\right)\\T^{\prime} & =J^{T}TJ.\ \ \ \ \ \ \ \ \ \ \left(19.59\right)\end{aligned}

19.5Symmetric systems

Two systems

A_{ij}

and

B_{ij}

, related by the identity

B_{ij}=A_{ji},\tag{19.60}

correspond to matrices that are the transposes of each other. For example, if

A_{ij}\text{ corresponds to }\left[ \begin{array} {ccc} 1 & 4 & 7\\ 2 & 5 & 8\\ 3 & 6 & 9 \end{array} \right] ,\tag{19.61}

then

B_{ij}\text{ corresponds to }\left[ \begin{array} {ccc} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{array} \right] .\tag{19.62}

The best way to convince yourself of this relationship is to recall the technique of unpacking described in Section 7.4. For example,

B_{32}

equals

A_{23}

which equals

8

. In other words, the entry in the

3^{\text{rd}}

row and

2^{\text{nd}}

column of the matrix corresponding to

B_{ij}

8

-- the same as the entry in the

2^{\text{nd}}

row and

3^{\text{rd}}

column of the matrix corresponding to

A_{ij}

. Thus, the two matrices are, indeed, the transposes of each other.

The same statement holds for systems

A^{ij}

and

B^{ij}

with superscripts. If

B^{ij}=A^{ji}\tag{19.63}

then the two systems correspond to matrices that are the transposes of each other.

Similarly, two mixed systems

A_{\cdot j}^{i}

and

B_{i}^{\cdot j}

related by the identity

B_{i}^{\cdot j}=A_{\cdot i}^{j}\tag{19.64}

correspond to matrices that are the transposes of each other. It is noteworthy, however, that in order for this relationship to hold, the order in which the superscript and the subscript appear in the two systems must be reversed.

A system

A_{ij}

is called symmetric if it satisfies the identity

A_{ij}=A_{ji}.\tag{19.65}

Note that a symmetric system

A_{ij}

corresponds to a symmetric matrix. Recall that a matrix

A

is symmetric if it equals its transpose, i.e.

A^{T}=A.\tag{19.66}

The term symmetric makes sense since the values of the entries exhibits a mirror symmetry with respect to the main diagonal, e.g.

\left[ \begin{array} {rrr} 5 & 1 & 0\\ 1 & 4 & 2\\ 0 & 2 & 3 \end{array} \right] .\tag{19.67}

Similarly, a system

A^{ij}

with two superscripts is called symmetric if

A^{ij}=A^{ji}.\tag{19.68}

Such a system, too, corresponds to a symmetric matrix.

Interestingly, the concept of symmetry appears to be problematic for a mixed system

A_{\cdot j}^{i}

. The would-be definition of symmetry

A_{\cdot j}^{i}=A_{\cdot i}^{j} \tag{-}

is invalid on the notational level. This should give us serious pause. Over the course of our narrative, we have learned to trust the intimation given to us by the tensor notation. We must, therefore, accept the fact that the concept of symmetry cannot be applied to a mixed system in the above form. If we ignore what the notation is telling us, and call a mixed system symmetric if it corresponds to a symmetric matrix, we will run into contradictions later on. Specifically, suppose that a "symmetric"

A_{\cdot j}^{i}

is the manifestation of a tensor in some coordinate system or, to use the language of Linear Algebra, the manifestation of a tensor with respect to a particular basis. Then

A

transforms under a change of coordinates (or a change of basis) according to the rule

A_{\cdot j^{\prime}}^{i^{\prime}}=A_{\cdot j}^{i}J_{i}^{i^{\prime} }J_{j^{\prime}}^{j}.\tag{19.69}

Thus, if

A_{\cdot j}^{i}

corresponds to a symmetric matrix, then

A_{\cdot j^{\prime}}^{i^{\prime}}

will most likely not. For a simple illustration, note that for a symmetric matrix

\left[ \begin{array} {rr} 1 & 5\\ 5 & -2 \end{array} \right]\tag{19.70}

the combination

\left[ \begin{array} {cc} 2 & 1\\ 3 & 2 \end{array} \right] \left[ \begin{array} {rr} 1 & 5\\ 5 & -2 \end{array} \right] \left[ \begin{array} {cc} 2 & 1\\ 3 & 2 \end{array} \right] ^{-1}\tag{19.71}

equals

\left[ \begin{array} {cc} -10 & 9\\ -7 & 9 \end{array} \right] ,\tag{19.72}

which is no longer symmetric. Thus, when a mixed system

A_{\cdot j}^{i}

corresponds to a symmetric matrix, it is as much a characterization of the particular coordinate system (or basis) as it is of

A_{\cdot j}^{i}

itself. As we mentioned earlier, the Kronecker delta

\delta_{j}^{i}

is the sole exception to this rule because it corresponds to the identity matrix in all coordinate systems.

Fortunately, the tensor notation not only warns us of a potential problem, but also presents us with a clear path forward. Namely, no rules of the tensor notation would be violated if we called a system

A_{\cdot j}^{i}

symmetric when

A_{\cdot j}^{i}=A_{j}^{\cdot i}.\tag{19.73}

This works on the notational level, but what is the system

A_{j}^{\cdot i}

and how is it related to

A_{\cdot j}^{i}

? The tensor framework provides the answer to this question, as well:

A_{j}^{\cdot i}

A_{\cdot j}^{i}

with a lowered first index and a raised second index. In other words,

A_{j}^{\cdot i}=A_{\cdot s}^{r}M_{rj}M^{si},\tag{19.74}

where we are using the inner product matrix (or, in the language of tensors, the metric tensor)

M

for juggling indices. This enables us to define what it means for a mixed system

A_{\cdot j}^{i}

to be symmetric, albeit not in isolation but in the context of an inner product. In other words, in the context of the associated Euclidean space framework.

Specifically, a mixed system

A_{\cdot j}^{i}

is said to be symmetric if

A_{\cdot j}^{i}=A_{j}^{\cdot i}, \tag{19.73}

where

A_{j}^{\cdot i}

is the result of index juggling on

A_{\cdot j}^{i}

It is worth reiterating that this new definition of symmetry does not correspond to the concept of symmetry as applied to matrices. In particular, the matrix corresponding to a system

A_{\cdot j}^{i}

that satisfies the identity above, will not be symmetric, except in very special circumstances. Also note that the identity

A_{\cdot j}^{i}=A_{j}^{\cdot i} \tag{19.73}

indicates that the matrices corresponding to

A_{\cdot j}^{i}

and

A_{j}^{\cdot i}

are not equal, but rather the transposes of each other. Note, however, that for a system

A_{\cdot j}^{i}

that satisfies this identity, the placeholder may be safely omitted, i.e. the symbol

A_{\cdot j}^{i}

can be replaced with

A_{j}^{i}

. After all, the above identity tells us that it does not matter whether

A_{j}^{i}

is interpreted as

A_{\cdot j}^{i}

A_{j}^{\cdot i}

Also be reminded that, unlike systems with two subscripts or two superscripts, the concept of symmetry, as applied to mixed systems, requires the availability of an inner product, i.e. the context of a Euclidean space. To make the above definition more explicit, we could have written it in the form

A_{\cdot j}^{i}=A_{\cdot s}^{r}M_{rj}M^{si},\tag{19.75}

although this form seems to obfuscate, rather than clarify, the matter. However, this form does represent an explicit recipe for testing whether a system

A_{\cdot j}^{i}

is symmetric. For example, suppose that

M_{ij}\text{ corresponds to }\left[ \begin{array} {ccc} 1 & & \\ & 4 & \\ & & 2 \end{array} \right] .\tag{19.76}

We will now show that if the system

A_{\cdot j}^{i}\text{ corresponds to }\left[ \begin{array} {ccc} 2 & 1 & 2\\ \frac{1}{4} & \frac{3}{4} & \frac{5}{4}\\ 1 & \frac{5}{2} & \frac{7}{2} \end{array} \right]\tag{19.77}

then it is symmetric. In order to apply the definition

A_{\cdot j}^{i}=A_{\cdot s}^{r}M_{rj}M^{si}, \tag{19.75}

note that the combination

A_{\cdot s}^{r}M_{rj}M^{si}

corresponds to

MAM^{-1}

, i.e.

\left[ \begin{array} {ccc} 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & 2 \end{array} \right] \left[ \begin{array} {ccc} 2 & 1 & 2\\ \frac{1}{4} & \frac{3}{4} & \frac{5}{4}\\ 1 & \frac{5}{2} & \frac{7}{2} \end{array} \right] \left[ \begin{array} {ccc} 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & 2 \end{array} \right] ^{-1}=\left[ \begin{array} {ccc} 2 & \frac{1}{4} & 1\\ 1 & \frac{3}{4} & \frac{5}{2}\\ 2 & \frac{5}{4} & \frac{7}{2} \end{array} \right] .\tag{19.78}

The resulting matrix is the transpose of the matrix corresponding to

A_{\cdot j}^{i}

proving that

A_{\cdot j}^{i}

is symmetric.

The symmetry criterion can be simplified by applying a single index juggling step to the equation

A_{\cdot j}^{i}=A_{j}^{\cdot i}. \tag{19.73}

Lowering the index

i

on both sides, we find

A_{ij}=A_{ji}, \tag{19.65}

Thus,

A_{\cdot j}^{i}

is symmetric if the related system

A_{ij}

is symmetric. This is a simpler criterion since a system with two superscripts is symmetric if it corresponds to a symmetric matrix. To illustrate this criterion with the example above, note that

A_{ij}=M_{ik}A_{\cdot j}^{k}

which corresponds to the matrix product

MA

, i.e.

A_{ij}\text{ corresponds to }\left[ \begin{array} {ccc} 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & 2 \end{array} \right] \left[ \begin{array} {ccc} 2 & 1 & 2\\ \frac{1}{4} & \frac{3}{4} & \frac{5}{4}\\ 1 & \frac{5}{2} & \frac{7}{2} \end{array} \right] =\left[ \begin{array} {ccc} 2 & 1 & 2\\ 1 & 3 & 5\\ 2 & 5 & 7 \end{array} \right] .\tag{19.79}

Since the resulting matrix is symmetric,

A_{ij}

is symmetric and therefore

A_{\cdot j}^{i}

is symmetric.

19.6Linear transformations

The concept of a linear transformation is one of the most fundamental elements in Linear Algebra. Recall that a linear transformation

T

can be represented by a matrix

A

in the component space. In this regard, linear transformations and inner products are similar in that both are represented by matrices in the component space: the matrix

A

in the case of linear transformations and the matrix

M

in the case of inner products. However, this similitude appears stronger than it really is and the tensor notation alerts us to a fundamental difference between the two matrices.

A linear transformation

T

maps one vector to another. The original vector is known as the preimage and the result of the transformation is known as the image. Denote the preimage by

\mathbf{x}

and its image by

\mathbf{y}

, i.e.

\mathbf{y}=T\left( \mathbf{x}\right) .\tag{19.80}

The role of the matrix

A

is to convert the components

x^{i}

of the preimage into the components

y^{j}

of the image. Thus, in the tensor notation, the matrix

A

naturally corresponds to a mixed system

A_{\cdot j}^{i}

which enables us to write

y^{i}=A_{\cdot j}^{i}x^{j}\ \ .\tag{19.81}

An inner product, on the other hand, takes two vectors

\mathbf{x}

and

\mathbf{y}

as inputs and produces a scalar. This is why the matrix

M

naturally appears with two subscripts, so as to enable the combination

M_{ij}x^{i}y^{j}\tag{19.82}

that results in

\left( \mathbf{x,y}\right)

Thus, under a change of basis, the matrices

A

and

M

transform by different rules. By the quotient theorem, discussed in Chapter 14, the variant

A_{\cdot j}^{i}

is a tensor of the type indicated by its indicial signature and therefore transforms according to

A_{\cdot j^{\prime}}^{i^{\prime}}=A_{\cdot j}^{i}J_{i}^{i^{\prime} }J_{j^{\prime}}^{j}.\tag{19.83}

By the same theorem, the variant

M_{ij}

is a tensor of the type indicated by its indicial signature and therefore transforms according to

M_{i^{\prime}j^{\prime}}=M_{ij}J_{i^{\prime}}^{i}J_{j^{\prime}}^{j}.\tag{19.84}

A^{\prime}

is the matrix representing the linear transformation

T

in the alternative basis

\mathbf{b}_{i^{\prime}}

, then the equation

A_{\cdot j^{\prime}}^{i^{\prime}}=A_{\cdot j}^{i}J_{i}^{i^{\prime} }J_{j^{\prime}}^{j} \tag{19.83}

tells us that

A^{\prime}=J^{-1}AJ,\tag{19.85}

where

J

denotes the matrix corresponding to the Jacobian

J_{j^{\prime}} ^{j}

. This formula can be found in equation

\left( 11\right)

of Chapter

2

of I.M. Gelfand's Lectures on Linear Algebra.

Similarly, if

M^{\prime}

represents the same inner product in the alternative basis

\mathbf{b}_{i^{\prime}}

, then the equation

M_{i^{\prime}j^{\prime}}=M_{ij}J_{i^{\prime}}^{i}J_{j^{\prime}}^{j} \tag{19.84}

tells us that

M^{\prime}=J^{T}MJ.\tag{19.86}

This formula can be found in equation

\left( 7\right)

of Chapter

1

of the same book.

Thus, it may be said that the matrix notation blurs the distinction between the component space representations of inner products -- or, more generally, bilinear forms -- and linear transformations since both concepts are represented by matrices. In fact, this similitude creates the temptation to consider a one-to-one correspondence between bilinear forms and linear transformation. Such a correspondence indeed exists but, in order to avoid internal contradictions, it needs to be carefully framed. The tensor notation can be instrumental in helping us establish the right correspondence -- the task to which we now turn.

19.7An equivalence between bilinear forms and linear transformations

An inner product is a special case of a more general operation known as a bilinear form

B\left( \mathbf{x},\mathbf{y}\right)

defined as an operation that takes two vectors and produces a number subject only to linearity in each argument, i.e.

B\left( \mathbf{x},\alpha\mathbf{y}+\beta\mathbf{z}\right) =\alpha B\left( \mathbf{x},\mathbf{y}\right) +\beta B\left( \mathbf{x},\mathbf{z}\right)\tag{19.87}

and

B\left( \alpha\mathbf{x}+\beta\mathbf{y},\mathbf{z}\right) =\alpha B\left( \mathbf{x},\mathbf{z}\right) +\beta B\left( \mathbf{y},\mathbf{z}\right) .\tag{19.88}

Thus, an inner product is a symmetric positive definite bilinear form, where the term symmetric refers to the commutative property. Rather than introduce a special letter to distinguish inner products from general bilinear forms, Linear Algebra simply drops the letter altogether for inner products resulting in the symbol

\left( \mathbf{x},\mathbf{y}\right)

Much like inner products, bilinear forms are represented by matrices in the component space. If

\mathbf{x}=x^{i}\mathbf{b}_{i}

and

\mathbf{y} =y^{i}\mathbf{b}_{i}

, then, by linearity,

B\left( \mathbf{x},\mathbf{y}\right) =B\left( x^{i}\mathbf{b}_{i} ,y^{j}\mathbf{b}_{j}\right) =B\left( \mathbf{b}_{i},\mathbf{b}_{j}\right) x^{i}y^{j}.\tag{19.89}

Thus, if the matrix

B

with entries

B_{ij}

is defined by

B_{ij}=B\left( \mathbf{b}_{i},\mathbf{b}_{j}\right)\tag{19.90}

then

B\left( \mathbf{x},\mathbf{y}\right) =B_{ij}x^{i}y^{j}\tag{19.91}

or, in the language of matrices,

B\left( \mathbf{x},\mathbf{y}\right) =x^{T}By.\tag{19.92}

Naturally,

B

transforms from under a change of basis by the rule

B=J^{T}BJ.\tag{19.93}

As we have already discussed, the fact that both linear transformations and bilinear forms are represented by matrices in the component space creates the temptation to propose a one-to-one correspondence between linear transformations and bilinear forms. The naive way of doing this is to state that a linear transformation

T

corresponds to a bilinear form

B

if the two are represented by the same matrix. However, this approach is clearly self-contradictory since matrices representing linear transformations and bilinear forms transform differently under a change of basis. As a result, a linear transformation and a bilinear form that happen to be represented by the same matrix with respect to a particular basis, will most certainly be represented by two different matrices with respect to another basis.

As it is often the case, the remedy is found by going back to the Euclidean framework where one can develop the idea in a geometric setting. For a linear transformation

T

, define the corresponding bilinear form

B

by the equation

B\left( \mathbf{x},\mathbf{y}\right) =\left( \mathbf{x},T\left( \mathbf{y}\right) \right) .\tag{19.94}

In other words,

B\left( \mathbf{x},\mathbf{y}\right)

is the inner product between

\mathbf{x}

and

T\left( \mathbf{y}\right)

Let us now use the tensor approach to determine the relationship between the matrix

B

representing the bilinear form with respect to a basis

\mathbf{b}_{i}

and the matrix

A

representing the linear transformation with respect to the same basis. We have

B\left( \mathbf{x},\mathbf{y}\right) =B_{ij}x^{i}y^{j},\tag{19.95}

while

T\left( \mathbf{y}\right) =A_{\cdot j}^{i}y^{j}\mathbf{b}_{i}.\tag{19.96}

Therefore,

\left( \mathbf{x},T\left( \mathbf{y}\right) \right) =M_{ik}A_{\cdot j} ^{i}x^{k}y^{j}.\tag{19.97}

Switch the names of the indices

i

and

k

so that

x^{k}

appears as

x^{i}

, i.e.

\left( \mathbf{x},T\left( \mathbf{y}\right) \right) =M_{ki}A_{\cdot j} ^{k}x^{i}y^{j}.\tag{19.98}

Since, by construction,

B\left( \mathbf{x},\mathbf{y}\right)

equals

\left( \mathbf{x},T\left( \mathbf{y}\right) \right)

for all

\mathbf{x}

and

\mathbf{y}

, we conclude that

B_{ij}x^{i}y^{j}=M_{ki}A_{\cdot j}^{k}x^{i}y^{j}\tag{19.99}

for all

x^{i}

and

y^{j}

. Therefore,

B_{ij}=M_{ki}A_{\cdot j}^{k}\ ,\tag{19.100}

which is precisely the relationship we set out to determine. Invoking index jugging by the "metric tensor"

M_{ki}

, we arrive at the simple relationship

B_{ij}=A_{ij}.\tag{19.101}

Thus, somewhat ironically, the naive approach of associating a linear transformation with the bilinear form represented by the same matrix is not too far off. The only adjustment that one needs to make is to require -- to use tensor terminology -- the lowering of the superscript on the matrix representing the linear transformation.

19.8Self-adjoint transformations

A linear transformation

T

is called self-adjoint if the identity

\left( \mathbf{x},T\left( \mathbf{y}\right) \right) =\left( T\left( \mathbf{x}\right) ,\mathbf{y}\right)\tag{19.102}

holds for all vectors

\mathbf{x}

and

\mathbf{y}

. In other words, the inner product of

\mathbf{x}

and

T\left( \mathbf{y}\right)

is the same as that of

T\left( \mathbf{x}\right)

and

\mathbf{y}

, i.e. it does not matter whether

T

is applied to

\mathbf{x}

\mathbf{y}

. The concept of a self-adjoint matrix is interesting for its "reverse commute": while most ideas in Linear Algebra were developed for vectors and subsequently extended to the component space, the concept of a self-adjoint matrix clearly arose out of symmetric matrices and their special properties. But are self-adjoint linear transformations necessarily represented by symmetric matrices? We will now answer this question.

As we discovered in the previous Section, the inner product

\left( \mathbf{x},T\left( \mathbf{y}\right) \right)

is given by

\left( \mathbf{x},T\left( \mathbf{y}\right) \right) =M_{ki}A_{\cdot j} ^{k}x^{i}y^{j}.\tag{19.103}

Similarly,

\left( T\left( \mathbf{x}\right) ,\mathbf{y}\right)

is given by

\left( T\left( \mathbf{x}\right) ,\mathbf{y}\right) =M_{kj}A_{\cdot i} ^{k}x^{i}y^{j}.\tag{19.104}

Since, for a self-adjoint

T

\left( \mathbf{x},T\left( \mathbf{y}\right) \right)

equals

\left( T\left( \mathbf{x}\right) ,\mathbf{y}\right)

for all

\mathbf{x}

and

\mathbf{y}

, we conclude that

M_{ki}A_{\cdot j}^{k}x^{i}y^{j}=M_{kj}A_{\cdot i}^{k}x^{i}y^{j}\tag{19.105}

for all

x^{i}

and

y^{i}

. Therefore

M_{ki}A_{\cdot j}^{k}=M_{kj}A_{\cdot i}^{k},\tag{19.106}

which is precisely the characterization of the matrix

A

we were looking for. In other words, the linear transformation

T

is self-adjoint if it is represented by a

A_{\cdot j}^{i}

that is symmetric in the sense of the definition given in Section 19.5. To see this, allow the inner product matrix

M

(i.e. the "metric tensor")

\

to lower the superscript

k

on both sides, leading to the more recognizable form

A_{ij}=A_{ji}.\tag{19.107}

Raising

i

on both sides also yields

A_{\cdot j}^{i}=A_{j}^{\cdot i}.\tag{19.108}

As we have already pointed out, in the language of matrices, the criterion for a matrix representing a self-adjoint transformation reads

A=MA^{T}M^{-1}.\tag{19.109}

19.9Exercises

Exercise 19.1Show that the tensor equation

C_{j}^{\cdot k}=A_{\cdot j}^{i}B_{\cdot i}^{k}\tag{19.110}

corresponds to the matrix equation

C^{T}=BA\tag{19.111}

or, equivalently,

C=A^{T}B^{T}.\tag{19.112}

Exercise 19.2Show that the tensor equation

C_{ki}=A_{ij}B_{\cdot k}^{j}\tag{19.113}

corresponds to the matrix equation

C^{T}=AB.\tag{19.114}

Exercise 19.3Show that, for a symmetric matrix

A

and any matrix

X

, the combination

XAX^{T}\tag{19.115}

is symmetric. Thus, if a tensor

A_{ij}

is symmetric in a component space corresponding to one basis, it is symmetric in a component space corresponding to any basis.

Exercise 19.4If the component space basis

\mathbf{b}_{i}

is orthonormal, show that a symmetric system

A_{\cdot j}^{i}

is represented by a symmetric matrix.