Linear Algebra, Matrices, and the Tensor Notation

This Chapter connects some of the essential ideas in Linear Algebra with concepts in Tensor Calculus and, in turn, discusses how to express indicial equations involving first- and second-order systems in matrix terms. This Chapter may present a challenging read as it attempts to answer questions that the reader may not have asked. After all, as we will state early in this Chapter, Tensor Calculus does not need matrices. As a matter of fact, Linear Algebra and matrices benefit from the ideas of Tensor Calculus more than the other way around. Nevertheless, I hope that the reader gives this Chapter a thorough read at this time and returns to it later when the questions discussed here arise naturally at some point in the future.

19.1.1 Sidestepping matrices

Throughout our narrative, we have tried, as much as possible, to avoid the language of matrices. This was, in part, to show that the structures that present themselves in the course of a tensorial analysis can be handled without the use of matrices. Scalar variants of any order can be thought of simply as indexed lists of numbers that need not be organized into tables. For example, it has been sufficient to summarize the 27 elements of the Christoffel symbol Γjki\Gamma_{jk}^{i} in cylindrical coordinates by listing its three nonzero elements
Γ221=r    and    Γ122=Γ212=1r.(12.52)\Gamma_{22}^{1}=-r\text{\ \ \ \ and\ \ \ \ }\Gamma_{12}^{2}=\Gamma_{21} ^{2}=\frac{1}{r}. \tag{12.52}
This economical way of capturing Γjki\Gamma_{jk}^{i} proved quite effective in helping us interpret the formula for the Laplacian
\nabla_{i}\nabla^{i}U=Z^{ij}\left( \frac{\partial^{2}U}{\partial Z^{i}\partial Z^{j}}-\Gamma_{ij}^{m}\frac{\partial U}{\partial Z^{m}}\right) \tag{18.57}
in cylindrical coordinates, where
iiU=1rr(rUr)+1r22Uθ2+2Uz2.(19.1)\nabla_{i}\nabla^{i}U=\frac{1}{r}\frac{\partial}{\partial r}\left( r\frac{\partial U}{\partial r}\right) +\frac{1}{r^{2}}\frac{\partial^{2} U}{\partial\theta^{2}}+\frac{\partial^{2}U}{\partial z^{2}}.\tag{19.1}
That calculation was a microcosm of the general realization that Tensor Calculus does not need Matrix Algebra.

19.1.2 The advantages of matrices

Yet, a few of the distinct advantages of Matrix Algebra are impossible to ignore. First, the Matrix Algebra treatment of systems as whole indivisible units helps stimulate our algebraic intuition which is deeply ingrained in our mathematical culture. Consider, for example, the concept of the matrix inverse and the identity for the inverse of the product of two matrices:
(AB)1=B1A1.(19.2)\left( AB\right) ^{-1}=B^{-1}A^{-1}.\tag{19.2}
This identity is a concise expression of the fundamental idea that to reverse a combination of two actions, we must combine the individual reverse actions in the opposite order. For example, if our trip to work consists of traveling by train followed by traveling by bus then the trip home will require traveling back by bus followed by traveling back by train. The indicial notation does not have an effective way of capturing this elementary idea.
To prove the formula
(AB)1=B1A1,(19.2)\left( AB\right) ^{-1}=B^{-1}A^{-1}, \tag{19.2}
simply multiply B1A1B^{-1}A^{-1} by ABAB. We have
B1A1AB=B1(A1A)B=B1IB=B1B=I,(19.3)B^{-1}A^{-1}AB=B^{-1}\left( A^{-1}A\right) B=B^{-1}IB=B^{-1}B=I,\tag{19.3}
as we set out to show. In this concise derivation, the great utility of treating matrices as indivisible whole units is on full display. Furthermore, the formula
(AB)1=B1A1(19.2)\left( AB\right) ^{-1}=B^{-1}A^{-1} \tag{19.2}
can be used to derive the analogous equation for the inverse (ABC)1\left( ABC\right) ^{-1} by the following chain of identities
(ABC)1=((AB)C)1=C1(AB)1=C1(B1A1)=C1B1A1,(19.4)\left( ABC\right) ^{-1}=\left( \left( AB\right) C\right) ^{-1} =C^{-1}\left( AB\right) ^{-1}=C^{-1}\left( B^{-1}A^{-1}\right) =C^{-1}B^{-1}A^{-1},\tag{19.4}
where the justification of each step is left as an exercise. In summary,
(ABC)1=C1B1A1.(19.5)\left( ABC\right) ^{-1}=C^{-1}B^{-1}A^{-1}.\tag{19.5}
From this derivation, it is clear that the inverse of the product of any number of matrices equals the product of the individual inverses in the opposite order. This derivation is not only straightforward, but also highly insightful from the algebraic point of view. Meanwhile, the same set of ideas cannot be effectively expressed in the indicial notation.
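For readers who would like to see this identity in action, here is a minimal numerical sketch, assuming NumPy is available; the matrices are arbitrary examples with no special significance.

```python
import numpy as np

# Spot-check (ABC)^{-1} = C^{-1} B^{-1} A^{-1} on arbitrary invertible matrices.
rng = np.random.default_rng(0)
A, B, C = [rng.standard_normal((3, 3)) + 3 * np.eye(3) for _ in range(3)]

lhs = np.linalg.inv(A @ B @ C)
rhs = np.linalg.inv(C) @ np.linalg.inv(B) @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))  # True
```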
We should also call attention to another purely algebraic fact which we have relied upon in our narrative to a significant degree. Namely, it is the fact that
AB=I    implies    BA=I.(19.6)AB=I\text{ \ \ \ implies\ \ \ \ }BA=I.\tag{19.6}
It is because of this fact that we have been able to contract the product of the covariant metric tensor ZijZ_{ij} and the contravariant metric tensor ZijZ^{ij} on any valid combination of indices to produce the Kronecker delta.
Let us give the classical matrix-based proof of this fact which will once again demonstrate the utility of the algebraic way of thinking afforded to us by matrices. It follows from Linear Algebra considerations that if AA is a square matrix with linearly independent columns then there exists a unique matrix RR such that
AR=I.(19.7)AR=I.\tag{19.7}
We will refer to RR as the "right inverse" of AA. A square matrix with linearly independent columns has linearly independent rows and therefore there exists a unique matrix LL, referred to as the left inverse of AA, such that
LA=I.(19.8)LA=I.\tag{19.8}
Our goal is to show that the left inverse and the right inverse are the same matrix, i.e.
L=R.(19.9)L=R.\tag{19.9}
Subsequently, if we denote the common value of LL and RR by BB, then the equivalence of ARAR and LALA (both being the identity matrix) becomes the equivalence of ABAB and BABA.
To show that
L=R,(19.9)L=R, \tag{19.9}
multiply both sides of the identity
LA=I(19.8)LA=I \tag{19.8}
by RR on the right, i.e.
LAR=IR.(19.10)LAR=IR.\tag{19.10}
Since AR=IAR=I, LI=LLI=L, and IR=RIR=R, we find
L=R(19.9)L=R \tag{19.9}
as we set out to show. Note that we glossed over an application of the associative property of matrix multiplication. Indeed, multiplying LALA by RR yields (LA)R\left( LA\right) R. By the associative property, we find that (LA)R=L(AR)\left( LA\right) R=L\left( AR\right) and the rest of the argument can proceed as described above. Thus, the equivalence of the left and right inverses is a direct consequence of the associative property of multiplication.
Much like the preceding proofs, this proof, too, cannot be effectively carried out in indicial notation. In fact, one could argue that this fundamental property of matrices cannot be discovered in indicial notation. This proof completes our discussion of the significant logical advantages of the matrix notation.
Another advantage of the matrix notation is that there are numerous software packages that implement various matrix operations. Consequently, any calculation formulated in terms of matrices, matrix products, and even more advanced matrix operations, can then be very effectively carried out by computers. Most readers can probably hardly remember the last time they multiplied two matrices by hand.
Finally, we must acknowledge the ubiquity of the language of matrices. Matrices represent our go-to way of visualizing data, economizing notation, and injecting algebraic structure into otherwise unstructured frameworks in a way that enables the leveraging of ideas from Linear Algebra.
Tensor Calculus, which, as we stated, does not need matrices, is a case in point. After all, it is practically impossible not to think of first- and second-order systems as matrices. We have consistently associated metric tensors with matrices, even though we preferred the phrase "corresponds to" to the word "equal" in order to preserve the logical separation between systems and matrices. For example, we would say that, in spherical coordinates,
Zij corresponds to [1r2r2sin2θ].(9.45)Z_{ij}\text{ \textit{corresponds} to }\left[ \begin{array} {ccc} 1 & & \\ & r^{2} & \\ & & r^{2}\sin^{2}\theta \end{array} \right] . \tag{9.45}
Consequently, in spherical coordinates, the dot product UV\mathbf{U} \cdot\mathbf{V} of vectors U\mathbf{U} and V\mathbf{V} with components UiU^{i} and ViV^{i}, given by the tensor identity
UV=ZijUiVj,(10.25)\mathbf{U}\cdot\mathbf{V}=Z_{ij}U^{i}V^{j}, \tag{10.25}
can be expressed in the language of matrices as follows:
UV=[U1U2U3][1r2r2sin2θ][V1V2V3].(19.11)\mathbf{U}\cdot\mathbf{V}= \begin{array} {c} \left[ \begin{array} {ccc} U^{1} & U^{2} & U^{3} \end{array} \right] \\ \\ \end{array} \left[ \begin{array} {ccc} 1 & & \\ & r^{2} & \\ & & r^{2}\sin^{2}\theta \end{array} \right] \left[ \begin{array} {c} V^{1}\\ V^{2}\\ V^{3} \end{array} \right] .\tag{19.11}
This form was frequently mentioned in Chapter 10. To many readers it likely helped clarify the interpretation of the then-novel expression ZijUiVjZ_{ij}U^{i}V^{j}.
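For the reader who wishes to verify this correspondence numerically, the following sketch, assuming NumPy is available, evaluates both the matrix product (19.11) and the contraction Z_{ij}U^{i}V^{j} at an arbitrarily chosen point with made-up components.

```python
import numpy as np

# Evaluating U . V = Z_ij U^i V^j in spherical coordinates at an arbitrary point.
r, theta = 2.0, np.pi / 3                                  # illustrative values
Z = np.diag([1.0, r**2, r**2 * np.sin(theta)**2])          # the matrix in (9.45)
U = np.array([1.0, 0.5, -2.0])                             # components U^i (made up)
V = np.array([3.0, -1.0, 0.25])                            # components V^i (made up)

matrix_form = U @ Z @ V                                    # the product in (19.11)
indicial_form = sum(Z[i, j] * U[i] * V[j] for i in range(3) for j in range(3))
print(np.isclose(matrix_form, indicial_form))              # True
```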

19.1.3 The advantages of the tensor notation

Having just described the indisputable benefits of the language of matrices, we ought to reiterate some of the advantages of the tensor notation.
First, by virtue of treating systems not as whole indivisible units but as collections of individual elements, the tensor notation allows direct access to those elements. We have taken great advantage of this feature on numerous occasions, including the analysis of quadratic form minimization in Section 8.7 and the evaluation of the derivative of the volume element Z\sqrt{Z} with respect to the coordinate ZiZ^{i} in Section 16.12.
Second, the tensor notation continues to work for systems of order greater than two. This is the case for a number of objects that we have already encountered, including the Christoffel symbol Γjki\Gamma_{jk}^{i}, the Levi-Civita symbols εijk\varepsilon_{ijk} and εijk\varepsilon^{ijk}, and the Riemann-Christoffel tensor RklijR_{klij}. In order for an identity to be beyond the scope of the matrix notation, it need not be as complicated as the celebrated tensor equation
ijTkjiTk=(ΓjmkZiΓimkZj+ΓinkΓjmnΓjnkΓimn)Tm.(15.120)\nabla_{i}\nabla_{j}T^{k}-\nabla_{j}\nabla_{i}T^{k}=\left( \frac {\partial\Gamma_{jm}^{k}}{\partial Z^{i}}-\frac{\partial\Gamma_{im}^{k} }{\partial Z^{j}}+\Gamma_{in}^{k}\Gamma_{jm}^{n}-\Gamma_{jn}^{k}\Gamma _{im}^{n}\right) T^{m}. \tag{15.120}
Even an identity involving second-order systems may require the tensor notation. For example, the identity
airajsaisajr=Aεijεrs(17.22)a_{ir}a_{js}-a_{is}a_{jr}=A\varepsilon_{ij}\varepsilon_{rs} \tag{17.22}
requires the tensor notation since it has four live indices.
Last but not least, the tensor notation has its own unique way of inspiring algebraic intuition. In particular, the natural placement of indices always accurately predicts the way an object transforms under a change of coordinates and suggests the ways in which it can be meaningfully combined with other objects. This feature of the tensor notation will be showcased throughout this Chapter.
In Section 2.7, we contrasted our approach to vectors to that often taken by Linear Algebra. According to our approach, vectors are directed segments, subject to addition according to the tip-to-tail rule and multiplication by numbers. It can be demonstrated that these operations satisfy a number of desirable properties, such as associativity and distributivity. In Linear Algebra, vectors are defined as generic objects subject to abstract operations of addition and multiplication by numbers that, by definition, satisfy the same set of desirable properties. That makes geometric vectors, i.e. directed segments, a special case of the abstract vector as defined by Linear Algebra. To distinguish between the two categories of vectors, we have used bold capital letters, such as U\mathbf{U}, V\mathbf{V}, and W\mathbf{W}, for geometric vectors, and bold lowercase letters, such as x\mathbf{x}, y\mathbf{y}, and z\mathbf{z}, for vectors in the sense of Linear Algebra.
Tensor Calculus and Linear Algebra take similarly different approaches to the related concepts of the dot product and inner product. The dot product of two geometric vectors U\mathbf{U} and V\mathbf{V} is defined by the equation
UV=lenUlenVcosγ.(2.14)\mathbf{U}\cdot\mathbf{V}=\operatorname{len}\mathbf{U}\operatorname{len} \mathbf{V}\cos\gamma. \tag{2.14}
It can be demonstrated that the dot product satisfies commutativity,
UV=VU.(2.15)\mathbf{U}\cdot\mathbf{V}=\mathbf{V}\cdot\mathbf{U.} \tag{2.15}
and distributivity
U(αV+βW)=αUV+βUW.(2.18)\mathbf{U}\cdot\left( \alpha\mathbf{V}+\beta\mathbf{W}\right) =\alpha \mathbf{U}\cdot\mathbf{V}+\beta\mathbf{U}\cdot\mathbf{W.} \tag{2.18}
The distributive property can also be referred to as linearity and is often described by stating that the dot product is linear in each argument. It is also obvious that the dot product of a nonzero vector U\mathbf{U} with itself is positive, i.e.
UU>0,   if   U0.(19.12)\mathbf{U}\cdot\mathbf{U}\gt 0\text{, \ \ if\ \ \ }\mathbf{U}\neq\mathbf{0.}\tag{19.12}
This property, known as positive definiteness, is hardly worthy of mention for geometric vectors but becomes part of the definition for generic vectors.
The Linear Algebra concept of an inner product is an axiomatic adaptation of the classical dot product. An inner product is an operation that takes two vectors and produces a number. The inner product of vectors x\mathbf{x} and y\mathbf{y} is denoted by (x,y)\left( \mathbf{x} ,\mathbf{y}\right) and is defined by commutativity
(x,y)=(y,x),(2.90)\left( \mathbf{x},\mathbf{y}\right) =\left( \mathbf{y},\mathbf{x}\right) , \tag{2.90}
distributivity, also known as linearity,
(x,αy+βz)=α(x,y)+β(x,z),(2.91)\left( \mathbf{x},\alpha\mathbf{y}+\beta\mathbf{z}\right) =\alpha\left( \mathbf{x},\mathbf{y}\right) +\beta\left( \mathbf{x},\mathbf{z}\right) , \tag{2.91}
and positive definiteness
(x,x)>0,    if    x0.(2.92)\left( \mathbf{x},\mathbf{x}\right) \gt 0,\text{\ \ \ \ if\ \ \ \ } \mathbf{x}\neq\mathbf{0.} \tag{2.92}
A vector space endowed with an inner product is called a Euclidean space -- so close is the analogy with Euclidean spaces as we introduced them in Chapter 2.
The final difference between our approach to vectors and that of Linear Algebra is the way in which a basis emerges. In Tensor Calculus, a coordinate system is chosen arbitrarily and the covariant basis Zi\mathbf{Z}_{i} is constructed by differentiating the position vector function R(Z)\mathbf{R}\left( Z\right) with respect to the coordinate ZiZ^{i}, i.e.
Zi=RZi.(9.9)\mathbf{Z}_{i}=\frac{\partial\mathbf{R}}{\partial Z^{i}}. \tag{9.9}
Thus, the covariant basis (at a given point) is specific to the chosen coordinate system. The dependence of Zi\mathbf{Z}_{i} on the choice of coordinates is underscored by the term variant. In Linear Algebra, a basis bi\mathbf{b}_{i} is arbitrarily selected. Any complete linearly independent set of vectors represents a legitimate basis. Thus, in both approaches we can talk about a change of basis, although in Tensor Calculus, a change of basis is induced by a change of coordinates.
Having contrasted the origins of the key concepts, we must now point out that their uses in the two subjects are almost identical. Tensor Calculus and Linear Algebra share the concept of the component space in which analysis is performed in terms of the components of vectors rather than vectors themselves. For example, with the help of the covariant metric tensor ZijZ_{ij} defined by
Zij=ZiZj,(14.5)Z_{ij}=\mathbf{Z}_{i}\cdot\mathbf{Z}_{j}, \tag{14.5}
the component space expression for the dot product UV\mathbf{U}\cdot\mathbf{V} of vectors U\mathbf{U} and V\mathbf{V} with components UiU^{i} and ViV^{i} reads
UV=ZijUiVj.(10.25)\mathbf{U}\cdot\mathbf{V}=Z_{ij}U^{i}V^{j}. \tag{10.25}
Similarly, the inner product matrix MM, whose entries MijM_{ij} are
Mij=(bi,bj),(19.13)M_{ij}=\left( \mathbf{b}_{i}\mathbf{,b}_{j}\right) ,\tag{19.13}
can be used to express the inner product (x,y)\left( \mathbf{x,y}\right) of vectors x\mathbf{x} and y\mathbf{y} with components xix^{i} and yiy^{i} as follows:
(x,y)=Mijxiyj.(19.14)\left( \mathbf{x,y}\right) =M_{ij}x^{i}y^{j}.\tag{19.14}
In the language of matrices, which is one of the topics discussed in this Chapter, the same operation is captured by the equation
(x,y)=xTMy(19.15)\left( \mathbf{x,y}\right) =x^{T}My\tag{19.15}
where xx and yy are the column matrices consisting of the elements xix^{i} and yiy^{i}.
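As a small illustration, the sketch below, assuming NumPy is available, evaluates an inner product both ways: np.einsum transcribes the indicial expression (19.14) literally, while the matrix product reproduces (19.15). The matrix M and the components are arbitrary.

```python
import numpy as np

M = np.diag([1.0, 4.0, 2.0])      # an example inner product matrix M_ij
x = np.array([1.0, 2.0, -1.0])    # components x^i
y = np.array([0.5, 1.0, 3.0])     # components y^j

indicial = np.einsum('ij,i,j->', M, x, y)   # M_ij x^i y^j, as in (19.14)
matrix_form = x @ M @ y                     # x^T M y, as in (19.15)
print(np.isclose(indicial, matrix_form))    # True
```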
The covariant metric tensor ZijZ_{ij} and matrix MM are essentially the same object, differing only in notation and the terminology used to describe them. We will therefore use MM in many of the same ways that we have used ZijZ_{ij}, including utilizing the symbol MijM^{ij} to denote the entries of M1M^{-1}, and performing index juggling by contracting with MijM_{ij} and MijM^{ij}.
As we have already mentioned, only first- and second-order systems can be effectively represented by matrices. As a matter of convention, let us agree to represent first-order systems by n×1n\times1 matrices. Thus, a system xix^{i} in a three-dimensional space will be represented by the column matrix
[x1x2x3].(19.16)\left[ \begin{array} {c} x^{1}\\ x^{2}\\ x^{3} \end{array} \right] .\tag{19.16}
We could have also represented xix^{i} by a row matrix [x1  x2  x3]\left[ x^{1}\ \ x^{2}\ \ x^{3}\right] . However, limiting ourselves to column matrices only will help reduce the number of possible matrix representations of contractions.
The flavor of the index, whether it indicates a tensor property of the corresponding variant or is used for convenience, has no bearing on the matrix representation of a system. Thus, a system xix_{i} is also represented by a column matrix, i.e.
[x1x2x3].(19.17)\left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right] .\tag{19.17}
Second-order systems correspond to square n×nn\times n matrices. Note that in Chapter 8, we encountered systems that correspond to rectangular matrices. As a matter of fact, systems of that sort will begin to arise naturally when we study embedded surfaces in the next volume. Nevertheless, in this Chapter, we will limit our focus to square matrices. However, all points made in this Chapter will remain valid for systems corresponding to rectangular matrices.
As we have done throughout the book, and as the most commonly accepted convention dictates, the first index of the system corresponds to the row the element is in while the second index corresponds to the column. Thus, it is essential to have complete clarity as to the order of indices. For systems with two superscripts or two subscripts, the order is obvious. For mixed systems, i.e. systems with one superscript and one subscript, the order of the indices is usually indicated by the dot placeholder technique introduced in Section 7.2. For example, in the symbols
Aji(19.18)A_{\cdot j}^{i}\tag{19.18}
and
Aij(19.19)A_{i}^{\cdot j}\tag{19.19}
the dot makes it clear that ii is the first index and jj is second.
Interestingly, we have not seen the dot placeholder that much in our narrative so far. The reason for this is that the only mixed second-order system that we have consistently encountered is the Kronecker delta δji\delta_{j}^{i} which corresponds to the symmetric matrix
[100010001].(19.20)\left[ \begin{array} {ccc} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{array} \right] .\tag{19.20}
Therefore, the order of the indices does not matter. However, as we will describe below in Section 19.5, this exception can be made only for symmetric systems, in a sense of the term symmetric that differs from its meaning for matrices.
Another category of mixed second-order systems for which the use of the placeholder is not required is the Jacobians JiiJ_{i^{\prime}}^{i} and JiiJ_{i}^{i^{\prime}}, for which we can simply agree that the superscript is first and the subscript is second. What makes this convention reliable in this case is the fact that the indices of a Jacobian are never juggled. In other words, the symbol JiiJ_{i^{\prime}}^{i} is never used to represent the combination JjjZijZijJ_{j}^{j^{\prime}}Z_{i^{\prime}j^{\prime}}Z^{ij}. Had index juggling been allowed, the symbol JiiJ_{i^{\prime}}^{i} would be ambiguous and the use of the dot placeholder would be in order. More generally, the possibility of index juggling is the very reason for not being able to use the flavors of indices as a mechanism for determining their order.
Any combination of contractions involving first- and second-order systems can be represented by matrix multiplication. In this Section, we will review the basic mechanics of matrix multiplication and subsequently show how some of the most common contractions can be expressed by matrix products.

19.4.1 The mechanics of matrix multiplication

Consider three matrices AA, BB, and CC with entries AijA_{ij}, BijB_{ij} and CijC_{ij}. Suppose that AA is an l×ml\times m matrix, BB is m×nm\times n, and CC is l×nl\times n. Then, by definition, CC is the product of AA and BB, i.e.
C=AB,(19.21)C=AB,\tag{19.21}
if
Cij=k=1mAikBkj.(19.22)C_{ij}=\sum_{k=1}^{m}A_{ik}B_{kj}.\tag{19.22}
Note that we are following the Linear Algebra tradition of using only subscripts to reference the individual entries of a matrix.
An essential element of matrix multiplication is that the summation takes place over the second index of AA and the first index of BB. Since the first index indicates the row of the entry and the second indicates the column, matrix multiplication combines the ii-th row of AA with the jj-th column of BB to produce CijC_{ij}, i.e. the entry in the ii-th row and jj-th column of CC. These mechanics are illustrated in the following figure:
ji[Cij]=i[Aik]j[Bkj](19.23)\begin{array} {r} \begin{array} {r} j\hspace{0.4in}\\ \downarrow\hspace{0.4in} \end{array} \\ \begin{array} {c} \phantom{\square} \\ \phantom{\square} i\rightarrow\\ \phantom{\square} \\ \phantom{\square} \\ \phantom{\square} \end{array} \hspace{-0.1in}\left[ \begin{array} {ccc} \square & \square & \square\\ \square & C_{ij} & \square\\ \square & \square & \square\\ \square & \square & \square\\ \square & \square & \square \end{array} \right] \end{array} =\hspace{-0.1in} \begin{array} {r} \begin{array} {ccc} & & \\ & & \end{array} \\ \begin{array} {c} \phantom{\square} \\ \phantom{\square} i\rightarrow\\ \phantom{\square} \\ \phantom{\square} \\ \phantom{\square} \end{array} \hspace{-0.1in}\left[ \begin{array} {cccc} \square & \square & \square & \square\\ \blacksquare & A_{ik} & \blacksquare & \blacksquare\\ \square & \square & \square & \square\\ \square & \square & \square & \square\\ \square & \square & \square & \square \end{array} \right] \end{array} \hspace{-0.1in} \begin{array} {r} \begin{array} {r} j\hspace{0.42in}\\ \downarrow\hspace{0.42in} \end{array} \\ \left[ \begin{array} {ccc} \square & \blacksquare & \square\\ \square & B_{kj} & \square\\ \square & \blacksquare & \square\\ \square & \blacksquare & \square \end{array} \right] \\ \\ \end{array}\tag{19.23}
Thus, the mechanics of matrix multiplication are very rigid and it is up to us to use the two available "levers" -- the order of the operands and the transpose -- to make sure the products reflect the contractions in a given tensor expression. We will now go through a series of examples converting contractions to matrix products in increasing order of complexity.
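The following short sketch, assuming NumPy is available, implements the definition (19.22) with explicit loops and checks it against the built-in matrix product; the particular matrices are arbitrary.

```python
import numpy as np

def matmul_by_definition(A, B):
    """C[i, j] = sum over k of A[i, k] * B[k, j], exactly as in (19.22)."""
    l, m = A.shape
    m2, n = B.shape
    assert m == m2, "inner dimensions must agree"
    C = np.zeros((l, n))
    for i in range(l):
        for j in range(n):
            for k in range(m):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.arange(6.0).reshape(2, 3)    # an arbitrary 2x3 example
B = np.arange(12.0).reshape(3, 4)   # an arbitrary 3x4 example
print(np.allclose(matmul_by_definition(A, B), A @ B))  # True
```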

19.4.2 Contraction of first-order systems

Let us start with the contraction
xiyi(19.24)x_{i}y^{i}\tag{19.24}
that represents the inner product of vectors x\mathbf{x} and y\mathbf{y}. Since xix_{i} corresponds to
x=[x1x2x3](19.25)x=\left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right]\tag{19.25}
and yiy^{i} corresponds to
y=[y1y2y3],(19.26)y=\left[ \begin{array} {c} y^{1}\\ y^{2}\\ y^{3} \end{array} \right] ,\tag{19.26}
the combination xiyix_{i}y^{i} corresponds either to
xTy=[x1x2x3][y1y2y3]     or     yTx=[y1y2y3][x1x2x3].(19.27)x^{T}y= \begin{array} {c} \left[ \begin{array} {ccc} x_{1} & x_{2} & x_{3} \end{array} \right] \\ \\ \end{array} \left[ \begin{array} {c} y^{1}\\ y^{2}\\ y^{3} \end{array} \right] \text{ \ \ \ \ or \ \ \ \ }y^{T}x= \begin{array} {c} \left[ \begin{array} {ccc} y^{1} & y^{2} & y^{3} \end{array} \right] \\ \\ \end{array} \left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right] .\tag{19.27}
In either case, the matrix product involves the transpose of one of the matrices. We will find this to be the case in the more complicated situations as well.
Note that the tensor product xiyjx_{i}y^{j} corresponds to the matrix product
 xyT=[x1x2x3][y1y2y3](19.28)\text{\ }xy^{T}=\left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right] \begin{array} {c} \left[ \begin{array} {ccc} y^{1} & y^{2} & y^{3} \end{array} \right] \\ \\ \end{array}\tag{19.28}
Thus, xiyix_{i}y^{i} can also be captured by evaluating the trace of the resulting matrix, i.e.
xiyi=trace([x1x2x3][y1y2y3]).(19.29)x_{i}y^{i}=\operatorname{trace}\left( \left[ \begin{array} {c} x_{1}\\ x_{2}\\ x_{3} \end{array} \right] \begin{array} {c} \left[ \begin{array} {ccc} y^{1} & y^{2} & y^{3} \end{array} \right] \\ \\ \end{array} \right) .\tag{19.29}
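Both representations are easy to confirm numerically, as in the following sketch with made-up components; NumPy is assumed.

```python
import numpy as np

x = np.array([1.0, -2.0, 0.5])   # components x_i (made-up values)
y = np.array([4.0, 1.0, 2.0])    # components y^i (made-up values)

contraction = x @ y              # x^T y, the contraction x_i y^i
outer = np.outer(x, y)           # the tensor product x_i y^j
print(np.isclose(contraction, np.trace(outer)))  # True
```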

19.4.3 Contraction of a second-order system with a first-order system

Let us now turn our attention to the contraction
Ajixj(19.30)A_{\cdot j}^{i}x^{j}\tag{19.30}
involving a second-order system AjiA_{\cdot j}^{i} and a first-order system xjx^{j}. If AA is the matrix corresponding to AjiA_{\cdot j}^{i} and xx is the matrix corresponding to xix^{i}, then
Ajixj corresponds to Ax.(19.31)A_{\cdot j}^{i}x^{j}\text{ corresponds to }Ax.\tag{19.31}
Note that the convention that the first index indicates the row while the second indicates the column is essential for reaching this conclusion. Also note that the alternative product xTATx^{T}A^{T} results in the same values, although arranged into a 1×n1\times n row matrix. Therefore, by our convention, where first-order systems are represented by column matrices, xTATx^{T}A^{T} does not properly represent the resulting system.
As we already mentioned, the flavor of the indices has no bearing on the matrix representation. Therefore, the combinations
AijxjAijxj, and Aijxj(19.32)A^{ij}x_{j}\text{, }A_{i}^{\cdot j}x_{j}\text{, and }A_{ij}x^{j}\tag{19.32}
which represent variations in the flavors of indices but not their order, are also represented by the product AxAx.
The contraction
Ajixi,(19.33)A_{\cdot j}^{i}x_{i},\tag{19.33}
on the other hand, is principally different in that it is the first index of AjiA_{\cdot j}^{i} that is engaged in the contraction. As a result, the only way to represent AjixiA_{\cdot j}^{i}x_{i} by a matrix product is
ATx.(19.34)A^{T}x.\tag{19.34}
As before, the product xTAx^{T}A produces the same values as a row matrix and therefore does not properly represent the result.
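The following sketch, assuming NumPy is available, transcribes both contractions with np.einsum and confirms the corresponding matrix products Ax and A^{T}x; the entries are arbitrary.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [4.0, 0.0, 2.0]])    # entries of the mixed system: i labels the row, j the column
x = np.array([1.0, -1.0, 2.0])     # an arbitrary first-order system

# Contraction on the second index of A, as in (19.30):  A^i_j x^j  <->  A x
print(np.allclose(np.einsum('ij,j->i', A, x), A @ x))    # True
# Contraction on the first index of A, as in (19.33):   A^i_j x_i  <->  A^T x
print(np.allclose(np.einsum('ij,i->j', A, x), A.T @ x))  # True
```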
Let us now express the transformation rules for first-order tensors in matrix form. Recall from Chapter 17, that we let
J denote the matrix corresponding to the Jacobian Jii.(17.9)J\text{ denote the matrix corresponding to the Jacobian }J_{i^{\prime}}^{i}. \tag{17.9}
Now suppose that TiT^{i} is a contravariant tensor, i.e.
Ti=TiJii.(19.35)T^{i^{\prime}}=T^{i}J_{i}^{i^{\prime}}.\tag{19.35}
Since first-order tensors correspond to column matrices, we will use lowercase letters to denote those matrices. If the matrix xx represents TiT^{i} and xx^{\prime} represents TiT^{i^{\prime}}, then
x=J1x.(19.36)x^{\prime}=J^{-1}x.\tag{19.36}
Similarly, if TiT_{i} is a covariant tensor, i.e.
Ti=TiJii,(19.37)T_{i^{\prime}}=T_{i}J_{i^{\prime}}^{i},\tag{19.37}
then its matrix representations xx and xx^{\prime} are related by the equation
x^{\prime}=J^{T}x.\tag{19.38}
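A brief numerical check of both rules is sketched below; it assumes, per equation (17.9), that the rows of J carry the superscript of the Jacobian, and the Jacobian and components are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
J = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # J[i, i'] = J^i_{i'} (arbitrary example)
Jinv = np.linalg.inv(J)                           # Jinv[i', i] = J^{i'}_i
x = rng.standard_normal(3)                        # components T^i or T_i

# Contravariant rule (19.35): T^{i'} = T^i J^{i'}_i   <->   x' = J^{-1} x
print(np.allclose(np.einsum('pi,i->p', Jinv, x), Jinv @ x))  # True
# Covariant rule (19.37):     T_{i'} = T_i J^i_{i'}   <->   x' = J^T x
print(np.allclose(np.einsum('ip,i->p', J, x), J.T @ x))      # True
```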

19.4.4 Contraction of second-order systems

A contraction of two second-order systems produces another second-order system. Thus we expect that a tensor identity featuring such a contraction corresponds to one of the many variations of the matrix equation
C=AB(19.39)C=AB\tag{19.39}
that differ by the order of AA and BB and their transposition. In fact, we can construct 16 different variations, such as
C=ATBC=ABTCT=BACT=BAT(19.40)C=A^{T}B\text{, }C=AB^{T}\text{, }C^{T}=BA\text{, }C^{T}=BA^{T}\tag{19.40}
and so on. The fact that
(AB)T=BTAT(19.41)\left( AB\right) ^{T}=B^{T}A^{T}\tag{19.41}
makes half of the 16 variations equivalent. For example,
C=ABT  and  CT=BAT,(19.42)C=AB^{T}\text{ \ and \ }C^{T}=BA^{T},\tag{19.42}
are equivalent as can be seen by taking the transpose of both sides of the first equation. This reduces the number of truly distinct equations to 8 and also means that every indicial relationship can be captured in two different, albeit equivalent, ways.
Recall, once again, that the flavors of indices have no bearing on the corresponding matrix representations. Thus, we will choose the placements of indices completely at will.
Consider three second-order systems represented by the matrices AA, BB, and CC. For our first example, consider the relationship
Cji=AkiBjk.(19.43)C_{\cdot j}^{i}=A_{\cdot k}^{i}B_{\cdot j}^{k}.\tag{19.43}
This relationship very clearly corresponds to the matrix identity
C=AB.(19.44)C=AB.\tag{19.44}
However, if the order of the indices on CjiC_{\cdot j}^{i} is switched, i.e.
Cji=AkiBjk,(19.45)C_{j}^{\cdot i}=A_{\cdot k}^{i}B_{\cdot j}^{k},\tag{19.45}
then the corresponding matrix identity is
CT=AB(19.46)C^{T}=AB\tag{19.46}
or, equivalently,
C=BTAT.(19.47)C=B^{T}A^{T}.\tag{19.47}
As the above two examples illustrate, the question of representing the contraction AkiBjkA_{\cdot k}^{i}B_{\cdot j}^{k} by a matrix product has two legitimate answers: ABAB and BTATB^{T}A^{T}, even though these expressions result in two different matrices. The two different interpretations are possible because the combination AkiBjkA_{\cdot k}^{i}B_{\cdot j}^{k}, on its own, does not give us an indication of the order of the indices in the resulting system. If, in the result, ii is the first index and jj is the second, then AkiBjkA_{\cdot k}^{i}B_{\cdot j}^{k} corresponds to ABAB. Otherwise, AkiBjkA_{\cdot k}^{i}B_{\cdot j}^{k} corresponds to BTATB^{T}A^{T}.
For another example, consider the equation
Cji=AkiBjk.(19.48)C_{\cdot j}^{i}=A_{\cdot k}^{i}B_{j}^{\cdot k}.\tag{19.48}
This contraction takes place on the second index of AkiA_{\cdot k}^{i} and the second index of BjkB_{j}^{\cdot k}. The entries in the matrices AA and BB involved in the contraction for i=2i=2 and j=3j=3 are illustrated in the following figure:
[]       [][].(19.49)\left[ \begin{array} {cccc} \square & \square & \square & \square\\ \square & \square & \blacksquare & \square\\ \square & \square & \square & \square\\ \square & \square & \square & \square \end{array} \right] \ \ \ \ \ \ \ \left[ \begin{array} {cccc} \square & \square & \square & \square\\ \blacksquare & \blacksquare & \blacksquare & \blacksquare\\ \square & \square & \square & \square\\ \square & \square & \square & \square \end{array} \right] \left[ \begin{array} {cccc} \square & \square & \square & \square\\ \square & \square & \square & \square\\ \blacksquare & \blacksquare & \blacksquare & \blacksquare\\ \square & \square & \square & \square \end{array} \right] .\tag{19.49}
Clearly, this arrangement does not correspond to a valid matrix product. In order to invoke matrix multiplication, the matrix BB needs to be transposed. Thus,
Cji=AkiBjk(19.48)C_{\cdot j}^{i}=A_{\cdot k}^{i}B_{j}^{\cdot k} \tag{19.48}
corresponds to
C=ABT.(19.50)C=AB^{T}.\tag{19.50}
Finally, consider the identity
Cji=AkiBjk.(19.51)C_{j}^{\cdot i}=A_{k}^{\cdot i}B_{\cdot j}^{k}.\tag{19.51}
The summation on the right takes place on the first index of AkiA_{k}^{\cdot i} and the first index of BjkB_{\cdot j}^{k}, which suggests the combination ATBA^{T}B. Additionally, as indicated by the symbol CjiC_{j}^{\cdot i}, the result of the contraction must be arranged in such a way that jj is the first index and ii is the second. Thus, the matrix form of the identity must use CTC^{T} instead of CC. In summary, the above relationship corresponds to the matrix identity
CT=ATB(19.52)C^{T}=A^{T}B\tag{19.52}
or, equivalently,
C=BTA.(19.53)C=B^{T}A.\tag{19.53}
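The following sketch, assuming NumPy is available, verifies the correspondences for the contractions (19.48) and (19.51); np.einsum transcribes the indicial expressions literally, and the matrices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))    # arbitrary second-order systems
B = rng.standard_normal((4, 4))

# (19.48): C^i_j = A^i_k B_j^k -- contraction over the second index of each factor
C = np.einsum('ik,jk->ij', A, B)
print(np.allclose(C, A @ B.T))     # True, matching (19.50)

# (19.51): C_j^i = A_k^i B^k_j -- contraction over the first index of each factor,
# with j serving as the first (row) index of the result
C = np.einsum('ki,kj->ji', A, B)
print(np.allclose(C, B.T @ A))     # True: C = B^T A, equivalently C^T = A^T B
```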
Let us now express the transformation rules for second-order tensors in matrix form. Suppose that TijT^{ij}, TjiT_{\cdot j}^{i}, and TijT_{ij} are tensors of the type indicated by their indicial signatures, i.e.
Tij=TijJiiJjj          (19.54)Tji=TjiJiiJjj          (19.55)Tij=TijJiiJjj.          (19.56)\begin{aligned}T^{i^{\prime}j^{\prime}} & =T^{ij}J_{i}^{i^{\prime}}J_{j}^{j^{\prime}}\ \ \ \ \ \ \ \ \ \ \left(19.54\right)\\T_{\cdot j^{\prime}}^{i^{\prime}} & =T_{\cdot j}^{i}J_{i}^{i^{\prime} }J_{j^{\prime}}^{j}\ \ \ \ \ \ \ \ \ \ \left(19.55\right)\\T_{i^{\prime}j^{\prime}} & =T_{ij}J_{i^{\prime}}^{i}J_{j^{\prime}}^{j}.\ \ \ \ \ \ \ \ \ \ \left(19.56\right)\end{aligned}
If the symbols TT and TT^{\prime} represent the matrices corresponding to the alternative manifestations of each of the three tensors, then the transformation rules in the matrix notation read
T=J1T(J1)T          (19.57)T=J1TJ          (19.58)T=JTTJ.          (19.59)\begin{aligned}T^{\prime} & =J^{-1}T\left( J^{-1}\right) ^{T}\ \ \ \ \ \ \ \ \ \ \left(19.57\right)\\T^{\prime} & =J^{-1}TJ\ \ \ \ \ \ \ \ \ \ \left(19.58\right)\\T^{\prime} & =J^{T}TJ.\ \ \ \ \ \ \ \ \ \ \left(19.59\right)\end{aligned}
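A numerical confirmation of the three rules is sketched below, again under the convention that the rows of J carry the superscript of the Jacobian; all entries are arbitrary, and NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(3)
J = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # J[i, i'] = J^i_{i'} (arbitrary example)
Jinv = np.linalg.inv(J)                           # Jinv[i', i] = J^{i'}_i
T = rng.standard_normal((3, 3))                   # components of a second-order tensor

# (19.54): T^{i'j'} = T^{ij} J^{i'}_i J^{j'}_j
print(np.allclose(np.einsum('ij,pi,qj->pq', T, Jinv, Jinv),
                  Jinv @ T @ Jinv.T))             # True, matching (19.57)
# (19.55): T^{i'}_{j'} = T^i_j J^{i'}_i J^j_{j'}
print(np.allclose(np.einsum('ij,pi,jq->pq', T, Jinv, J),
                  Jinv @ T @ J))                  # True, matching (19.58)
# (19.56): T_{i'j'} = T_{ij} J^i_{i'} J^j_{j'}
print(np.allclose(np.einsum('ij,ip,jq->pq', T, J, J),
                  J.T @ T @ J))                   # True, matching (19.59)
```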
Two systems AijA_{ij} and BijB_{ij}, related by the identity
Bij=Aji,(19.60)B_{ij}=A_{ji},\tag{19.60}
correspond to matrices that are the transposes of each other. For example, if
Aij corresponds to [147258369],(19.61)A_{ij}\text{ corresponds to }\left[ \begin{array} {ccc} 1 & 4 & 7\\ 2 & 5 & 8\\ 3 & 6 & 9 \end{array} \right] ,\tag{19.61}
then
Bij corresponds to [123456789].(19.62)B_{ij}\text{ corresponds to }\left[ \begin{array} {ccc} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & 8 & 9 \end{array} \right] .\tag{19.62}
The best way to convince yourself of this relationship is to recall the technique of unpacking described in Section 7.4. For example, B32B_{32} equals A23A_{23} which equals 8. In other words, the entry in the 3^{\text{rd}} row and 2^{\text{nd}} column of the matrix corresponding to BijB_{ij} is 8 -- the same as the entry in the 2^{\text{nd}} row and 3^{\text{rd}} column of the matrix corresponding to AijA_{ij}. Thus, the two matrices are, indeed, the transposes of each other.
The same statement holds for systems AijA^{ij} and BijB^{ij} with superscripts. If
Bij=Aji(19.63)B^{ij}=A^{ji}\tag{19.63}
then the two systems correspond to matrices that are the transposes of each other.
Similarly, two mixed systems AjiA_{\cdot j}^{i} and BijB_{i}^{\cdot j} related by the identity
Bij=Aij(19.64)B_{i}^{\cdot j}=A_{\cdot i}^{j}\tag{19.64}
correspond to matrices that are the transposes of each other. It is noteworthy, however, that in order for this relationship to hold, the order in which the superscript and the subscript appear in the two systems must be reversed.
A system AijA_{ij} is called symmetric if it satisfies the identity
Aij=Aji.(19.65)A_{ij}=A_{ji}.\tag{19.65}
Note that a symmetric system AijA_{ij} corresponds to a symmetric matrix. Recall that a matrix AA is symmetric if it equals its transpose, i.e.
AT=A.(19.66)A^{T}=A.\tag{19.66}
The term symmetric makes sense since the values of the entries exhibit a mirror symmetry with respect to the main diagonal, e.g.
[510142023].(19.67)\left[ \begin{array} {rrr} 5 & 1 & 0\\ 1 & 4 & 2\\ 0 & 2 & 3 \end{array} \right] .\tag{19.67}
Similarly, a system AijA^{ij} with two superscripts is called symmetric if
Aij=Aji.(19.68)A^{ij}=A^{ji}.\tag{19.68}
Such a system, too, corresponds to a symmetric matrix.
Interestingly, the concept of symmetry appears to be problematic for a mixed system AjiA_{\cdot j}^{i}. The would-be definition of symmetry
Aji=Aij(-)A_{\cdot j}^{i}=A_{\cdot i}^{j} \tag{-}
is invalid on the notational level. This should give us serious pause. Over the course of our narrative, we have learned to trust the intimation given to us by the tensor notation. We must, therefore, accept the fact that the concept of symmetry cannot be applied to a mixed system in the above form. If we ignore what the notation is telling us, and call a mixed system symmetric if it corresponds to a symmetric matrix, we will run into contradictions later on. Specifically, suppose that a "symmetric" AjiA_{\cdot j}^{i} is the manifestation of a tensor in some coordinate system or, to use the language of Linear Algebra, the manifestation of a tensor with respect to a particular basis. Then AA transforms under a change of coordinates (or a change of basis) according to the rule
Aji=AjiJiiJjj.(19.69)A_{\cdot j^{\prime}}^{i^{\prime}}=A_{\cdot j}^{i}J_{i}^{i^{\prime} }J_{j^{\prime}}^{j}.\tag{19.69}
Thus, if AjiA_{\cdot j}^{i} corresponds to a symmetric matrix, then AjiA_{\cdot j^{\prime}}^{i^{\prime}} will most likely not. For a simple illustration, note that for a symmetric matrix
[1552](19.70)\left[ \begin{array} {rr} 1 & 5\\ 5 & -2 \end{array} \right]\tag{19.70}
the combination
[2132][1552][2132]1(19.71)\left[ \begin{array} {cc} 2 & 1\\ 3 & 2 \end{array} \right] \left[ \begin{array} {rr} 1 & 5\\ 5 & -2 \end{array} \right] \left[ \begin{array} {cc} 2 & 1\\ 3 & 2 \end{array} \right] ^{-1}\tag{19.71}
equals
[10979],(19.72)\left[ \begin{array} {cc} -10 & 9\\ -7 & 9 \end{array} \right] ,\tag{19.72}
which is no longer symmetric. Thus, when a mixed system AjiA_{\cdot j}^{i} corresponds to a symmetric matrix, it is as much a characterization of the particular coordinate system (or basis) as it is of AjiA_{\cdot j}^{i} itself. As we mentioned earlier, the Kronecker delta δji\delta_{j}^{i} is the sole exception to this rule because it corresponds to the identity matrix in all coordinate systems.
Fortunately, the tensor notation not only warns us of a potential problem, but also presents us with a clear path forward. Namely, no rules of the tensor notation would be violated if we called a system AjiA_{\cdot j}^{i} symmetric when
Aji=Aji.(19.73)A_{\cdot j}^{i}=A_{j}^{\cdot i}.\tag{19.73}
This works on the notational level, but what is the system AjiA_{j}^{\cdot i} and how is it related to AjiA_{\cdot j}^{i}? The tensor framework provides the answer to this question, as well: AjiA_{j}^{\cdot i} is AjiA_{\cdot j}^{i} with a lowered first index and a raised second index. In other words,
Aji=AsrMrjMsi,(19.74)A_{j}^{\cdot i}=A_{\cdot s}^{r}M_{rj}M^{si},\tag{19.74}
where we are using the inner product matrix (or, in the language of tensors, the metric tensor) MM for juggling indices. This enables us to define what it means for a mixed system AjiA_{\cdot j}^{i} to be symmetric, albeit not in isolation but in the context of an inner product, that is, in the context of the associated Euclidean space framework.
Specifically, a mixed system AjiA_{\cdot j}^{i} is said to be symmetric if
Aji=Aji,(19.73)A_{\cdot j}^{i}=A_{j}^{\cdot i}, \tag{19.73}
where AjiA_{j}^{\cdot i} is the result of index juggling on AjiA_{\cdot j}^{i}.
It is worth reiterating that this new definition of symmetry does not correspond to the concept of symmetry as applied to matrices. In particular, the matrix corresponding to a system AjiA_{\cdot j}^{i} that satisfies the identity above will not be symmetric, except in very special circumstances. Also note that the identity
Aji=Aji(19.73)A_{\cdot j}^{i}=A_{j}^{\cdot i} \tag{19.73}
indicates that the matrices corresponding to AjiA_{\cdot j}^{i} and AjiA_{j}^{\cdot i} are not equal, but rather the transposes of each other. Note, however, that for a system AjiA_{\cdot j}^{i} that satisfies this identity, the placeholder may be safely omitted, i.e. the symbol AjiA_{\cdot j}^{i} can be replaced with AjiA_{j}^{i}. After all, the above identity tells us that it does not matter whether AjiA_{j}^{i} is interpreted as AjiA_{\cdot j}^{i} or AjiA_{j}^{\cdot i}.
Also be reminded that, unlike systems with two subscripts or two superscripts, the concept of symmetry, as applied to mixed systems, requires the availability of an inner product, i.e. the context of a Euclidean space. To make the above definition more explicit, we could have written it in the form
Aji=AsrMrjMsi,(19.75)A_{\cdot j}^{i}=A_{\cdot s}^{r}M_{rj}M^{si},\tag{19.75}
although this form seems to obfuscate, rather than clarify, the matter. However, this form does represent an explicit recipe for testing whether a system AjiA_{\cdot j}^{i} is symmetric. For example, suppose that
Mij corresponds to [142].(19.76)M_{ij}\text{ corresponds to }\left[ \begin{array} {ccc} 1 & & \\ & 4 & \\ & & 2 \end{array} \right] .\tag{19.76}
We will now show that if the system
Aji corresponds to [21214345415272](19.77)A_{\cdot j}^{i}\text{ corresponds to }\left[ \begin{array} {ccc} 2 & 1 & 2\\ \frac{1}{4} & \frac{3}{4} & \frac{5}{4}\\ 1 & \frac{5}{2} & \frac{7}{2} \end{array} \right]\tag{19.77}
then it is symmetric. In order to apply the definition
Aji=AsrMrjMsi,(19.75)A_{\cdot j}^{i}=A_{\cdot s}^{r}M_{rj}M^{si}, \tag{19.75}
note that the combination AsrMrjMsiA_{\cdot s}^{r}M_{rj}M^{si} corresponds to MAM1MAM^{-1}, i.e.
[100040002][21214345415272][100040002]1=[21411345225472].(19.78)\left[ \begin{array} {ccc} 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & 2 \end{array} \right] \left[ \begin{array} {ccc} 2 & 1 & 2\\ \frac{1}{4} & \frac{3}{4} & \frac{5}{4}\\ 1 & \frac{5}{2} & \frac{7}{2} \end{array} \right] \left[ \begin{array} {ccc} 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & 2 \end{array} \right] ^{-1}=\left[ \begin{array} {ccc} 2 & \frac{1}{4} & 1\\ 1 & \frac{3}{4} & \frac{5}{2}\\ 2 & \frac{5}{4} & \frac{7}{2} \end{array} \right] .\tag{19.78}
The resulting matrix is the transpose of the matrix corresponding to AjiA_{\cdot j}^{i} proving that AjiA_{\cdot j}^{i} is symmetric.
The symmetry criterion can be simplified by applying a single index juggling step to the equation
Aji=Aji.(19.73)A_{\cdot j}^{i}=A_{j}^{\cdot i}. \tag{19.73}
Lowering the index ii on both sides, we find
Aij=Aji,(19.65)A_{ij}=A_{ji}, \tag{19.65}
Thus, AjiA_{\cdot j}^{i} is symmetric if the related system AijA_{ij} is symmetric. This is a simpler criterion since a system with two subscripts is symmetric if it corresponds to a symmetric matrix. To illustrate this criterion with the example above, note that Aij=MikAjkA_{ij}=M_{ik}A_{\cdot j}^{k} which corresponds to the matrix product MAMA, i.e.
Aij corresponds to [100040002][21214345415272]=[212135257].(19.79)A_{ij}\text{ corresponds to }\left[ \begin{array} {ccc} 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & 2 \end{array} \right] \left[ \begin{array} {ccc} 2 & 1 & 2\\ \frac{1}{4} & \frac{3}{4} & \frac{5}{4}\\ 1 & \frac{5}{2} & \frac{7}{2} \end{array} \right] =\left[ \begin{array} {ccc} 2 & 1 & 2\\ 1 & 3 & 5\\ 2 & 5 & 7 \end{array} \right] .\tag{19.79}
Since the resulting matrix is symmetric, AijA_{ij} is symmetric and therefore AjiA_{\cdot j}^{i} is symmetric.
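The calculations above are easy to reproduce numerically, as the following sketch, assuming NumPy is available, illustrates with the matrices from equations (19.76) and (19.77).

```python
import numpy as np

M = np.diag([1.0, 4.0, 2.0])           # the inner product matrix from (19.76)
A = np.array([[2.0,  1.0,  2.0 ],
              [0.25, 0.75, 1.25],
              [1.0,  2.5,  3.5 ]])     # the mixed system from (19.77)

# Criterion (19.75): the combination corresponds to M A M^{-1},
# which must equal the transpose of A for a symmetric system.
print(np.allclose(M @ A @ np.linalg.inv(M), A.T))  # True

# Simpler criterion: the matrix M A, representing A_ij, must be symmetric.
MA = M @ A
print(np.allclose(MA, MA.T))                       # True
```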
The concept of a linear transformation is one of the most fundamental elements in Linear Algebra. Recall that a linear transformation TT can be represented by a matrix AA in the component space. In this regard, linear transformations and inner products are similar in that both are represented by matrices in the component space: the matrix AA in the case of linear transformations and the matrix MM in the case of inner products. However, this similitude appears stronger than it really is and the tensor notation alerts us to a fundamental difference between the two matrices.
A linear transformation TT maps one vector to another. The original vector is known as the preimage and the result of the transformation is known as the image. Denote the preimage by x\mathbf{x} and its image by y\mathbf{y}, i.e.
y=T(x).(19.80)\mathbf{y}=T\left( \mathbf{x}\right) .\tag{19.80}
The role of the matrix AA is to convert the components xix^{i} of the preimage into the components yjy^{j} of the image. Thus, in the tensor notation, the matrix AA naturally corresponds to a mixed system AjiA_{\cdot j}^{i} which enables us to write
yi=Ajixj  .(19.81)y^{i}=A_{\cdot j}^{i}x^{j}\ \ .\tag{19.81}
An inner product, on the other hand, takes two vectors x\mathbf{x} and y\mathbf{y} as inputs and produces a scalar. This is why the matrix MM naturally appears with two subscripts, so as to enable the combination
Mijxiyj(19.82)M_{ij}x^{i}y^{j}\tag{19.82}
that results in (x,y)\left( \mathbf{x,y}\right) .
Thus, under a change of basis, the matrices AA and MM transform by different rules. By the quotient theorem, discussed in Chapter 14, the variant AjiA_{\cdot j}^{i} is a tensor of the type indicated by its indicial signature and therefore transforms according to
Aji=AjiJiiJjj.(19.83)A_{\cdot j^{\prime}}^{i^{\prime}}=A_{\cdot j}^{i}J_{i}^{i^{\prime} }J_{j^{\prime}}^{j}.\tag{19.83}
By the same theorem, the variant MijM_{ij} is a tensor of the type indicated by its indicial signature and therefore transforms according to
Mij=MijJiiJjj.(19.84)M_{i^{\prime}j^{\prime}}=M_{ij}J_{i^{\prime}}^{i}J_{j^{\prime}}^{j}.\tag{19.84}
If AA^{\prime} is the matrix representing the linear transformation TT in the alternative basis bi\mathbf{b}_{i^{\prime}}, then the equation
Aji=AjiJiiJjj(19.83)A_{\cdot j^{\prime}}^{i^{\prime}}=A_{\cdot j}^{i}J_{i}^{i^{\prime} }J_{j^{\prime}}^{j} \tag{19.83}
tells us that
A=J1AJ,(19.85)A^{\prime}=J^{-1}AJ,\tag{19.85}
where JJ denotes the matrix corresponding to the Jacobian JjjJ_{j^{\prime}}^{j}. This formula can be found in equation (11)\left( 11\right) of Chapter 2 of I.M. Gelfand's Lectures on Linear Algebra.
Similarly, if MM^{\prime} represents the same inner product in the alternative basis bi\mathbf{b}_{i^{\prime}}, then the equation
Mij=MijJiiJjj(19.84)M_{i^{\prime}j^{\prime}}=M_{ij}J_{i^{\prime}}^{i}J_{j^{\prime}}^{j} \tag{19.84}
tells us that
M=JTMJ.(19.86)M^{\prime}=J^{T}MJ.\tag{19.86}
This formula can be found in equation (7)\left( 7\right) of Chapter 1 of the same book.
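The following sketch, assuming NumPy is available, contrasts the two transformation rules for an arbitrary Jacobian; it also confirms that the rule (19.86) preserves the symmetry of M, a point revisited in Exercise 19.3.

```python
import numpy as np

rng = np.random.default_rng(4)
J = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # an arbitrary change-of-basis Jacobian
A = rng.standard_normal((3, 3))                   # matrix of some linear transformation
M = np.eye(3) + 0.1 * np.ones((3, 3))             # an arbitrary symmetric inner product matrix

A_prime = np.linalg.inv(J) @ A @ J                # rule (19.85)
M_prime = J.T @ M @ J                             # rule (19.86)

print(np.allclose(M_prime, M_prime.T))            # True: (19.86) preserves the symmetry of M
print(np.allclose(A_prime, J.T @ A @ J))          # False in general: the two rules differ
```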
Thus, it may be said that the matrix notation blurs the distinction between the component space representations of inner products -- or, more generally, bilinear forms -- and linear transformations since both concepts are represented by matrices. In fact, this similitude creates the temptation to consider a one-to-one correspondence between bilinear forms and linear transformations. Such a correspondence indeed exists but, in order to avoid internal contradictions, it needs to be carefully framed. The tensor notation can be instrumental in helping us establish the right correspondence -- the task to which we now turn.
An inner product is a special case of a more general operation known as a bilinear form B(x,y)B\left( \mathbf{x},\mathbf{y}\right) defined as an operation that takes two vectors and produces a number subject only to linearity in each argument, i.e.
B(x,αy+βz)=αB(x,y)+βB(x,z)(19.87)B\left( \mathbf{x},\alpha\mathbf{y}+\beta\mathbf{z}\right) =\alpha B\left( \mathbf{x},\mathbf{y}\right) +\beta B\left( \mathbf{x},\mathbf{z}\right)\tag{19.87}
and
B(αx+βy,z)=αB(x,z)+βB(y,z).(19.88)B\left( \alpha\mathbf{x}+\beta\mathbf{y},\mathbf{z}\right) =\alpha B\left( \mathbf{x},\mathbf{z}\right) +\beta B\left( \mathbf{y},\mathbf{z}\right) .\tag{19.88}
Thus, an inner product is a symmetric positive definite bilinear form, where the term symmetric refers to the commutative property. Rather than introduce a special letter to distinguish inner products from general bilinear forms, Linear Algebra simply drops the letter altogether for inner products resulting in the symbol (x,y)\left( \mathbf{x},\mathbf{y}\right) .
Much like inner products, bilinear forms are represented by matrices in the component space. If x=xibi\mathbf{x}=x^{i}\mathbf{b}_{i} and y=yibi\mathbf{y} =y^{i}\mathbf{b}_{i}, then, by linearity,
B(x,y)=B(xibi,yjbj)=B(bi,bj)xiyj.(19.89)B\left( \mathbf{x},\mathbf{y}\right) =B\left( x^{i}\mathbf{b}_{i} ,y^{j}\mathbf{b}_{j}\right) =B\left( \mathbf{b}_{i},\mathbf{b}_{j}\right) x^{i}y^{j}.\tag{19.89}
Thus, if the matrix BB with entries BijB_{ij} is defined by
Bij=B(bi,bj)(19.90)B_{ij}=B\left( \mathbf{b}_{i},\mathbf{b}_{j}\right)\tag{19.90}
then
B(x,y)=Bijxiyj(19.91)B\left( \mathbf{x},\mathbf{y}\right) =B_{ij}x^{i}y^{j}\tag{19.91}
or, in the language of matrices,
B(x,y)=xTBy.(19.92)B\left( \mathbf{x},\mathbf{y}\right) =x^{T}By.\tag{19.92}
Naturally, BB transforms under a change of basis by the rule
B^{\prime}=J^{T}BJ.\tag{19.93}
As we have already discussed, the fact that both linear transformations and bilinear forms are represented by matrices in the component space creates the temptation to propose a one-to-one correspondence between linear transformations and bilinear forms. The naive way of doing this is to state that a linear transformation TT corresponds to a bilinear form BB if the two are represented by the same matrix. However, this approach is clearly self-contradictory since matrices representing linear transformations and bilinear forms transform differently under a change of basis. As a result, a linear transformation and a bilinear form that happen to be represented by the same matrix with respect to a particular basis, will most certainly be represented by two different matrices with respect to another basis.
As is often the case, the remedy is found by going back to the Euclidean framework where one can develop the idea in a geometric setting. For a linear transformation TT, define the corresponding bilinear form BB by the equation
B(x,y)=(x,T(y)).(19.94)B\left( \mathbf{x},\mathbf{y}\right) =\left( \mathbf{x},T\left( \mathbf{y}\right) \right) .\tag{19.94}
In other words, B(x,y)B\left( \mathbf{x},\mathbf{y}\right) is the inner product between x\mathbf{x} and T(y)T\left( \mathbf{y}\right) .
Let us now use the tensor approach to determine the relationship between the matrix BB representing the bilinear form with respect to a basis bi\mathbf{b}_{i} and the matrix AA representing the linear transformation with respect to the same basis. We have
B(x,y)=Bijxiyj,(19.95)B\left( \mathbf{x},\mathbf{y}\right) =B_{ij}x^{i}y^{j},\tag{19.95}
while
T(y)=Ajiyjbi.(19.96)T\left( \mathbf{y}\right) =A_{\cdot j}^{i}y^{j}\mathbf{b}_{i}.\tag{19.96}
Therefore,
(x,T(y))=MikAjixkyj.(19.97)\left( \mathbf{x},T\left( \mathbf{y}\right) \right) =M_{ik}A_{\cdot j} ^{i}x^{k}y^{j}.\tag{19.97}
Switch the names of the indices ii and kk so that xkx^{k} appears as xix^{i} , i.e.
(x,T(y))=MkiAjkxiyj.(19.98)\left( \mathbf{x},T\left( \mathbf{y}\right) \right) =M_{ki}A_{\cdot j} ^{k}x^{i}y^{j}.\tag{19.98}
Since, by construction, B(x,y)B\left( \mathbf{x},\mathbf{y}\right) equals (x,T(y))\left( \mathbf{x},T\left( \mathbf{y}\right) \right) for all x\mathbf{x} and y\mathbf{y}, we conclude that
Bijxiyj=MkiAjkxiyj(19.99)B_{ij}x^{i}y^{j}=M_{ki}A_{\cdot j}^{k}x^{i}y^{j}\tag{19.99}
for all xix^{i} and yjy^{j}. Therefore,
Bij=MkiAjk ,(19.100)B_{ij}=M_{ki}A_{\cdot j}^{k}\ ,\tag{19.100}
which is precisely the relationship we set out to determine. Invoking index juggling by the "metric tensor" MkiM_{ki}, we arrive at the simple relationship
Bij=Aij.(19.101)B_{ij}=A_{ij}.\tag{19.101}
Thus, somewhat ironically, the naive approach of associating a linear transformation with the bilinear form represented by the same matrix is not too far off. The only adjustment that one needs to make is to require -- to use tensor terminology -- the lowering of the superscript on the matrix representing the linear transformation.
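As a small numerical check, the sketch below, assuming NumPy is available, confirms that the matrix MA does represent the bilinear form (x, T(y)) for an arbitrary inner product matrix M and transformation matrix A.

```python
import numpy as np

rng = np.random.default_rng(5)
M = np.diag([1.0, 4.0, 2.0])           # an example inner product matrix
A = rng.standard_normal((3, 3))        # an example matrix of a linear transformation T
x = rng.standard_normal(3)
y = rng.standard_normal(3)

# By (19.100)-(19.101), the bilinear form B(x, y) = (x, T(y)) is represented by B = M A.
B = M @ A
print(np.isclose(x @ B @ y, x @ M @ (A @ y)))  # True: x^T B y equals (x, T(y))
```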
A linear transformation TT is called self-adjoint if the identity
(x,T(y))=(T(x),y)(19.102)\left( \mathbf{x},T\left( \mathbf{y}\right) \right) =\left( T\left( \mathbf{x}\right) ,\mathbf{y}\right)\tag{19.102}
holds for all vectors x\mathbf{x} and y\mathbf{y}. In other words, the inner product of x\mathbf{x} and T(y)T\left( \mathbf{y}\right) is the same as that of T(x)T\left( \mathbf{x}\right) and y\mathbf{y}, i.e. it does not matter whether TT is applied to x\mathbf{x} or y\mathbf{y}. The concept of a self-adjoint transformation is interesting for its "reverse commute": while most ideas in Linear Algebra were developed for vectors and subsequently extended to the component space, the concept of a self-adjoint transformation clearly arose out of symmetric matrices and their special properties. But are self-adjoint linear transformations necessarily represented by symmetric matrices? We will now answer this question.
As we discovered in the previous Section, the inner product (x,T(y))\left( \mathbf{x},T\left( \mathbf{y}\right) \right) is given by
(x,T(y))=MkiAjkxiyj.(19.103)\left( \mathbf{x},T\left( \mathbf{y}\right) \right) =M_{ki}A_{\cdot j} ^{k}x^{i}y^{j}.\tag{19.103}
Similarly, (T(x),y)\left( T\left( \mathbf{x}\right) ,\mathbf{y}\right) is given by
(T(x),y)=MkjAikxiyj.(19.104)\left( T\left( \mathbf{x}\right) ,\mathbf{y}\right) =M_{kj}A_{\cdot i} ^{k}x^{i}y^{j}.\tag{19.104}
Since, for a self-adjoint TT, (x,T(y))\left( \mathbf{x},T\left( \mathbf{y}\right) \right) equals (T(x),y)\left( T\left( \mathbf{x}\right) ,\mathbf{y}\right) for all x\mathbf{x} and y\mathbf{y}, we conclude that
MkiAjkxiyj=MkjAikxiyj(19.105)M_{ki}A_{\cdot j}^{k}x^{i}y^{j}=M_{kj}A_{\cdot i}^{k}x^{i}y^{j}\tag{19.105}
for all xix^{i} and yiy^{i}. Therefore
MkiAjk=MkjAik,(19.106)M_{ki}A_{\cdot j}^{k}=M_{kj}A_{\cdot i}^{k},\tag{19.106}
which is precisely the characterization of the matrix AA we were looking for. In other words, the linear transformation TT is self-adjoint if it is represented by a system AjiA_{\cdot j}^{i} that is symmetric in the sense of the definition given in Section 19.5. To see this, allow the inner product matrix MM (i.e. the "metric tensor") to lower the superscript kk on both sides, leading to the more recognizable form
Aij=Aji.(19.107)A_{ij}=A_{ji}.\tag{19.107}
Raising ii on both sides also yields
Aji=Aji.(19.108)A_{\cdot j}^{i}=A_{j}^{\cdot i}.\tag{19.108}
As we have already pointed out, in the language of matrices, the criterion for a matrix representing a self-adjoint transformation reads
A=M^{-1}A^{T}M.\tag{19.109}
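The sketch below, assuming NumPy is available, tests both the defining identity of self-adjointness and the matrix criterion (19.109) on the symmetric system used in Section 19.5; the vectors are arbitrary.

```python
import numpy as np

M = np.diag([1.0, 4.0, 2.0])           # the inner product matrix used in Section 19.5
A = np.array([[2.0,  1.0,  2.0 ],
              [0.25, 0.75, 1.25],
              [1.0,  2.5,  3.5 ]])     # a mixed system that is symmetric in the sense of 19.5

rng = np.random.default_rng(6)
x, y = rng.standard_normal(3), rng.standard_normal(3)

# Self-adjointness of T: (x, T(y)) = (T(x), y) for arbitrary x and y.
print(np.isclose(x @ M @ (A @ y), (A @ x) @ M @ y))  # True
# The matrix criterion (19.109): A = M^{-1} A^T M.
print(np.allclose(A, np.linalg.inv(M) @ A.T @ M))    # True
```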
Exercise 19.1 Show that the tensor equation
Cjk=AjiBik(19.110)C_{j}^{\cdot k}=A_{\cdot j}^{i}B_{\cdot i}^{k}\tag{19.110}
corresponds to the matrix equation
CT=BA(19.111)C^{T}=BA\tag{19.111}
or, equivalently,
C=ATBT.(19.112)C=A^{T}B^{T}.\tag{19.112}
Exercise 19.2 Show that the tensor equation
Cki=AijBkj(19.113)C_{ki}=A_{ij}B_{\cdot k}^{j}\tag{19.113}
corresponds to the matrix equation
CT=AB.(19.114)C^{T}=AB.\tag{19.114}
Exercise 19.3 Show that, for a symmetric matrix AA and any matrix XX, the combination
XAXT(19.115)XAX^{T}\tag{19.115}
is symmetric. Thus, if a tensor AijA_{ij} is symmetric in a component space corresponding to one basis, it is symmetric in a component space corresponding to any basis.
Exercise 19.4 If the component space basis bi\mathbf{b}_{i} is orthonormal, show that a symmetric system AjiA_{\cdot j}^{i} is represented by a symmetric matrix.