Coordinate Systems in a Euclidean Space

A coordinate system is a method for enumerating points in a Euclidean space by numbers. In order for the coordinate system to be reasonably regular, the number of coordinates must match the dimension of the Euclidean space, i.e. three coordinates in a three-dimensional space, two coordinates on a plane, and one coordinate on a straight line. Furthermore, the correspondence between the points and the coordinates must be reasonably smooth. More precisely, the position vector $\mathbf{R}$ should be a sufficiently differentiable function of the coordinates. Other than this requirement, a coordinate system may be completely arbitrary.
For a general coordinate system, the coordinates will be denoted by the capital letter $Z$ with a superscript, i.e. $Z^{1},Z^{2},Z^{3}$ or, collectively, $Z^{i}$. When indicating the coordinates of a particular point, we will put the coordinates in parentheses, i.e. $\left( Z^{1},Z^{2},Z^{3}\right)$. The unusual placement of the index as a superscript is a crucial element of the tensor notation which is the bedrock of Tensor Calculus. Generally speaking, the term tensor notation refers to the use of indices, both as superscripts and subscripts, to enumerate sets of related objects. Its most basic elements will be described in Chapter 7. Many of its other important elements, such as index juggling, will emerge in later chapters.
The use of a superscript for enumerating coordinates is a completely arbitrary choice, and we could have just as well chosen to use a subscript. However, Tensor Calculus has strict rules for coordinating the placements of indices. Once we have chosen to use a superscript for coordinates, the placement of all other indices is uniquely determined.
Obviously, there is an unlimited number of ways to impose a coordinate system upon a Euclidean space. There are a handful of well-known families of coordinate systems that are frequently used for analyzing problems with special geometries. The most common coordinate systems in a three-dimensional space are Cartesian or, more generally, affine coordinates denoted by $x,y,z$, cylindrical coordinates denoted by $r,\theta,z$, and spherical coordinates denoted by $r,\theta,\varphi$. In two dimensions, the most common coordinate systems are once again Cartesian or affine coordinates denoted by $x,y$ and polar coordinates denoted by $r,\theta$.
In two dimensions, a coordinate system can be represented graphically by its coordinate lines, i.e. curves that consist of points that correspond to a fixed value of one variable while the other is allowed to vary. The following figure illustrates the coordinate lines for a generic coordinate system $Z^{1},Z^{2}$ in the plane.
(6.1)
In three dimensions, coordinate lines are replaced by coordinate surfaces, i.e. surfaces that correspond to a fixed value of one variable while the other two are allowed to vary. We will use this method of illustrating coordinate systems for cylindrical and spherical coordinates later in this Chapter.
In the early chapters, we discovered the impressive utility of geometric vectors when treated as pure geometric objects. However, we also observed the serious limitations of pure geometric methods. Most of these limitations are removed by the use of coordinate systems. We will now begin to explore the remarkable power of analytical methods that leverage the utility of coordinate systems.
For a simple but effective demonstration, let us revisit the problem of differentiating the vector-valued function $\mathbf{R}\left( \gamma\right)$ that corresponds to the unit circle as $\gamma$ changes from $0$ to $2\pi$.
(6.2)
In Chapter 4, we solved this problem by a geometric analysis in a coordinate-free setting. Our solution was intuitive, insightful, and intellectually satisfying. On the other hand, our argument was lengthy and is, in practice, applicable only to very simple problems. Just imagine an $\mathbf{R}\left( \gamma\right)$ that traces out an ellipse instead of a circle -- the problem instantly becomes worthy of an eighteenth-century graduate thesis. With the help of coordinates, the circle and the ellipse are equally simple and the solution is quicker, more straightforward and more powerful compared to the coordinate-free approach.
Introduce a Cartesian coordinate system $x,y$ with the origin at the center of the circle. Let the unit vectors pointing in the direction of the coordinate axes be denoted by $\mathbf{i}$ and $\mathbf{j}$.
(6.3)
Then $\mathbf{R}\left( \gamma\right)$ is given by the equation
$$\mathbf{R}\left( \gamma\right) =\mathbf{i}\cos\gamma+\mathbf{j}\sin\gamma\tag{6.4}$$
for which differentiation with respect to $\gamma$ readily yields
$$\mathbf{R}^{\prime}\left( \gamma\right) =-\mathbf{i}\sin\gamma+\mathbf{j}\cos\gamma.\tag{6.5}$$
The resulting analytical expression for $\mathbf{R}^{\prime}\left( \gamma\right)$ can now be interpreted geometrically. The following figure shows $\mathbf{R}^{\prime}\left( \gamma\right)$ placed at the tip of $\mathbf{R}\left( \gamma\right)$.
(6.6)
It is clear that the vector $\mathbf{R}^{\prime}\left( \gamma\right)$ is a unit vector orthogonal to $\mathbf{R}\left( \gamma\right)$. This can also be verified by evaluating the dot products $\mathbf{R}^{\prime}\left( \gamma\right) \cdot\mathbf{R}^{\prime}\left( \gamma\right)$ and $\mathbf{R}\left( \gamma\right) \cdot\mathbf{R}^{\prime}\left( \gamma\right)$. Note that the inner product matrix with respect to the basis $\mathbf{i},\mathbf{j}$ is the identity matrix, i.e.
$$\left[ \begin{array}{cc} \mathbf{i}\cdot\mathbf{i} & \mathbf{i}\cdot\mathbf{j}\\ \mathbf{i}\cdot\mathbf{j} & \mathbf{j}\cdot\mathbf{j} \end{array} \right] =\left[ \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right] ,\tag{6.7}$$
and, therefore, according to the formula for evaluating dot products in the component space derived in Section 2.6,
$$\mathbf{R}^{\prime}\left( \gamma\right) \cdot\mathbf{R}^{\prime}\left( \gamma\right) =\left[ \begin{array}{cc} -\sin\gamma & \cos\gamma \end{array} \right] \left[ \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right] \left[ \begin{array}{r} -\sin\gamma\\ \cos\gamma \end{array} \right] =1\tag{6.8}$$
and
$$\mathbf{R}\left( \gamma\right) \cdot\mathbf{R}^{\prime}\left( \gamma\right) =\left[ \begin{array}{cc} \cos\gamma & \sin\gamma \end{array} \right] \left[ \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right] \left[ \begin{array}{r} -\sin\gamma\\ \cos\gamma \end{array} \right] =0\tag{6.9}$$
where the first identity confirms that $\mathbf{R}^{\prime}\left( \gamma\right)$ has unit length and the second confirms that it is orthogonal to $\mathbf{R}\left( \gamma\right)$.
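The two identities can also be spot-checked numerically. The following is a minimal sketch, assuming that vectors are stored as pairs of components with respect to the orthonormal basis $\mathbf{i},\mathbf{j}$; the helper names `R`, `R_prime`, and `dot` are illustrative choices and do not come from the text.

```python
import math

def R(gamma):
    """Components of R(gamma) with respect to the basis i, j (unit circle)."""
    return (math.cos(gamma), math.sin(gamma))

def R_prime(gamma):
    """Components of R'(gamma), obtained by differentiating each component."""
    return (-math.sin(gamma), math.cos(gamma))

def dot(u, v):
    """Dot product in a basis whose inner product matrix is the identity."""
    return u[0] * v[0] + u[1] * v[1]

# Sample a few parameter values; the first product should be 1, the second 0,
# up to floating-point rounding.
for gamma in (0.0, 0.7, 2.0, 5.5):
    print(gamma, dot(R_prime(gamma), R_prime(gamma)), dot(R(gamma), R_prime(gamma)))
```

Note that the simple form of `dot` relies on the basis being orthonormal; for a general basis, the inner product matrix would have to appear between the two component arrays.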
Consider the problem that appeared in Exercise 4.16 in Chapter 4. Given a point $A$ and a curve $\Gamma$, show that for the point $B$ on $\Gamma$ that is closest to $A$, the segment $AB$ is orthogonal to $\Gamma$.
(6.10)
The intended solution was as follows. Let $\mathbf{R}\left( \gamma\right)$ be the vector equation of the curve $\Gamma$, where the origin for the position vector $\mathbf{R}$ is placed at $A$. The problem then is to find the value $\gamma_{0}$ of $\gamma$ that yields the shortest vector $\mathbf{R}\left( \gamma\right)$.
(6.11)
Denote the objective function by $F\left( \gamma\right)$, i.e.
$$F\left( \gamma\right) =\mathbf{R}\left( \gamma\right) \cdot\mathbf{R}\left( \gamma\right) ,\tag{6.12}$$
where we omitted the square root on the right side since, for a nonnegative quantity, minimizing it is equivalent to minimizing its square. Suppose that the minimum of $F\left( \gamma\right)$ occurs at $\gamma=\gamma_{0}$, so that
$$F^{\prime}\left( \gamma_{0}\right) =0.\tag{6.13}$$
By the dot product rule
$$\left( \mathbf{U}\cdot\mathbf{V}\right) ^{\prime}=\mathbf{U}^{\prime}\cdot\mathbf{V}+\mathbf{U}\cdot\mathbf{V}^{\prime},\tag{6.14}$$
we find that the derivative $F^{\prime}\left( \gamma\right)$ is given by
$$F^{\prime}\left( \gamma\right) =2\mathbf{R}\left( \gamma\right) \cdot\mathbf{R}^{\prime}\left( \gamma\right) .\tag{6.15}$$
Equating $F^{\prime}\left( \gamma\right)$ to zero, we conclude that the critical value $\gamma_{0}$ is characterized by the equation
$$\mathbf{R}\left( \gamma_{0}\right) \cdot\mathbf{R}^{\prime}\left( \gamma_{0}\right) =0.\tag{6.16}$$
In other words, $\mathbf{R}\left( \gamma_{0}\right)$, which corresponds to the segment $AB$, is orthogonal to the tangent $\mathbf{R}^{\prime}\left( \gamma_{0}\right)$, as we set out to prove.
The great advantage of this approach is, of course, its geometric insight. By considering vectors themselves rather than their components, we never let go of the geometric meaning and, as a result, the final identity yielded itself to an immediate geometric interpretation. On the other hand, the great shortcoming of this approach is that, while it perfectly characterizes the solution in geometric terms, it does not provide a means of finding it for a specific geometric configuration, i.e. finding the specific point $B$ on a specific curve $\Gamma$ that is closest to a specific point $A$.
Let us demonstrate the coordinate approach by attempting the same problem with the help of Cartesian coordinates. Suppose that the coordinates of the point $A$ are $\left( x_{1},y_{1}\right)$ and that the curve $\Gamma$ is given by the functions $x\left( \gamma\right)$ and $y\left( \gamma\right)$.
(6.17)
Then the objective function $F\left( \gamma\right)$ is given by the equation
$$F\left( \gamma\right) =\left( x\left( \gamma\right) -x_{1}\right) ^{2}+\left( y\left( \gamma\right) -y_{1}\right) ^{2}.\tag{6.18}$$
Its derivative $F^{\prime}\left( \gamma\right)$ is
$$F^{\prime}\left( \gamma\right) =2\left( x\left( \gamma\right) -x_{1}\right) x^{\prime}\left( \gamma\right) +2\left( y\left( \gamma\right) -y_{1}\right) y^{\prime}\left( \gamma\right) .\tag{6.19}$$
Equating $F^{\prime}\left( \gamma_{0}\right)$ to $0$, we obtain the desired algebraic equation for $\gamma_{0}$, i.e.
$$\left( x\left( \gamma_{0}\right) -x_{1}\right) x^{\prime}\left( \gamma_{0}\right) +\left( y\left( \gamma_{0}\right) -y_{1}\right) y^{\prime}\left( \gamma_{0}\right) =0.\tag{6.20}$$
The great advantage of this approach is, of course, the fact that, for a specific problem, it can identify the specific point of the curve $\Gamma$ that is closest to the point $A$. For example, if $\Gamma$ is the parabola given by the equations
\begin{align}
x\left( \gamma\right) & =\gamma\tag{6.21}\\
y\left( \gamma\right) & =\gamma^{2},\tag{6.22}
\end{align}
and the coordinates of $A$ are $\left( 3,1\right)$,
(6.23)
then the equation for $\gamma_{0}$ reads
$$\left( \gamma_{0}-3\right) \times1+\left( \gamma_{0}^{2}-1\right) \times\left( 2\gamma_{0}\right) =0\tag{6.24}$$
or
$$2\gamma_{0}^{3}-\gamma_{0}-3=0.\tag{6.25}$$
A numerical solution of this equation, $\gamma_{0}\approx1.289623901485060347262$, gives the precise location of the sought-after point $B$.
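The text does not say how the cubic was solved. As one possibility, the following minimal sketch applies Newton's method to $2\gamma_{0}^{3}-\gamma_{0}-3=0$; the choice of method, the starting guess of $1$, and the names `f`, `f_prime`, and `newton` are assumptions made for illustration.

```python
def f(g):
    """Left side of the cubic equation 2*g**3 - g - 3 = 0."""
    return 2.0 * g**3 - g - 3.0

def f_prime(g):
    """Derivative of f, needed for the Newton update."""
    return 6.0 * g**2 - 1.0

def newton(g, tol=1e-14, max_iter=50):
    """Newton's method: repeat g <- g - f(g)/f'(g) until the step is tiny."""
    for _ in range(max_iter):
        step = f(g) / f_prime(g)
        g -= step
        if abs(step) < tol:
            break
    return g

gamma0 = newton(1.0)           # starting guess chosen by inspection
B = (gamma0, gamma0**2)        # closest point on the parabola x = g, y = g**2
print(gamma0, B)
```

Since $f\left( \gamma_{0}\right)$ is exactly the dot product of the vector from $A$ to $B$ with the tangent $\left( 1,2\gamma_{0}\right)$, a vanishing residual also confirms the orthogonality established earlier.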
What, then, is the great disadvantage of this approach? It is this: neither the precise numerical answer for the specific problem, nor the more general equation
$$\left( x\left( \gamma_{0}\right) -x_{1}\right) x^{\prime}\left( \gamma_{0}\right) +\left( y\left( \gamma_{0}\right) -y_{1}\right) y^{\prime}\left( \gamma_{0}\right) =0 \tag{6.20}$$
yields the geometric insight that $AB$ must be orthogonal to the curve. While it is true that an experienced eye may spot the dot-product structure in the equation above, keep in mind that this is one of the simplest problems one may encounter. In a more complicated situation, the geometric interpretation is likely to be irrevocably lost with the introduction of coordinates. This phenomenon is exemplified by Euler's minimal surface equation
$$r^{\prime\prime}\left( z\right) r\left( z\right) -r^{\prime}\left( z\right) ^{2}-1=0 \tag{1.2}$$
briefly discussed in Chapter 1, which did not yield the geometric insight that a minimal surface is characterized by zero mean curvature.
The last two examples have demonstrated both the great utility and the great peril of coordinate systems. The beauty of Tensor Calculus is in its remarkable ability to combine the geometric and the coordinate approaches in a way that extracts the full benefits of both.
In all likelihood, you are already familiar with the most common special coordinate systems described below. Nevertheless, I hope that you do not skip this discussion since it describes coordinate systems differently from most textbooks. The common approach to introducing a special coordinate system is to relate it to an a priori Cartesian coordinate system. This approach is typified by the following figure from the Wikipedia article on spherical coordinates, where one notices the ever-present background Cartesian grid.
(6.26)
Subsequently, spherical coordinates $r,\theta,\varphi$ are related to Cartesian coordinates $x,y,z$ by the equations
\begin{align}
r & =\sqrt{x^{2}+y^{2}+z^{2}}\tag{6.27}\\
\theta & =\arctan\left( \sqrt{x^{2}+y^{2}},z\right)\tag{6.28}\\
\varphi & =\arctan\left( x,y\right)\tag{6.29}
\end{align}
as well as the (more elegant) inverse equations
\begin{align}
x & =r\sin\theta\cos\varphi\tag{6.30}\\
y & =r\sin\theta\sin\varphi\tag{6.31}\\
z & =r\cos\theta.\tag{6.32}
\end{align}
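For readers who wish to experiment with these conversions, here is a minimal sketch of both directions. It assumes that $\theta$ is the colatitude in $\left[ 0,\pi\right]$ and $\varphi$ the azimuth in $\left[ 0,2\pi\right)$. Note that Python's `math.atan2` takes its arguments in the order (numerator, denominator), which may differ from the argument order of the two-argument arctangent written above; the function names are illustrative.

```python
import math

def to_cartesian(r, theta, phi):
    """Spherical (r, colatitude theta, azimuth phi) to Cartesian (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def to_spherical(x, y, z):
    """Cartesian (x, y, z) to spherical (r, theta, phi)."""
    r = math.sqrt(x * x + y * y + z * z)
    # Colatitude: angle from the z-axis, so tan(theta) = sqrt(x^2 + y^2) / z.
    theta = math.atan2(math.sqrt(x * x + y * y), z)
    # Azimuth: angle of (x, y) in the plane, shifted from (-pi, pi] to [0, 2*pi).
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return (r, theta, phi)

r, theta, phi = to_spherical(1.0, 2.0, 2.0)
print(r, theta, phi)
print(to_cartesian(r, theta, phi))   # should reproduce (1, 2, 2) up to rounding
```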
This common approach violates the spirit of Tensor Calculus by arbitrarily singling out a single coordinate system -- in this case, the Cartesian coordinates $x,y,z$. From the point of view of the geometric space, this approach is not only aesthetically and philosophically objectionable but is, in fact, logically flawed since it does not describe how the coordinates $x,y,z$ were introduced in the first place. As a result, the construction is, at its very outset, detached from the very Euclidean space that it is meant to describe. For example, one is not able to answer the question: what is the distance between the points with Cartesian coordinates $\left( 0,0,0\right)$ and $\left( 1,2,2\right)$? If one answers $\sqrt{1^{2}+2^{2}+2^{2}}=3$, then it would seem that the presence of the coordinate system has imposed the concept of length upon the parent Euclidean space. This is contrary to our approach in which the relationship is logically reversed: the concept of length comes first as an inalienable characteristic of the Euclidean space. Thus, the better alternative, and one that is consistent with the spirit of Tensor Calculus, is to describe the coordinate system in absolute terms by referring to the inherent geometric characteristics of the Euclidean space. This will be our approach.
Let us start with Cartesian coordinates. Cartesian coordinates are, without a doubt, the most commonly used -- and misused -- coordinate systems. That said, they are indeed a natural choice in many situations and, in a number of ways, are the easiest coordinates to use. Our initial discussion will focus on the two-dimensional plane, as it is easier to visualize than the three-dimensional space, but is still sufficiently rich to illustrate all of the most important characteristics of Cartesian coordinates.
Cartesian coordinates are easiest to describe in terms of the coordinate basis $\mathbf{i},\mathbf{j}$. Choose an arbitrary origin $O$ and a pair of unit orthogonal vectors $\mathbf{i}$ and $\mathbf{j}$. To reiterate, in order for the coordinate system to qualify as Cartesian, the vectors $\mathbf{i}$ and $\mathbf{j}$ must be a) orthogonal and b) of unit length. If one of the conditions is violated, the resulting coordinates are no longer Cartesian, but merely affine.
(6.33)
Given the origin $O$ and the pair of unit orthogonal vectors $\mathbf{i}$ and $\mathbf{j}$, the Cartesian coordinates $x,y$ of a point $P$ are the components of the vector $\mathbf{U}$ from $O$ to $P$ with respect to $\mathbf{i}$ and $\mathbf{j}$, i.e.
$$\mathbf{U}=x~\mathbf{i}+y~\mathbf{j}.\tag{6.34}$$
The corresponding geometric construction is illustrated in the following figure.
(6.35)
The resulting coordinate lines corresponding to integer values of $x$ and $y$ form a regular square grid spaced by precisely one Euclidean unit.
(6.36)
Another common way of representing Cartesian coordinates is by drawing the coordinate axes. The $x$-axis is a straight line that passes through the origin $O$ in the direction of the basis vector $\mathbf{i}$. In other words, the $x$-axis is the coordinate line that corresponds to $y=0$. Similarly, the $y$-axis is a straight line that passes through $O$ in the direction of the basis vector $\mathbf{j}$, and is the coordinate line that corresponds to $x=0$.
(6.37)
This representation is attractive since it is less cluttered. For the rest of this Section, however, we will stick with the coordinate line representation for the sake of consistency with other special coordinate systems.
There are infinitely many Cartesian coordinate systems in the plane since we are free to choose any point for the origin $O$ and any orientation (in the sense of rotation) of the orthonormal basis vectors $\mathbf{i}$ and $\mathbf{j}$. The following figure illustrates a different Cartesian coordinate system that differs from the one above in both the location of the origin $O$ and the orientation of $\mathbf{i}$ and $\mathbf{j}$.
(6.38)
Finally, we also have the choice of orientation (in the sense of Section 3.1) of the basis $\mathbf{i},\mathbf{j}$. If the vectors $\mathbf{i}$ and $\mathbf{j}$ form a positively oriented set, then the coordinate system is said to be positively oriented or right-handed. Otherwise, it is negatively oriented or left-handed. The following figure shows a left-handed Cartesian coordinate system.
(6.39)
As we have already mentioned, the requirement that $\mathbf{i}$ and $\mathbf{j}$ are unit vectors is essential to the definition of Cartesian coordinates. Even if $\mathbf{i}$ and $\mathbf{j}$ are orthogonal and have equal but non-unit lengths, the resulting system can no longer be considered Cartesian. For example, the coordinate system illustrated in the following figure (where the reference segment on the bottom right has unit length) is not Cartesian, even though its coordinate lines form a regular square grid.
(6.40)
Note that without the reference segment, there would have been no way of determining whether the system is Cartesian.
In three dimensions, a Cartesian coordinate system is constructed by selecting an arbitrary origin $O$ and a set of three pairwise-orthogonal unit vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$.
(6.41)
Echoing the two-dimensional case, the Cartesian coordinates $x,y,z$ of a point $P$ are the components of the vector $\mathbf{U}$ from $O$ to $P$ with respect to $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, i.e.
$$\mathbf{U}=x~\mathbf{i}+y~\mathbf{j}+z~\mathbf{k}.\tag{6.42}$$
The resulting coordinate lines corresponding to integer values form a regular cubic grid spaced by precisely one Euclidean unit.
(6.43)
The coordinate system is right-handed or positively oriented if the set $\mathbf{i},\mathbf{j},\mathbf{k}$ is positively oriented. Otherwise, it is left-handed or negatively oriented.
Affine or rectilinear coordinates are a generalization of Cartesian coordinates without the constraints of orthogonality and unit length. Affine coordinates are constructed in the exact same way as Cartesian coordinates from an arbitrary linearly independent set of vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$.
(6.44)
Once again, the affine coordinates $x,y,z$ of a point $P$ are the components of the vector $\mathbf{U}$ from $O$ to $P$ with respect to the vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, i.e.
$$\mathbf{U}=x~\mathbf{i}+y~\mathbf{j}+z~\mathbf{k}.\tag{6.45}$$
The resulting coordinate lines corresponding to integer values form a skewed regular parallelepiped grid, as illustrated in the following figure.
(6.46)
The term rectilinear refers to the straightness of the coordinate lines. Non-affine coordinate systems are known as curvilinear.
The concept of orientation applies to affine coordinates just as well as Cartesian. An affine coordinate system is said to be positively oriented or right-handed if the set of vectors $\mathbf{i},\mathbf{j},\mathbf{k}$ is positively oriented. Otherwise, it is negatively oriented or left-handed.
Any two affine coordinate systems are related by a combination of a linear transformation and a shift. Suppose that $x,y,z$ and $x^{\prime},y^{\prime},z^{\prime}$ are two sets of affine coordinates corresponding to the respective origins at $O$ and $O^{\prime}$ and the coordinate bases $\mathbf{i},\mathbf{j},\mathbf{k}$ and $\mathbf{i}^{\prime},\mathbf{j}^{\prime},\mathbf{k}^{\prime}$. Then the coordinates $x,y,z$ and $x^{\prime},y^{\prime},z^{\prime}$ are related by
$$\left[ \begin{array}{c} x^{\prime}\\ y^{\prime}\\ z^{\prime} \end{array} \right] =\left[ \begin{array}{ccc} A_{11} & A_{12} & A_{13}\\ A_{21} & A_{22} & A_{23}\\ A_{31} & A_{32} & A_{33} \end{array} \right] \left[ \begin{array}{c} x\\ y\\ z \end{array} \right] +\left[ \begin{array}{c} x_{0}^{\prime}\\ y_{0}^{\prime}\\ z_{0}^{\prime} \end{array} \right] ,\tag{6.47}$$
where $x_{0}^{\prime},y_{0}^{\prime},z_{0}^{\prime}$ are the coordinates of $O$ in the primed coordinate system and (the transpose of) the matrix relates the unprimed and primed coordinate bases according to the formal identity
$$\left[ \begin{array}{c} \mathbf{i}\\ \mathbf{j}\\ \mathbf{k} \end{array} \right] =\left[ \begin{array}{ccc} A_{11} & A_{12} & A_{13}\\ A_{21} & A_{22} & A_{23}\\ A_{31} & A_{32} & A_{33} \end{array} \right] ^{T}\left[ \begin{array}{c} \mathbf{i}^{\prime}\\ \mathbf{j}^{\prime}\\ \mathbf{k}^{\prime} \end{array} \right] .\tag{6.48}$$
We can eschew the unwelcome transpose by organizing the elements of the coordinate bases in rows instead of columns, i.e.
$$\left[ \begin{array}{ccc} \mathbf{i} & \mathbf{j} & \mathbf{k} \end{array} \right] =\left[ \begin{array}{ccc} \mathbf{i}^{\prime} & \mathbf{j}^{\prime} & \mathbf{k}^{\prime} \end{array} \right] \left[ \begin{array}{ccc} A_{11} & A_{12} & A_{13}\\ A_{21} & A_{22} & A_{23}\\ A_{31} & A_{32} & A_{33} \end{array} \right] .\tag{6.49}$$
The proof of this property of affine coordinates is left as an exercise.
Interestingly, the matrix $A$ participates in the translation from unprimed to primed coordinates and -- note the reverse direction -- from primed to unprimed coordinate bases. Thus, coordinates themselves and their associated bases transform in fundamentally opposite ways. This simple observation, it turns out, will prove to be the cornerstone of the tensor framework.
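The opposite roles of the matrix $A$ can be checked on a small numerical example. The sketch below works in the plane rather than in space to stay short. It stores vectors as pairs of numbers in an auxiliary frame that serves only as numerical bookkeeping, and all specific values (the primed basis, the origin $O^{\prime}$, the matrix, and the shift) are arbitrary assumptions chosen for illustration.

```python
def combo(a, u, b, v):
    """Linear combination a*u + b*v of two plane vectors."""
    return (a * u[0] + b * v[0], a * u[1] + b * v[1])

def add(u, v):
    """Sum of two plane vectors."""
    return (u[0] + v[0], u[1] + v[1])

# Primed basis i', j' and origin O' (arbitrary, non-orthogonal on purpose).
i_p, j_p = (2.0, 0.0), (1.0, 1.0)
O_p = (0.5, -1.0)

A = ((1.0, 2.0), (3.0, -1.0))   # any invertible matrix
shift = (4.0, 5.0)              # coordinates of O in the primed system

# Bases: [i j] = [i' j'] A, so column k of A holds the primed components
# of the k-th unprimed basis vector.
i_u = combo(A[0][0], i_p, A[1][0], j_p)
j_u = combo(A[0][1], i_p, A[1][1], j_p)
O_u = add(O_p, combo(shift[0], i_p, shift[1], j_p))

def primed_coords(x, y):
    """Coordinates: (x', y') = A (x, y) + shift."""
    return (A[0][0] * x + A[0][1] * y + shift[0],
            A[1][0] * x + A[1][1] * y + shift[1])

x, y = 0.75, -2.0
xp, yp = primed_coords(x, y)
P_unprimed = add(O_u, combo(x, i_u, y, j_u))    # O + x i + y j
P_primed = add(O_p, combo(xp, i_p, yp, j_p))    # O' + x' i' + y' j'
print(P_unprimed, P_primed)                     # the same geometric point
```

The same matrix carries unprimed coordinates to primed coordinates and the primed basis to the unprimed basis, which is why both constructions land on the same point.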
Polar coordinates $r,\theta$ are well suited for a wide range of geometries in the plane, especially those that are naturally described in terms of the distance to a reference point, such as the star-shaped region in the figure below. A star-shaped region is one for which there exists a fixed point from which all points on the boundary are in a direct line of sight. As a result, each direction corresponds to a unique distance from the fixed point to the boundary, and such shapes can be captured in polar coordinates by a single function.
(6.50)
The construction of a polar coordinate system is illustrated in the figure below. Designate an arbitrary point $O$ as the pole or the origin, and select an arbitrary ray $l$, known as the polar axis, emanating from $O$. The polar coordinates of a point $P$ are the numbers $r$ and $\theta$, where $r$ is the Euclidean distance from $P$ to the pole $O$ and $\theta$ is the signed angle, measured in radians, between the segment $OP$ and the polar axis $l$ in the counterclockwise direction.
(6.51)
In order to uniquely determine the numerical value of the angle $\theta$, it must be constrained to a half-open interval of length $2\pi$, such as $\left[ 0,2\pi\right)$ or $\left( -\pi,\pi\right]$. Choosing $\left[ 0,2\pi\right)$, for example, results in the coordinate lines illustrated in the following figure.
(6.52)
This figure could be made to appear even more regular by choosing radial coordinate lines corresponding to multiples of, say, $\pi/4$ instead of integer values.
(6.53)
In some applications, such as analysis of curves, it is often more convenient not to restrict the range of $\theta$ and to allow it to be any real number. For example, the following figure shows the curve corresponding to the equations
\begin{align}
r\left( \gamma\right) & =10+\gamma\tag{6.54}\\
\theta\left( \gamma\right) & =\gamma\tag{6.55}
\end{align}
for the parameter $\gamma$ -- and therefore $\theta$ -- ranging from $-2\pi$ to $2\pi$.
(6.56)
Consider the point $P$ on the curve in the figure above. Had we not already known the equation of the curve, we might think that the $\theta$-coordinate of $P$ is $\pi/2$. However, $P$ corresponds to $\gamma=-3\pi/2$ and, therefore, to $\theta=-3\pi/2$. Thus, the choice to allow $\theta$ to take on arbitrary values results in a great deal of convenience at the cost of uniqueness.
Furthermore, we can also allow the variable $r$ to take on negative values. By convention, the point $P$ with coordinates $\left( r,\theta\right)$, where $r$ is negative, is found at the point with proper polar coordinates $\left( -r,\theta+\pi\right)$. In other words, for negative $r$, we find $P$ by moving in the "negative" direction along the ray corresponding to the angle $\theta$. A curve given by the equations
\begin{align}
r\left( \gamma\right) & =3\sin2\gamma\tag{6.57}\\
\theta\left( \gamma\right) & =\sin\gamma,\tag{6.58}
\end{align}
where $0\leq\gamma\leq2\pi$ and therefore $r$ assumes negative values, is shown in the following figure.
(6.59)
Note that the variable $r$ changes sign at multiples of $\pi/2$. For a continuous curve, this change of sign in $r$ can occur only when the curve passes through the origin.
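The two conventions, unrestricted $\theta$ and negative $r$, can be reduced to proper polar coordinates mechanically. The following minimal sketch assumes the range $\left[ 0,2\pi\right)$ for the proper angle; the function names `proper_polar` and `curve` are illustrative.

```python
import math

def proper_polar(r, theta):
    """Map (r, theta) with arbitrary real values to r >= 0, theta in [0, 2*pi)."""
    if r < 0:
        # Negative r: move to the opposite ray, i.e. (r, theta) -> (-r, theta + pi).
        r, theta = -r, theta + math.pi
    return (r, theta % (2.0 * math.pi))

def curve(gamma):
    """The curve r = 3 sin(2*gamma), theta = sin(gamma) from the text."""
    return (3.0 * math.sin(2.0 * gamma), math.sin(gamma))

# A point with theta = -3*pi/2 has the proper angle pi/2.
print(proper_polar(1.0, -1.5 * math.pi))

# At gamma = 3*pi/4 the raw value of r is negative; the proper radius is positive.
raw = curve(0.75 * math.pi)
print(raw, proper_polar(*raw))
```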
Cylindrical coordinates extend polar coordinates to three dimensions. In order to construct a cylindrical coordinate system, first, select a plane known as the coordinate plane. The coordinate plane divides the space into two half-spaces. Arbitrarily select one of the half-spaces as positive and the other as negative. Construct a polar coordinate system within the coordinate plane by selecting an arbitrary pole $O$ and an arbitrary ray $l$. The polar angle $\theta$ increases in the direction that appears counterclockwise from the positive half-space. Then, to each point $P$ in the space, assign the coordinates $r$, $\theta$, and $z$, where $r$ and $\theta$ are the polar coordinates of the orthogonal projection of $OP$ onto the coordinate plane and $z$ is the signed Euclidean distance between $P$ and the coordinate plane, i.e. $z$ is positive if $P$ is found in the positive half-space and negative otherwise.
(6.60)
The cylindrical or longitudinal axis is the straight line orthogonal to the coordinate plane that passes through the origin $O$. It consists of the points for which $r=0$. The distance between a point $P$ and the cylindrical axis equals $r$. The term cylindrical comes from the fact that points characterized by constant $r$ form a cylinder. The other two families of coordinate surfaces are planes.
A selection of coordinate lines for cylindrical coordinates is shown in the following figure.
(6.61)
A selection of coordinate surfaces is shown in the following figure.
(6.62)
Spherical coordinates, denoted by the letters $r$, $\theta$, and $\varphi$, are perfectly intuitive. Using a planetary analogy, the angles $\theta$ and $\varphi$ correspond to colatitude and longitude on the surface of the Earth, while $r$ corresponds to the Euclidean distance to the center of the Earth. To construct a spherical coordinate system, start by selecting an arbitrary origin $O$. The coordinate $r$ of the point $P$ is the distance between $O$ and $P$. Next, select an arbitrary coordinate plane that passes through $O$. We will refer to the straight line orthogonal to the coordinate plane that passes through the origin $O$ as the spherical axis. The angle $\theta$, known as the colatitude, varies from $0$ to $\pi$ and gives the angle between the segment $OP$ and the spherical axis. The remaining coordinate is the azimuth $\varphi$ which varies from $0$ to $2\pi$. It corresponds to the angle between the projection of $OP$ onto the coordinate plane and a fixed, arbitrarily chosen polar axis $l$ that passes through the origin $O$ in the coordinate plane.
(6.63)
The points corresponding to a given value of $r$ form a coordinate sphere. If $\varphi$ is fixed in addition to $r$, the result is a "meridian" on the corresponding coordinate sphere. If $\theta$ is fixed in addition to $r$, the result is a "parallel". Neither angle is defined at the origin $O$. The azimuth $\varphi$ is undefined along the entire spherical axis. The following figure shows one coordinate surface for each variable.
(6.64)
This completes our descriptions of the special coordinate systems that will be featured throughout our narrative.
In Chapter 4, we gave a geometric definition for the gradient $\mathbf{\nabla}U$ of a scalar field $U$, as well as an alternative definition found in most textbooks, where the gradient is defined as the collection of the partial derivatives
$$\left( \frac{\partial U}{\partial x},\frac{\partial U}{\partial y},\frac{\partial U}{\partial z}\right)\tag{6.65}$$
with respect to Cartesian coordinates. Having now formally introduced the concept of the coordinate basis $\mathbf{i},\mathbf{j},\mathbf{k}$ for Cartesian coordinates, we may conjecture that the connection between the two definitions is captured by the equation
$$\mathbf{\nabla}U=\frac{\partial U}{\partial x}\mathbf{i}+\frac{\partial U}{\partial y}\mathbf{j}+\frac{\partial U}{\partial z}\mathbf{k}\tag{6.66}$$
that interprets the partial derivatives as the components of $\mathbf{\nabla}U$ with respect to the coordinate basis $\mathbf{i},\mathbf{j},\mathbf{k}$. This identity is indeed correct in Cartesian coordinates and it is left as an exercise to prove it. However, we must wonder if this relationship continues to be valid in affine coordinates. Furthermore, we are interested in deriving the general analytical expression for the gradient that works in all coordinate systems. We will save the second task for later, but will now show why the above equation does not hold in non-Cartesian affine coordinates.
In the plane, consider two orthogonal affine coordinate systems. Let the first coordinate system be Cartesian coordinates $x,y$ corresponding to the coordinate basis $\mathbf{i},\mathbf{j}$. For the other coordinate system $x^{\prime},y^{\prime}$, choose the affine coordinates with the coordinate basis $\mathbf{i}^{\prime},\mathbf{j}^{\prime}$ obtained from $\mathbf{i},\mathbf{j}$ by a twofold stretch, i.e.
$$\begin{aligned}\mathbf{i}^{\prime} & =2\mathbf{i}\qquad\left(6.67\right)\\\mathbf{j}^{\prime} & =2\mathbf{j}.\qquad\left(6.68\right)\end{aligned}$$
In other words, in the new, "primed" coordinate system, integer coordinate lines are two Euclidean units apart. In particular, this means that the primed coordinates $x^{\prime},y^{\prime}$ are given in terms of the unprimed coordinates $x,y$ by the equations
$$\begin{aligned}x^{\prime} & =\frac{1}{2}x\qquad\left(6.69\right)\\y^{\prime} & =\frac{1}{2}y.\qquad\left(6.70\right)\end{aligned}$$
Notice that, once again, we are observing coordinates and the associated coordinate bases transforming by opposite rules.
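One way to see where equations (6.69) and (6.70) come from, assuming the two systems share the origin $O$ (as those equations imply), is to expand the position vector $\mathbf{R}$ of an arbitrary point with respect to both bases:

$$\mathbf{R}=x\,\mathbf{i}+y\,\mathbf{j}=x^{\prime}\,\mathbf{i}^{\prime}+y^{\prime}\,\mathbf{j}^{\prime}=2x^{\prime}\,\mathbf{i}+2y^{\prime}\,\mathbf{j}.$$

Matching the components with respect to $\mathbf{i},\mathbf{j}$ gives $x=2x^{\prime}$ and $y=2y^{\prime}$, so doubling the basis vectors halves the coordinates.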
The two coordinate systems are illustrated side by side in the following figure. The two plots represent the same scalar field $U$ which is, of course, independent of the coordinates.
[Figure (6.71): the same scalar field plotted in the unprimed and the primed coordinate systems]
Let $U\left( x,y\right)$ denote $U$ as a function of $x$ and $y$, and let $U\left( x^{\prime},y^{\prime}\right)$ denote $U$ as a function of $x^{\prime}$ and $y^{\prime}$. Importantly, the functions $U\left( x,y\right)$ and $U\left( x^{\prime},y^{\prime}\right)$ are different functions. For example, if
$$U\left( x,y\right) =xy,\tag{6.72}$$
then
$$U\left( x^{\prime},y^{\prime}\right) =4x^{\prime}y^{\prime}.\tag{6.73}$$
Even though the three objects (the scalar field $U$ along with the functions $U\left( x,y\right)$ and $U\left( x^{\prime},y^{\prime}\right)$) are different, it does make sense to denote them by the same letter $U$ due to their close relationship.
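The distinction between the field and its two coordinate functions can be made concrete with a small numerical sketch. The function names below (`U_unprimed`, `U_primed`, `to_unprimed`) are hypothetical and chosen only for illustration; the field is the example from (6.72), and the coordinate change is the one in (6.69) and (6.70).

```python
# Illustrative sketch: one scalar field, two different coordinate functions.
# All names here are hypothetical and not part of the text.

def U_unprimed(x, y):
    """U as a function of the Cartesian coordinates x, y, as in (6.72)."""
    return x * y

def to_unprimed(xp, yp):
    """Invert (6.69)-(6.70): x = 2x', y = 2y'."""
    return 2 * xp, 2 * yp

def U_primed(xp, yp):
    """U as a function of the primed coordinates: same field, different function."""
    x, y = to_unprimed(xp, yp)
    return U_unprimed(x, y)

# Compare with the closed form 4 x' y' from (6.73) at a few sample points.
samples = [(0.5, 1.5), (-2.0, 3.0), (1.25, -0.75)]
max_error = max(abs(U_primed(xp, yp) - 4 * xp * yp) for xp, yp in samples)
print(max_error)
```

Both functions return the same value at the same physical point, yet feeding the same pair of numbers into `U_unprimed` and `U_primed` generally gives different results, which is the sense in which they are different functions.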
We are now in a position to compare the values of the expressions
$$\frac{\partial U}{\partial x}\mathbf{i}+\frac{\partial U}{\partial y}\mathbf{j}\quad\text{and}\quad\frac{\partial U}{\partial x^{\prime}}\mathbf{i}^{\prime}+\frac{\partial U}{\partial y^{\prime}}\mathbf{j}^{\prime}.\tag{6.74}$$
Recall that the coordinate basis vectors are related by the equations
$$\begin{aligned}\mathbf{i}^{\prime} & =2\mathbf{i}\qquad\left(6.67\right)\\\mathbf{j}^{\prime} & =2\mathbf{j}.\qquad\left(6.68\right)\end{aligned}$$
In other words, in the change from the unprimed to the primed coordinates, the coordinate basis vectors double. Thus, the only remaining question is whether the partial derivatives correspondingly double or halve. We will now show that they, too, double and thus the combined expression quadruples.
Let us compare the partial derivatives $\partial U\left( x,y\right) /\partial x$ and $\partial U\left( x^{\prime},y^{\prime}\right) /\partial x^{\prime}$ at a single point $P$ with unprimed coordinates $\left( x,y\right)$ and primed coordinates $\left( x^{\prime},y^{\prime}\right)$. In each coordinate system, increase the first coordinate by $h$, i.e. consider the point $P_{h}$ with unprimed coordinates $\left( x+h,y\right)$ along with the point $P_{h}^{\prime}$ with primed coordinates $\left( x^{\prime}+h,y^{\prime}\right)$. These distinct points are illustrated in the figure above. Observe that the point $P_{h}^{\prime}$ is twice as far from $P$ as the point $P_{h}$. Thus, for small $h$, the ratio
$$\frac{U\left( x^{\prime}+h,y^{\prime}\right) -U\left( x^{\prime},y^{\prime}\right) }{h}\tag{6.75}$$
is roughly twice as great as
$$\frac{U\left( x+h,y\right) -U\left( x,y\right) }{h}.\tag{6.76}$$
Therefore, in the limit as $h$ approaches $0$, we have
$$\frac{\partial U\left( x^{\prime},y^{\prime}\right) }{\partial x^{\prime}}=2\frac{\partial U\left( x,y\right) }{\partial x}.\tag{6.77}$$
Similarly, for the partial derivatives with respect to the second coordinate, we find
$$\frac{\partial U\left( x^{\prime},y^{\prime}\right) }{\partial y^{\prime}}=2\frac{\partial U\left( x,y\right) }{\partial y}.\tag{6.78}$$
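The same factor of two can be cross-checked with the chain rule, assuming that $U$ is differentiable and using $x=2x^{\prime}$, $y=2y^{\prime}$, which follow from (6.69) and (6.70):

$$\frac{\partial U\left( x^{\prime},y^{\prime}\right) }{\partial x^{\prime}}=\frac{\partial U\left( x,y\right) }{\partial x}\frac{\partial x}{\partial x^{\prime}}+\frac{\partial U\left( x,y\right) }{\partial y}\frac{\partial y}{\partial x^{\prime}}=2\frac{\partial U\left( x,y\right) }{\partial x}+0.$$

For the example field (6.72), this reads $\partial\left( 4x^{\prime}y^{\prime}\right) /\partial x^{\prime}=4y^{\prime}=2y=2\,\partial\left( xy\right) /\partial x$, in agreement with (6.77).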
In summary, when we change from the Cartesian coordinates $x,y$ to the affine coordinates $x^{\prime},y^{\prime}$, both the coordinate basis vectors and the partial derivatives double. Consequently, the value of the proposed coordinate expression for the gradient quadruples, i.e.
$$\frac{\partial U}{\partial x^{\prime}}\mathbf{i}^{\prime}+\frac{\partial U}{\partial y^{\prime}}\mathbf{j}^{\prime}=4\left( \frac{\partial U}{\partial x}\mathbf{i}+\frac{\partial U}{\partial y}\mathbf{j}\right).\tag{6.79}$$
Thus, we have reached the important conclusion that the value of the expression
$$\frac{\partial U\left( x,y\right) }{\partial x}\mathbf{i}+\frac{\partial U\left( x,y\right) }{\partial y}\mathbf{j}\tag{6.80}$$
depends on the particular choice of the coordinate system. In particular, it cannot be the coordinate-space representation of the gradient $\mathbf{\nabla}U$.
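The quadrupling can also be observed numerically. The following is a minimal sketch, not a general-purpose tool: it assumes the example field $U=xy$ from (6.72), approximates partial derivatives by central differences, and represents every vector by its components with respect to $\mathbf{i},\mathbf{j}$. The helper names (`partial`, `candidate`) and the sample point are hypothetical.

```python
# Illustrative check that the candidate expression (6.80) depends on the
# coordinate system. All names and the sample point are assumptions.

def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f at point."""
    lo, hi = list(point), list(point)
    lo[index] -= h
    hi[index] += h
    return (f(*hi) - f(*lo)) / (2 * h)

def U_unprimed(x, y):
    """Example field (6.72) as a function of Cartesian coordinates."""
    return x * y

def U_primed(xp, yp):
    """The same field as a function of the primed coordinates (x = 2x', y = 2y')."""
    return U_unprimed(2 * xp, 2 * yp)

# Basis vectors expressed by their components with respect to i, j.
i, j = (1.0, 0.0), (0.0, 1.0)
i_primed, j_primed = (2.0, 0.0), (0.0, 2.0)   # equations (6.67)-(6.68)

def candidate(f, coords, e1, e2):
    """Evaluate (df/dc1) e1 + (df/dc2) e2, the expression being tested."""
    d1 = partial(f, coords, 0)
    d2 = partial(f, coords, 1)
    return (d1 * e1[0] + d2 * e2[0], d1 * e1[1] + d2 * e2[1])

# One physical point P, described in both coordinate systems.
P = (3.0, 5.0)
P_primed = (1.5, 2.5)

v = candidate(U_unprimed, P, i, j)
v_primed = candidate(U_primed, P_primed, i_primed, j_primed)
ratios = (v_primed[0] / v[0], v_primed[1] / v[1])
print(v, v_primed, ratios)
```

Up to rounding, both component ratios come out equal to 4, matching (6.79): the two evaluations describe the same point and the same field, yet the candidate expression produces different vectors.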
As we mentioned above, this observation makes it clear that a more effective analytical framework is needed for constructing coordinate-dependent expressions that produce the same value in all coordinate systems. The task of building such a framework will be accomplished in the next few chapters. In particular, the correct coordinate-space expression for the gradient will be given in Chapter 10.
Exercise 6.1. Calculate $\mathbf{R}^{\prime}\left( \gamma\right)$ for an ellipse with semiaxes $a$ and $b$ that corresponds to
$$\mathbf{R}\left( \gamma\right) =a\cos\gamma\ \mathbf{i}+b\sin\gamma\ \mathbf{j}.\tag{6.81}$$
What is the length of $\mathbf{R}^{\prime}\left( \gamma\right)$ as a function of $\gamma$?
Exercise 6.2. Consider a particle moving uniformly around a circle of radius $r$ making a complete revolution in time $T$. Show that its acceleration points towards the center of the circle and has the magnitude $r\omega^{2}$, where $\omega=2\pi/T$.
Exercise 6.3. Describe the six degrees of freedom in choosing a Cartesian coordinate system in the three-dimensional space.
Exercise 6.4. Show that the vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ of a right-handed Cartesian basis satisfy the equations
$$\begin{aligned}\mathbf{i}\times\mathbf{j} & =\mathbf{k}\qquad\left(6.82\right)\\\mathbf{j}\times\mathbf{k} & =\mathbf{i}\qquad\left(6.83\right)\\\mathbf{k}\times\mathbf{i} & =\mathbf{j}.\qquad\left(6.84\right)\end{aligned}$$
Exercise 6.5. Given two affine coordinate systems $x,y,z$ and $x^{\prime},y^{\prime},z^{\prime}$ with the respective origins at $O$ and $O^{\prime}$ and the coordinate bases $\mathbf{i},\mathbf{j},\mathbf{k}$ and $\mathbf{i}^{\prime},\mathbf{j}^{\prime},\mathbf{k}^{\prime}$ related by
$$\left[ \begin{array}{ccc} \mathbf{i} & \mathbf{j} & \mathbf{k} \end{array} \right] =\left[ \begin{array}{ccc} \mathbf{i}^{\prime} & \mathbf{j}^{\prime} & \mathbf{k}^{\prime} \end{array} \right] \left[ \begin{array}{ccc} A_{11} & A_{12} & A_{13}\\ A_{21} & A_{22} & A_{23}\\ A_{31} & A_{32} & A_{33} \end{array} \right] ,\tag{6.49}$$
show that $x,y,z$ and $x^{\prime},y^{\prime},z^{\prime}$ are related by a combination of a linear transformation and a shift, i.e.
$$\left[ \begin{array}{c} x^{\prime}\\ y^{\prime}\\ z^{\prime} \end{array} \right] =\left[ \begin{array}{ccc} A_{11} & A_{12} & A_{13}\\ A_{21} & A_{22} & A_{23}\\ A_{31} & A_{32} & A_{33} \end{array} \right] \left[ \begin{array}{c} x\\ y\\ z \end{array} \right] +\left[ \begin{array}{c} x_{0}^{\prime}\\ y_{0}^{\prime}\\ z_{0}^{\prime} \end{array} \right] ,\tag{6.47}$$
where $x_{0}^{\prime},y_{0}^{\prime},z_{0}^{\prime}$ are the primed coordinates of $O$.
Exercise 6.6. For a scalar field $U$, show that the expression
$$\frac{\partial U\left( x,y,z\right) }{\partial x}~\mathbf{i}+\frac{\partial U\left( x,y,z\right) }{\partial y}~\mathbf{j}+\frac{\partial U\left( x,y,z\right) }{\partial z}~\mathbf{k}\tag{6.85}$$
yields the same vector in all Cartesian coordinates.
Exercise 6.7. Furthermore, demonstrate that the expression
$$\frac{\partial U\left( x,y,z\right) }{\partial x}~\mathbf{i}+\frac{\partial U\left( x,y,z\right) }{\partial y}~\mathbf{j}+\frac{\partial U\left( x,y,z\right) }{\partial z}~\mathbf{k}\tag{6.86}$$
corresponds to the gradient $\nabla U$ of the scalar field $U$ as defined in Chapter 4.
Exercise 6.8. Show that the expression
$$\frac{1}{\mathbf{i}\cdot\mathbf{i}}\frac{\partial U\left( x,y,z\right) }{\partial x}~\mathbf{i}+\frac{1}{\mathbf{j}\cdot\mathbf{j}}\frac{\partial U\left( x,y,z\right) }{\partial y}~\mathbf{j}+\frac{1}{\mathbf{k}\cdot\mathbf{k}}\frac{\partial U\left( x,y,z\right) }{\partial z}~\mathbf{k}\tag{6.87}$$
yields the same vector in all orthogonal affine coordinates, i.e. affine coordinates characterized by an orthogonal coordinate basis $\mathbf{i},\mathbf{j},\mathbf{k}$.