4. Differentiation of Vectors

Having laid down the algebraic foundations of geometric vectors, we now turn our attention to differentiation. When most people think of differentiation of vectors, they imagine a component-by-component operation, such as

\mathbf{U}^{\prime}\left( t\right) =\left( U_{1}^{\prime}\left( t\right) ,U_{2}^{\prime}\left( t\right) ,U_{3}^{\prime}\left( t\right) \right)

. This way of thinking is an example of the presumptive association between geometric vectors and their components. It is imperative to break this association and to commit to vectors as geometric objects. As we are about to demonstrate, geometric vectors in a Euclidean space have just enough analytical structure to be meaningfully differentiated. Differentiation of geometric vectors will prove to be a crucial element in our approach to Tensor Calculus.

4.1 The mechanics of differentiation

Recall the standard definition of the derivative

U^{\prime}\left( x\right)

of an ordinary function

U\left( x\right)

, i.e.

U^{\prime}\left( x\right) =\lim_{h\rightarrow0}\frac{U\left( x+h\right) -U\left( x\right) }{h}\tag{4.1}

What is required of

U\left( x\right)

in order for this definition to apply? Clearly, the quantity represented by

U\left( x\right)

must be subject to three basic operations: addition, multiplication by numbers, and evaluation of limits. Addition (which implies subtraction) is required in order to evaluate the difference

U\left( x+h\right) -U\left( x\right)

. Multiplication by numbers (which implies division) is required in order to divide the difference

U\left( x+h\right) -U\left( x\right)

by

h

. Finally, the need for limits is self-evident.

We have already established that geometric vectors can be added and multiplied by numbers. Thus, we only need to confirm that geometric vectors are subject to evaluation of limits. What makes limits of geometric vectors possible is the availability of Euclidean length since it enables us to state how close two vectors are by measuring the distance between them. The distance between two vectors

\mathbf{U}

and

\mathbf{V}

, denoted by

\operatorname{d}\left( \mathbf{U,V}\right)

, is defined as the length of their difference, i.e.

\operatorname{d}\left( \mathbf{U,V}\right) =\operatorname{len}\left( \mathbf{U}-\mathbf{V}\right) .\tag{4.2}

With the distance function in hand, we are able to carry the classical definition of a limit over to geometric vectors.

In formal terms, a vector

\mathbf{U}

is the limit of a sequence

\mathbf{U}_{n}

if the distance between

\mathbf{U}_{n}

and

\mathbf{U}

approaches

0

as

n

approaches infinity, i.e.

\mathbf{U}=\lim_{n\rightarrow\infty}\mathbf{U}_{n}\text{\ \ \ \ if\ \ \ \ } \lim_{n\rightarrow\infty}\operatorname{d}\left( \mathbf{U}_{n},\mathbf{U} \right) =0.\tag{4.3}

The following figure illustrates a sequence of vectors

\mathbf{U}_{n}

that approach the "vertical" vector

\mathbf{U}

(4.4)

The same approach can be used to define the limit of a vector-valued function

\mathbf{U}\left( \gamma\right)

. Specifically, the vector

\mathbf{U}

is the limit of

\mathbf{U}\left( \gamma\right)

at

\gamma=\gamma_{0}

if the distance between

\mathbf{U}\left( \gamma\right)

and

\mathbf{U}

approaches

0

as

\gamma

approaches

\gamma_{0}

, i.e.

\mathbf{U}=\lim_{\gamma\rightarrow\gamma_{0}}\mathbf{U}\left( \gamma\right) \text{\ \ \ \ if\ \ \ \ }\lim_{\gamma\rightarrow\gamma_{0}}\operatorname{d} \left( \mathbf{U}\left( \gamma\right) ,\mathbf{U}\right) =0.\tag{4.5}

Thus, the concept of a limit for vector-valued sequences and functions is established.

Having established the concept of a limit, we can mimic the classical definition

U^{\prime}\left( x\right) =\lim_{h\rightarrow0}\frac{U\left( x+h\right) -U\left( x\right) }{h} \tag{4.1}

of the derivative to give a formal definition of the derivative

\mathbf{U} ^{\prime}\left( \gamma\right)

of a vector-valued function

\mathbf{U} \left( \gamma\right)

, i.e.

\mathbf{U}^{\prime}\left( \gamma\right) =\lim_{h\rightarrow0}\frac {\mathbf{U}\left( \gamma+h\right) -\mathbf{U}\left( \gamma\right) }{h}.\tag{4.6}

We reiterate that all aspects of the procedure encoded in this definition can, at least conceptually, be carried out by pure geometric means without the use of a coordinate system.

The derivative of a vector function corresponds to our intuition for rate of change, except now the rate of change is a vector quantity. Nevertheless, many of the ideas from ordinary Calculus carry over to vector-valued functions. For example, the derivative can be used for linear approximations. Namely, for a small

h

, the derivative

\mathbf{U}^{\prime }\left( \gamma\right)

can be approximated by the quotient

\frac{\mathbf{U}\left( \gamma+h\right) -\mathbf{U}\left( \gamma\right) }{h}\tag{4.7}

in the sense that the distance between

\mathbf{U}^{\prime}\left( \gamma\right)

and the above quotient is small. Therefore,

\mathbf{U} \left( \gamma+h\right)

can be approximated by the familiar formula

\mathbf{U}\left( \gamma+h\right) \approx\mathbf{U}\left( \gamma\right) +h\mathbf{U}^{\prime}\left( \gamma\right) ,\tag{4.8}

also in the sense that the distance between

\mathbf{U}\left( \gamma+h\right)

and

\mathbf{U}\left( \gamma\right) +h\mathbf{U}^{\prime }\left( \gamma\right)

is small.
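The shrinking distance between the quotient and the derivative can be observed numerically. The following is a minimal sketch, not part of the geometric argument: it stores vectors by their components purely as a computational device, and the helix `U`, its closed-form derivative `U_prime`, and the helper names are hypothetical choices made for illustration.

```python
import math

def U(gamma):
    # Hypothetical sample function (a helix); components are a computational device only.
    return (math.cos(gamma), math.sin(gamma), 0.5 * gamma)

def U_prime(gamma):
    # Closed-form derivative of the sample helix, used as the reference value.
    return (-math.sin(gamma), math.cos(gamma), 0.5)

def dist(a, b):
    # Equation (4.2): the distance is the length of the difference.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def quotient(gamma, h):
    # The quotient (4.7): (U(gamma + h) - U(gamma)) / h.
    return tuple((p - q) / h for p, q in zip(U(gamma + h), U(gamma)))

gamma = 0.7
steps = (1e-1, 1e-2, 1e-3)
errors = [dist(quotient(gamma, h), U_prime(gamma)) for h in steps]
for h, err in zip(steps, errors):
    print(f"h = {h:g}: distance between quotient and derivative = {err:.2e}")
```

For this sample function, the printed distances decrease roughly in proportion to the step, which is the behavior the limit in (4.6) requires.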

4.2 The derivative of a vector function as the tangent to a curve

A vector-valued function

\mathbf{U}\left( \gamma\right)

has the inescapable geometric interpretation as the curve traced out by the tips of the vectors

\mathbf{U}\left( \gamma\right)

emanating from a single point

O

known as the origin. In this context, the vectors

\mathbf{U}\left( \gamma\right)

are referred to as the position vectors for the points on the curve. Since it is common to denote position vectors by the letter

\mathbf{R}

rather than

\mathbf{U}

, we will now switch to that convention and consider a vector-valued function

\mathbf{R} \left( \gamma\right)

and refer to it as the vector equation of the curve.

(4.9)

To reiterate: there is a natural one-to-one correspondence between vector-valued functions and curves. Any vector-valued function

\mathbf{R} \left( \gamma\right)

can be visualized as a curve with respect to a fixed origin

O

. Conversely, every curve represents a vector-valued function

\mathbf{R}\left( \gamma\right)

once a fixed origin

O

is chosen and a numerical value of a parameter

\gamma

is assigned to every point on the curve in an appropriate fashion. If the parameter

\gamma

represents time, then the curve associated with the function

\mathbf{R}\left( \gamma\right)

can be interpreted as a trajectory of a moving particle.

Given this interpretation of a vector-valued function

\mathbf{R}\left( \gamma\right)

, what is the meaning of the vector

\mathbf{R}^{\prime }\left( \gamma\right)

? We will now demonstrate that

\mathbf{R}^{\prime}\left( \gamma\right)

is a vector that points in the tangential direction to the curve. We intuitively understand the tangent line to be a straight line that "touches" the curve at a single point, typically without crossing it. Alternatively, we can think of the tangent as the straight line that presents itself when one sufficiently zooms in on the curve. Our present goal is to confirm that the definition of

\mathbf{R}^{\prime}\left( \gamma\right)

is consistent with our intuition for the tangent line. We will do so by examining the limiting procedure encoded in the analytical expression for

\mathbf{R} ^{\prime}\left( \gamma\right)

, i.e.

\mathbf{R}^{\prime}\left( \gamma\right) =\lim_{h\rightarrow0}\frac {\mathbf{R}\left( \gamma+h\right) -\mathbf{R}\left( \gamma\right) }{h}.\tag{4.10}

Once our intuition is confirmed, we will reverse the logic and define the tangent direction as the direction of the vector

\mathbf{R}^{\prime }\left( \gamma\right)

. Additionally, if

\mathbf{R}\left( \gamma\right)

is interpreted as the trajectory of a particle, where

\gamma

represents time, then the velocity of the particle will be defined to be

\mathbf{R}^{\prime}\left( \gamma\right)

.

The following figure shows

\mathbf{R}\left( \gamma\right)

for two points corresponding to two nearby values

\gamma

and

\gamma+h

of the parameter.

(4.11)

The difference

\mathbf{R}\left( \gamma+h\right) -\mathbf{R}\left( \gamma\right)

is the vector from the tip of

\mathbf{R}\left( \gamma\right)

to the tip of

\mathbf{R}\left( \gamma+h\right)

(4.12)

Importantly, note that the difference

\mathbf{R}\left( \gamma+h\right) -\mathbf{R}\left( \gamma\right)

is independent of the location of the origin

O

. This is why the location of

O

is almost always irrelevant. If

\gamma

is thought of as time,

\mathbf{R}\left( \gamma+h\right) -\mathbf{R}\left( \gamma\right)

corresponds to the displacement of the particle over the short period of time

h

. Dividing the difference

\mathbf{R}\left( \gamma+h\right) -\mathbf{R}\left( \gamma\right)

by

h

yields our initial approximation to the eventual vector

\mathbf{R}^{\prime}\left( \gamma\right)

(4.13)

When

\gamma

is thought of as time, the ratio

\frac{\mathbf{R}\left( \gamma+h\right) -\mathbf{R}\left( \gamma\right) }{h}\tag{4.14}

corresponds to the average velocity of the moving particle over a period of time

h

.

Finally, imagine what happens to the vector represented by the above ratio as

h

begins to approach zero, and observe that its direction begins to approach what we intuitively understand as the tangent to the curve.

(4.15)

Thus, we have justified the interpretation of the derivative

\mathbf{R} ^{\prime}\left( \gamma\right)

of a vector-valued function

\mathbf{R} \left( \gamma\right)

as the tangent to the curve represented by

\mathbf{R}\left( \gamma\right)

4.3 Differential analysis of the unit circle

Having gained this important insight, let us determine

\mathbf{R}^{\prime }\left( \gamma\right)

for a specific curve that will prove to be of great importance in our future investigations. Suppose that

\mathbf{R}\left( \gamma\right)

traces out the unit circle where the parameter

\gamma

corresponds to the central angle measured in the counterclockwise direction with respect to an arbitrary ray emanating from the center of the circle.

(4.16)

Since, as we remarked previously,

\mathbf{R}^{\prime}\left( \gamma\right)

is independent of the location of the origin

O

, let us place it at the center of the circle. We have already established that

\mathbf{R}^{\prime }\left( \gamma\right)

is tangential to the circle and it is evident that it points in the counterclockwise direction. Thus, the only remaining quantity to be determined is its length.

If we think of

\gamma

as time and the unit circle as the trajectory of a material particle corresponding to

\mathbf{R}\left( \gamma\right)

, then it is clear that

\mathbf{R}^{\prime}\left( \gamma\right)

is unit length. Indeed, as

\gamma

changes from

0

to

2\pi

, the particle makes a single revolution and thus travels a distance of

2\pi

. Therefore, its speed, i.e. the magnitude of

\mathbf{R}^{\prime}\left( \gamma\right)

, has the constant value of

\frac{2\pi}{2\pi}=1.\tag{4.17}

In other words,

\mathbf{R}^{\prime}\left( \gamma\right)

is a unit tangent vector that points in the counterclockwise direction.

(4.18)

The same conclusion regarding the magnitude of

\mathbf{R}^{\prime}\left( \gamma\right)

can be reached by a more formal calculation. Consider the configuration involving

\mathbf{R}\left( \gamma\right)

and

\mathbf{R} \left( \gamma+h\right)

for two nearby values of the parameter.

(4.19)

From the isosceles triangle with the vertices at

O

,

\mathbf{R}\left( \gamma\right)

, and

\mathbf{R}\left( \gamma+h\right)

, we calculate that

\operatorname{len}\left( \mathbf{R}\left( \gamma+h\right) -\mathbf{R} \left( \gamma\right) \right) =2\sin\left( h/2\right)\tag{4.20}

and therefore

\operatorname{len}\frac{\mathbf{R}\left( \gamma+h\right) -\mathbf{R}\left( \gamma\right) }{h}=\frac{\sin\left( h/2\right) }{h/2}.\tag{4.21}

From ordinary Calculus, we know that

\lim_{h\rightarrow0}\frac{\sin\left( h/2\right) }{h/2}=1.\tag{4.22}

Therefore,

\lim_{h\rightarrow0}\left( \operatorname{len}\frac{\mathbf{R}\left( \gamma+h\right) -\mathbf{R}\left( \gamma\right) }{h}\right) =1,\tag{4.23}

and thus

\mathbf{R}^{\prime}\left( \gamma\right)

is indeed unit length, consistent with the earlier conclusion based on our kinematic intuition.
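The chord-length identity (4.20) and the limit (4.23) are easy to check numerically. The sketch below is an illustration under stated assumptions: the origin is placed at the center of the circle, vectors are stored by components only for the sake of computation, and the names `R` and `chord_length` are hypothetical.

```python
import math

def R(gamma):
    # Position vector on the unit circle, origin at the center.
    return (math.cos(gamma), math.sin(gamma))

def chord_length(gamma, h):
    # Length of R(gamma + h) - R(gamma), the left side of (4.20).
    (x0, y0), (x1, y1) = R(gamma), R(gamma + h)
    return math.hypot(x1 - x0, y1 - y0)

gamma = 1.1
for h in (0.5, 0.1, 0.01):
    lhs = chord_length(gamma, h)
    rhs = 2.0 * math.sin(h / 2.0)   # right side of (4.20)
    print(f"h = {h:g}: chord = {lhs:.12f}, 2 sin(h/2) = {rhs:.12f}, chord / h = {lhs / h:.8f}")
```

The last column, which corresponds to the quantity in (4.21), moves toward 1 as the step decreases.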

Finally, for a circle of radius

r

, similarly parameterized by the central angle

\gamma

(4.24)

the derivative

\mathbf{R}^{\prime}\left( \gamma\right)

is the tangential vector of length

r

, i.e.

\operatorname{len}\mathbf{R}^{\prime}\left( \gamma\right) =r.\tag{4.25}

Proving this fact is left as an exercise.

4.4 The laws of vector differentiation

Geometric vectors are subject to three operations: addition, multiplication by scalars, and the dot product. Naturally, to each operation there corresponds its own differentiation rule.

4.4.1 The sum and product rules

Consider two vector-valued functions

\mathbf{U}\left( \gamma\right)

and

\mathbf{V}\left( \gamma\right)

. The derivative applied to their sum is governed by the familiar sum rule

\left( \mathbf{U}+\mathbf{V}\right) ^{\prime}=\mathbf{U}^{\prime} +\mathbf{V}^{\prime}.\tag{4.26}

For the product of a vector-valued function with a constant scalar, the differentiation rule reads

\left( c\mathbf{U}\right) ^{\prime}=c\mathbf{U}^{\prime}.\tag{4.27}

If the scalar

c

is itself a function of

\gamma

, then the product rule reads

\left( c\mathbf{U}\right) ^{\prime}=c^{\prime}\mathbf{U}+c\mathbf{U} ^{\prime}.\tag{4.28}

The proofs of these rules are left as exercises. While demonstrating the first two identities is entirely straightforward, the last one may pose a challenge. We recommend applying the same approach that we are about to apply to the dot product rule.

4.4.2 The dot product rule

We will now demonstrate that vector-valued functions satisfy the dot product rule

\left( \mathbf{U}\cdot\mathbf{V}\right) ^{\prime}=\mathbf{U}^{\prime} \cdot\mathbf{V}+\mathbf{U}\cdot\mathbf{V}^{\prime},\tag{4.29}

which is entirely analogous to the product rule in ordinary Calculus. Not surprisingly, we will also be able to demonstrate this rule by borrowing an argument from Calculus.

Let

F\left( \gamma\right) =\mathbf{U}\left( \gamma\right) \cdot\mathbf{V} \left( \gamma\right)\tag{4.30}

and consider the difference

F\left( \gamma+h\right) -F\left( \gamma\right) =\mathbf{U}\left( \gamma+h\right) \cdot\mathbf{V}\left( \gamma+h\right) -\mathbf{U}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma\right) .\tag{4.31}

On the right, subtract the combination

\mathbf{U}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma+h\right)

from the first term and add it to the second, i.e.

\begin{aligned}F\left( \gamma+h\right) -F\left( \gamma\right) & =\left( \mathbf{U}\left( \gamma+h\right) \cdot\mathbf{V}\left( \gamma+h\right) -\mathbf{U}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma+h\right) \right) \\& \quad+\left( \mathbf{U}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma+h\right) -\mathbf{U}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma\right) \right) .\end{aligned}\tag{4.32}

By the distributive law,

\begin{aligned}F\left( \gamma+h\right) -F\left( \gamma\right) & =\left( \mathbf{U}\left( \gamma+h\right) -\mathbf{U}\left( \gamma\right) \right) \cdot\mathbf{V}\left( \gamma+h\right) \\& \quad+\mathbf{U}\left( \gamma\right) \cdot\left( \mathbf{V}\left( \gamma+h\right) -\mathbf{V}\left( \gamma\right) \right) .\end{aligned}\tag{4.33}

Divide both sides by

h

, i.e.

\frac{F\left( \gamma+h\right) -F\left( \gamma\right) }{h}=\frac {\mathbf{U}\left( \gamma+h\right) -\mathbf{U}\left( \gamma\right) } {h}\cdot\mathbf{V}\left( \gamma+h\right) +\mathbf{U}\left( \gamma\right) \cdot\frac{\mathbf{V}\left( \gamma+h\right) -\mathbf{V}\left( \gamma\right) }{h}.\tag{4.34}

Finally, evaluate the limit as

h

approaches zero. By definition, the left side approaches

F^{\prime}\left( \gamma\right) =\left( \mathbf{U}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma\right) \rule{0pt}{12pt}\right) ^{\prime}.\tag{4.35}

Meanwhile, the two fractions on the right approach

\mathbf{U}^{\prime }\left( \gamma\right)

and

\mathbf{V}^{\prime}\left( \gamma\right)

, and

\mathbf{V}\left( \gamma+h\right)

approaches

\mathbf{V}\left( \gamma\right)

. Thus, the right side converges to

\mathbf{U}^{\prime}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma\right) +\mathbf{U}\left( \gamma\right) \cdot\mathbf{V}^{\prime }\left( \gamma\right)\tag{4.36}

leading to the identity

\left( \mathbf{U}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma\right) \rule{0pt}{12pt}\right) ^{\prime}=\mathbf{U}^{\prime}\left( \gamma\right) \cdot\mathbf{V}\left( \gamma\right) +\mathbf{U}\left( \gamma\right) \cdot\mathbf{V}^{\prime}\left( \gamma\right) ,\tag{4.37}

as we set out to show.
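As a sanity check, the dot product rule can be tested numerically by comparing the quotient on the left of (4.34) with the right side of (4.37). This is only a sketch: the functions `U` and `V`, their closed-form derivatives `dU` and `dV`, and the component representation are hypothetical choices made for illustration.

```python
import math

def U(g):
    return (math.cos(g), g * g, 1.0)

def V(g):
    return (math.exp(g), math.sin(g), g)

def dU(g):
    # Closed-form derivative of the sample U.
    return (-math.sin(g), 2.0 * g, 0.0)

def dV(g):
    # Closed-form derivative of the sample V.
    return (math.exp(g), math.cos(g), 1.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def F(g):
    # Equation (4.30): F is the dot product of U and V.
    return dot(U(g), V(g))

g, h = 0.4, 1e-6
lhs = (F(g + h) - F(g)) / h                      # quotient on the left of (4.34)
rhs = dot(dU(g), V(g)) + dot(U(g), dV(g))        # right side of (4.37)
print(lhs, rhs)
```

The two printed numbers agree to within an error comparable to the step, as expected of a one-sided quotient.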

4.5 The derivative of a constant-length vector function

As one demonstration of the dot product rule, let us show that if

\mathbf{U}\left( \gamma\right)

has constant length then

\mathbf{U} ^{\prime}\left( \gamma\right)

is orthogonal to

\mathbf{U}\left( \gamma\right)

. In algebraic terms, the fact that

\mathbf{U}\left( \gamma\right)

has constant length can be expressed with the help of the dot product by the equation

\mathbf{U}\left( \gamma\right) \cdot\mathbf{U}\left( \gamma\right) =c,\tag{4.38}

where

c

is a constant. An application of the dot product rule yields

\mathbf{U}^{\prime}\left( \gamma\right) \cdot\mathbf{U}\left( \gamma\right) +\mathbf{U}\left( \gamma\right) \cdot\mathbf{U}^{\prime }\left( \gamma\right) =0.\tag{4.39}

The two terms on the left are equal and therefore both equal zero. In other words, the dot product of

\mathbf{U}^{\prime}\left( \gamma\right)

and

\mathbf{U}\left( \gamma\right)

vanishes, i.e.

\mathbf{U}^{\prime}\left( \gamma\right) \cdot\mathbf{U}\left( \gamma\right) =0.\tag{4.40}

Thus,

\mathbf{U}^{\prime}\left( \gamma\right)

is indeed orthogonal to

\mathbf{U}\left( \gamma\right)

.

Orthogonality between

\mathbf{U}\left( \gamma\right)

and

\mathbf{U} ^{\prime}\left( \gamma\right)

for a constant-length function

\mathbf{U}\left( \gamma\right)

can also be derived from the previously observed fact that

\mathbf{U}^{\prime}\left( \gamma\right)

is tangential to the curve traced out by

\mathbf{U}\left( \gamma\right)

when the vectors

\mathbf{U}\left( \gamma\right)

emanate from a fixed point

O

. When the vectors represented by a constant-length function

\mathbf{U}\left( \gamma\right)

are arranged in such a way, the tips of

\mathbf{U}\left( \gamma\right)

trace out a circle, although not necessarily in a constant-speed fashion encountered in the previous Section. Since

\mathbf{U}^{\prime}\left( \gamma\right)

is tangential to the circle, it must be orthogonal to

\mathbf{U}\left( \gamma\right)

as the latter points in the radial direction.

For another interesting interpretation of orthogonality between a constant-length vector and its derivative, imagine a particle moving along a curved path with constant speed. Because the particle's trajectory is not straight, its velocity

\mathbf{V}\left( \gamma\right) =\mathbf{R}^{\prime}\left( \gamma\right)

is not constant. However, the magnitude of the velocity, i.e. speed, can remain constant even along a curved trajectory.

(4.41)

As we have established previously, the velocity vector

\mathbf{V}\left( \gamma\right)

is tangential to the trajectory. Its derivative

\mathbf{V}^{\prime}\left( \gamma\right)

, which is orthogonal to

\mathbf{V}

, is the acceleration

\mathbf{A}\left( \gamma\right)

. Thus, the fact established in this Section shows that the acceleration of a particle moving with constant speed is orthogonal to the trajectory.
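The orthogonality of a constant-length vector function and its derivative can also be observed numerically. In the sketch below, the unit-length function `U` is a hypothetical example whose tip moves over the unit sphere at a non-uniform rate, and the difference quotient from (4.6) stands in for the derivative. Components are used only to carry out the arithmetic.

```python
import math

def U(gamma):
    # Hypothetical unit-length function; both angles vary with gamma at different rates.
    a, b = gamma ** 2, 3.0 * gamma
    return (math.sin(a) * math.cos(b), math.sin(a) * math.sin(b), math.cos(a))

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

def quotient(gamma, h=1e-6):
    # Difference quotient from (4.6), standing in for U'(gamma).
    return tuple((p - q) / h for p, q in zip(U(gamma + h), U(gamma)))

gamma = 0.9
length_squared = dot(U(gamma), U(gamma))      # the constant c in (4.38)
residual = dot(quotient(gamma), U(gamma))     # approximates the left side of (4.40)
print(length_squared, residual)
```

The residual is not exactly zero because the quotient only approximates the derivative, but it is on the order of the step.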

We will now turn our attention to the important concepts of the directional derivative and the gradient of a scalar field.

4.6 The directional derivative

A field is a quantity defined at every point of a domain in a Euclidean space. We will consider fields of scalar, vector, and (eventually) variant quantities. An example of a scalar field is the temperature distribution in a room. The following figure shows a density plot of a two-dimensional scalar field, where the color of a given point corresponds to the value of the field. Meanwhile, each of the plotted contours, known as level sets, corresponds to a particular fixed value of the function defining the field.

(4.42)

An example of a vector field is the distribution of velocities in a fluid flow. A vector field is usually represented by plotting the vectors of the field at a strategic sampling of points.

(4.43)

The concept of the directional derivative can be applied to a field of any kind. As the name suggests, it measures the rate of change of the field in a particular direction.

Consider a scalar field

U

defined in a Euclidean space. We will now construct its directional derivative at a point

P

in the direction indicated by a unit vector

\mathbf{L}

. Let

l

be the ray that emanates from the point

P

in the direction of

\mathbf{L}

. The following figure shows these elements as well as the density plot for

U

(4.44)

For a small positive number

h

, find the point

P^{\ast}

along

l

whose Euclidean distance to

P

equals

h

. Denote the values of

U

at the two points by

U\left( P\right)

and

U\left( P^{\ast}\right)

(4.45)

The difference

U\left( P^{\ast}\right) -U\left( P\right)

represents the change in

U

from

P

to

P^{\ast}

, while the ratio

\frac{U\left( P^{\ast}\right) -U\left( P\right) }{h}\tag{4.46}

can be thought of as the average rate of change. As

h

approaches

0

, this quantity approaches the instantaneous rate of change of

U

in the direction

l

. This quantity is denoted by

dU/dl

and is known as the directional derivative of

U

in the direction

l

. In formal terms,

\frac{dU}{dl}=\lim_{h\rightarrow0}\frac{U\left( P^{\ast}\right) -U\left( P\right) }{h}.\tag{4.47}

The same definition can be applied to a vector field

\mathbf{U}

, i.e.

\frac{d\mathbf{U}}{dl}=\lim_{h\rightarrow0}\frac{\mathbf{U}\left( P^{\ast }\right) -\mathbf{U}\left( P\right) }{h}.\tag{4.48}

Note, once again, that vectors are subject to each of the operations featured on the right side of this equation. Also note that, like most of the concepts introduced in this book so far, the directional derivative is defined in pure geometric terms without a reference to a coordinate system.

Rather than directly relying on the concept of a limit, the directional derivative can be defined in terms of the ordinary derivative. Identify with every point in the Euclidean space the position vector

\mathbf{R}

emanating from an arbitrary origin

O

so that the scalar field

U

can be thought of as a function

U\left( \mathbf{R}\right)

of the position vector. Denote the position vector associated with the point

P

by

\mathbf{R}_{0}

. Then the expression

\mathbf{R}_{0}+s\mathbf{L}

captures the points along the ray

l

while

U\left( \mathbf{R}_{0}+s\mathbf{L}\right)

captures the corresponding values of

U

. Since

\mathbf{L}

is a unit vector, the parameter

s

represents the distance to the point

P

. Thus, the directional derivative

dU/dl

can be defined as the ordinary derivative of

U\left( \mathbf{R}_{0}+s\mathbf{L}\right)

with respect to

s

, i.e.

\frac{dU}{dl}=\frac{d}{ds}U\left( \mathbf{R}_{0}+s\mathbf{L}\right) .\tag{4.49}

Naturally, the same idea applies to vector fields. For a vector field

\mathbf{U}

, consider the vector-valued function

\mathbf{U}\left( \mathbf{R}_{0}+s\mathbf{L}\right)

and define

d\mathbf{U}/dl

as the derivative with respect to

s

of

\mathbf{U}\left( \mathbf{R}_{0}+s\mathbf{L}\right)

.
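The definition (4.47) translates directly into a short numerical procedure. The following is a minimal sketch under stated assumptions: points are identified with the components of their position vectors purely for computation, the helper `directional_derivative` is a hypothetical name, and the sample field is the distance to a fixed point placed at the coordinate origin.

```python
import math

def directional_derivative(field, R0, L, h=1e-6):
    # Quotient from (4.47), where P* corresponds to R0 + h L and L is a unit vector.
    P_star = tuple(r + h * l for r, l in zip(R0, L))
    return (field(P_star) - field(R0)) / h

def distance_to_origin(R):
    # Hypothetical sample scalar field: distance from the point to the origin.
    return math.hypot(R[0], R[1])

R0 = (3.0, 4.0)
radial = (0.6, 0.8)         # unit vector pointing directly away from the origin
orthogonal = (-0.8, 0.6)    # unit vector orthogonal to the radial direction

d_radial = directional_derivative(distance_to_origin, R0, radial)
d_orthogonal = directional_derivative(distance_to_origin, R0, orthogonal)
print(d_radial, d_orthogonal)
```

For this field, the quotient is close to 1 in the radial direction and close to 0 in the orthogonal direction: the distance grows at unit rate when moving directly away from the fixed point and, to first order, does not change when moving perpendicular to that direction.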

4.7 Directional derivative examples

Let us now consider several examples involving the directional derivative. In addition to serving as concrete illustrations of the concept, these examples will offer two further benefits. First, they will reinforce the pure geometric nature of our narrative. Second, they will yield results essential for a number of future applications, including the introduction of the covariant basis in Chapter 9 and of the Christoffel symbol in Chapter 12.

4.7.1 Directional derivative example 1

For the first example, let

U\left( P\right)

be the Euclidean distance

d

between

P

and a fixed point

O

, and determine

dU/dl

for

l

that points directly away from

O

(4.50)

The above figure illustrates

U\left( P\right)

and shows the points

P^{\ast}

featured in the definition

\frac{dU}{dl}=\lim_{h\rightarrow0}\frac{U\left( P^{\ast}\right) -U\left( P\right) }{h} \tag{4.47}

of the directional derivative.

The distance between

O

and

P^{\ast}

is

d+h

, i.e.

U\left( P^{\ast}\right) =d+h.\tag{4.51}

Consequently,

\frac{U\left( P^{\ast}\right) -U\left( P\right) }{h}=\frac{\left( d+h\right) -d}{h}=\frac{h}{h}=1\text{ for all }h.\tag{4.52}

Therefore,

\lim_{h\rightarrow0}\frac{U\left( P^{\ast}\right) -U\left( P\right) } {h}=1.\tag{4.53}

In words, the derivative

dU/dl

of

U

in the direction away from

O

equals

1

at all points, i.e.

\frac{dU}{dl}=1.\tag{4.54}

4.7.2 Directional derivative example 2

For a second example, consider the same scalar field

U

as in the previous example, but let the ray

l

point in the counterclockwise orthogonal direction to the segment

OP

. The construction necessary for evaluating

dU/dl

is shown in the following figure.

(4.55)

By the Pythagorean theorem, the distance between

O

and

P^{\ast}

is

\sqrt{d^{2}+h^{2}}

, i.e.

U\left( P^{\ast}\right) =\sqrt{d^{2}+h^{2}}.\tag{4.56}

Thus,

dU/dl

is given by

\frac{dU}{dl}=\lim_{h\rightarrow0}\frac{\sqrt{d^{2}+h^{2}}-d}{h}.\tag{4.57}

It is a matter of ordinary Calculus to show that the limit vanishes and, therefore,

\frac{dU}{dl}=0.\tag{4.58}
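For completeness, here is one way to carry out the ordinary Calculus step. It is a sketch that assumes the distance d is positive, i.e. that P does not coincide with O. Multiplying the numerator and the denominator by the conjugate expression removes the indeterminacy.

```latex
\frac{\sqrt{d^{2}+h^{2}}-d}{h}
=\frac{\left( \sqrt{d^{2}+h^{2}}-d\right) \left( \sqrt{d^{2}+h^{2}}+d\right) }{h\left( \sqrt{d^{2}+h^{2}}+d\right) }
=\frac{h^{2}}{h\left( \sqrt{d^{2}+h^{2}}+d\right) }
=\frac{h}{\sqrt{d^{2}+h^{2}}+d}
\longrightarrow\frac{0}{2d}=0
\quad\text{as }h\rightarrow0.
```

The final expression also shows that, for a small step, the quotient is approximately h divided by 2d, so the distance to O changes only at second order in the orthogonal direction.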

4.7.3 Directional derivative example 3

Let us now consider an example involving a vector field. Choose an arbitrary fixed point

O

and let

\mathbf{U}\left( P\right)

be the vector that points in the counterclockwise direction orthogonal to the segment

OP

and has length equal to the distance to the point

O

. Calculate

d\mathbf{U}/dl

for

l

pointing directly away from

O

.

(4.59)

The following figure shows the values of the field

\mathbf{U}

at two nearby points

P

and

P^{\ast}

along the ray

l

(4.60)

The vectors

\mathbf{U}\left( P\right)

and

\mathbf{U}\left( P^{\ast}\right)

point in the same direction and, therefore, the difference

\mathbf{U}\left( P^{\ast}\right) -\mathbf{U}\left( P\right)

also points in that direction. Since the lengths of

\mathbf{U}\left( P\right)

and

\mathbf{U}\left( P^{\ast}\right)

are

d

and

d+h

, the length of

\mathbf{U}\left( P^{\ast}\right) -\mathbf{U}\left( P\right)

is

h

. Consequently, the length of the ratio

\frac{\mathbf{U}\left( P^{\ast}\right) -\mathbf{U}\left( P\right) }{h}\tag{4.61}

is

1

for all

h

. We can, therefore, conclude that

d\mathbf{U}/dl

is a unit vector pointing in the counterclockwise direction orthogonal to

OP

as illustrated in the following figure.

(4.62)

4.7.4 Directional derivative example 4

Finally, let us calculate the directional derivative of the same vector field

\mathbf{U}

as in the previous example along the ray that points in the counterclockwise direction orthogonal to

OP

. The following figure shows two nearby points

P

and

P^{\ast}

along the ray

l

and the corresponding vectors

\mathbf{U}\left( P\right)

and

\mathbf{U}\left( P^{\ast}\right)

.

(4.63)

This time, the distance between

O

and

P^{\ast}

is

\sqrt{d^{2}+h^{2}}

and, therefore, the length of

\mathbf{U}\left( P^{\ast}\right)

is also

\sqrt{d^{2}+h^{2}}

. Shift the tail of

\mathbf{U}\left( P^{\ast}\right)

to the point

P

and construct the difference

\mathbf{U}\left( P^{\ast}\right) -\mathbf{U}\left( P\right)

by connecting the tips of

\mathbf{U}\left( P\right)

and

\mathbf{U}\left( P^{\ast}\right)

.

(4.64)

Observe that the triangle

OPP^{\ast}

is congruent to the triangle with the vertices at

P

and the tips of

\mathbf{U}\left( P\right)

and

\mathbf{U}\left( P^{\ast}\right)

by the two sides and the included angle criterion. Consequently, the vector

\mathbf{U}\left( P^{\ast}\right) -\mathbf{U}\left( P\right)

is orthogonal to

\mathbf{U}\left( P\right)

and has length

h

. Therefore, the vector

\frac{\mathbf{U}\left( P^{\ast}\right) -\mathbf{U}\left( P\right) }{h}\tag{4.65}

is orthogonal to

\mathbf{U}\left( P\right)

and has length

1

. We thus conclude that

d\mathbf{U}/dl

is a unit vector that points towards

O

, as illustrated in the following figure.

(4.66)
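The conclusions of the last two examples can be cross-checked numerically. The sketch below is an illustration only: it places the fixed point O at the origin of a hypothetical coordinate system and uses components solely to carry out the arithmetic, so that the field becomes a counterclockwise quarter-turn of the position vector.

```python
def U(P):
    # Rotate OP counterclockwise by a right angle; the length equals the distance to O.
    x, y = P
    return (-y, x)

def directional_derivative(field, P, L, h=1e-6):
    # Quotient from (4.48), where P* corresponds to P + h L and L is a unit vector.
    P_star = (P[0] + h * L[0], P[1] + h * L[1])
    a, b = field(P_star), field(P)
    return ((a[0] - b[0]) / h, (a[1] - b[1]) / h)

P = (3.0, 4.0)
radial = (0.6, 0.8)     # unit vector pointing directly away from O
ccw = (-0.8, 0.6)       # unit vector orthogonal to OP, counterclockwise

d_radial = directional_derivative(U, P, radial)   # expected close to ccw
d_ccw = directional_derivative(U, P, ccw)         # expected close to -radial, i.e. towards O
print(d_radial, d_ccw)
```

Both results have unit length. The first points in the counterclockwise direction orthogonal to OP, and the second points from P towards O.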

4.8 The directional derivative formula

We will now show that for any smooth scalar field

U

, the directional derivative

dU/dl

at a point

P

in the direction of the unit vector

\mathbf{L}

is given by the equation

\frac{dU}{dl}=\mathbf{G}\cdot\mathbf{L,}\tag{4.67}

where the vector

\mathbf{G}

depends on the point

P

but not the direction

\mathbf{L}

. In the next Section, we will identify the vector

\mathbf{G}

with the concept of the gradient.

Our derivation will rely on the fundamental idea underlying Calculus that in a small neighborhood of

x_{0}

, an ordinary function

U\left( x\right)

is given by

U\left( x\right) =U\left( x_{0}\right) +k\left( x-x_{0}\right) +o\left( h\right) ,\tag{4.68}

where

k

is the derivative of

U\left( x\right)

at

x_{0}

,

h

equals

x-x_{0}

, and

o\left( h\right)

is a quantity that approaches

0

faster than

h

. In other words, over a "sufficiently small" interval, a smooth function

U\left( x\right)

is "essentially linear".

An analogous statement can be made for a scalar field in a Euclidean space. Namely, over a "sufficiently small" neighborhood, a scalar field

U

is "essentially linear". In order to express this insight analytically, once again treat

U

as a function of the position vector

\mathbf{R}

emanating from an arbitrary point

O

and denote the position vector associated with the point

P

by

\mathbf{R}_{0}

. Then, in a small neighborhood of

P

,

U\left( \mathbf{R}\right)

essentially equals the sum of its value at

\mathbf{R} _{0}

and a linear function of the difference

\mathbf{R} -\mathbf{R}_{0}

. In other words,

U\left( \mathbf{R}\right)

is captured by the equation

U\left( \mathbf{R}\right) =U\left( \mathbf{R}_{0}\right) +\text{linear function of }\left( \mathbf{R}-\mathbf{R}_{0}\right) +o\left( h\right) ,\tag{4.69}

where

h

is the length of

\mathbf{R}-\mathbf{R}_{0}

and

o\left( h\right)

is once again a quantity that approaches

0

faster than

h

.

As we demonstrated in Exercise 2.5, a linear function of a vector argument can be uniquely expressed by a dot product with a fixed vector

\mathbf{G}

. Thus,

U\left( \mathbf{R}\right)

is given by

U\left( \mathbf{R}\right) =U\left( \mathbf{R}_{0}\right) +\mathbf{G} \cdot\left( \mathbf{R}-\mathbf{R}_{0}\right) +o\left( h\right) .\tag{4.70}

If

\mathbf{L}

is the unit vector pointing in the same direction as

\mathbf{R}-\mathbf{R}_{0}

, i.e.

\mathbf{R}-\mathbf{R}_{0}=h\mathbf{L,}\tag{4.71}

then the previous identity can be rewritten as

U\left( \mathbf{R}_{0}+h\mathbf{L}\right) =U\left( \mathbf{R}_{0}\right) +h\mathbf{G}\cdot\mathbf{L}+o\left( h\right) .\tag{4.72}

Dividing both sides by

h

, we find

\frac{U\left( \mathbf{R}_{0}+h\mathbf{L}\right) -U\left( \mathbf{R} _{0}\right) }{h}=\mathbf{G}\cdot\mathbf{L}+\frac{o\left( h\right) }{h}.\tag{4.73}

As

h

approaches

0

, the left side approaches

dU/dl

while the right side approaches

\mathbf{G}\cdot\mathbf{L}

. Therefore, in the limit, we have

\frac{dU}{dl}=\mathbf{G}\cdot\mathbf{L,} \tag{4.67}

as we set out to prove.

Comparing the equation

U\left( \mathbf{R}_{0}+h\mathbf{L}\right) =U\left( \mathbf{R}_{0}\right) +h\mathbf{G}\cdot\mathbf{L}+o\left( h\right)\tag{4.74}

with its "ordinary" counterpart

U\left( x_{0}+h\right) =U\left( x_{0}\right) +U^{\prime}\left( x_{0}\right) h+o\left( h\right) ,\tag{4.75}

we note that the vector

\mathbf{G}

plays a role analogous to that of

U^{\prime}\left( x_{0}\right)

. Thus, in a sense, we can think of

\mathbf{G}

as the "vector derivative" of the function

U\left( \mathbf{R}\right)

. This interpretation will be further cemented by the introduction of the concept of the gradient, which is our next topic.

4.9 The gradient of a scalar field

4.9.1 A geometric definition

For scalar fields, the directional derivative leads to the crucial concept of the gradient of a scalar field

U

. Let

l

be the direction of the greatest increase in a scalar field

U

at a point

P

. Then, by definition, the gradient

\mathbf{\nabla}U

, also denoted by

\operatorname{grad}U

, is a vector of length

dU/dl

that points in the direction

l

.

(4.76)

The concept of the gradient applies only to scalar fields since, as we discussed above, vectors cannot be compared in the same sense as numbers, i.e. for two vectors, there is no rule for determining which one is "greater".

Yet again, note the geometric nature of the newly introduced concept. For a given scalar field

U

, the gradient can be evaluated, at least conceptually, by pure geometric means without a reference to a coordinate system. Thus, our approach differs from that found in most textbooks where the gradient is defined as a collection of partial derivatives. Later in this Chapter, we will begin the task of reconciling the two approaches.

In the previous Section, we showed that the directional derivative of a scalar field

U

at a point

P

in the direction of the unit vector

\mathbf{L}

is given by the dot product

\frac{dU}{dl}=\mathbf{G}\cdot\mathbf{L,} \tag{4.67}

where the vector

\mathbf{G}

is independent of

\mathbf{L}

. You may not be surprised to find out that

\mathbf{G}

equals the gradient

\mathbf{\nabla} U

. Since

\mathbf{L}

has unit length, the definition of the dot product tells us that

\frac{dU}{dl}=\operatorname{len}\mathbf{G}\ \cos\gamma,\tag{4.77}

where

\gamma

is the angle between

\mathbf{G}

and

\mathbf{L}

. Since the greatest value of

\cos\gamma

is

1

and occurs when

\gamma=0

, the greatest possible value of

dU/dl

is

\operatorname{len}\mathbf{G,}\tag{4.78}

and also occurs when

\gamma=0

, i.e. when the unit vector

\mathbf{L}

points in the direction of

\mathbf{G}

. Thus, the vector

\mathbf{G}

indeed represents the direction of the greatest increase in

U

and, furthermore, has the magnitude that equals the rate of the greatest increase. In other words,

\mathbf{G}

is the gradient of

U

, as we set out to show. Having established this important connection, we can rewrite the directional derivative formula

\frac{dU}{dl}=\mathbf{G}\cdot\mathbf{L,} \tag{4.67}

in the form

\frac{dU}{dl}=\mathbf{\nabla}U\cdot\mathbf{L}.\tag{4.79}

One of the insights offered by this formula is the fact that knowing the value of the gradient at a given point is sufficient for determining the directional derivative

dU/dl

in any direction

l

. Furthermore, the directional derivative of a scalar field is

0

in any direction orthogonal to the gradient. Conversely, if the directional derivative in a particular direction

l

is

0

, then the gradient is orthogonal to

l

, provided that the gradient is not zero. In particular, the gradient is orthogonal to the level sets of

U

.
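These observations can be illustrated with a short numerical experiment. The sketch below assumes a hypothetical gradient `G` given by Cartesian components in a plane (a choice made purely for illustration) and samples the unit vector `L` at every degree. It then reports the sampled direction of steepest increase, the corresponding rate, and the rate in a direction orthogonal to `G`.

```python
import math

# Hypothetical gradient at some point P, in Cartesian components of a plane.
G = (3.0, 4.0)                      # its length is 5

def directional_derivative(G, angle):
    """dU/dl = G . L for the unit vector L at the given polar angle."""
    L = (math.cos(angle), math.sin(angle))
    return G[0] * L[0] + G[1] * L[1]

# Sample unit vectors L at every degree and pick the steepest direction.
angles = [math.radians(k) for k in range(360)]
best = max(angles, key=lambda a: directional_derivative(G, a))

grad_angle = math.atan2(G[1], G[0])
print("steepest sampled direction (deg):", math.degrees(best))
print("direction of G (deg):            ", math.degrees(grad_angle))
print("greatest sampled rate:", directional_derivative(G, best))
print("length of G:          ", math.hypot(*G))
print("rate orthogonal to G: ", directional_derivative(G, grad_angle + math.pi / 2))
```

The sampled maximum can only land on a whole number of degrees, so it agrees with the direction of `G` to within half a degree rather than exactly.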

4.10 Gradient examples

For the first example, again consider the function

U\left( P\right)

defined as the distance between the point

P

and an arbitrary fixed point

O

. It is intuitively clear that the gradient of

U\left( P\right)

at a point

P

is a unit vector that points directly away from the point

O

.

(4.80)

It is left as an exercise to demonstrate this fact analytically. Note that the gradient field

\mathbf{\nabla}U

is not defined at

O

, where it experiences a non-removable discontinuity.
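A numerical experiment can make this claim plausible before it is proved. The sketch below assumes Cartesian coordinates and takes on faith that the gradient's components are the partial derivatives of the field, a connection the text only establishes later. The point `O`, the point `P`, and the helper `numerical_gradient` are illustrative choices.

```python
import math

O = (1.0, -2.0, 0.5)                # arbitrary fixed point

def U(p):
    """Distance from the point p to the fixed point O."""
    return math.dist(p, O)

def numerical_gradient(field, p, h=1e-6):
    """Central-difference estimate of the Cartesian components of the gradient."""
    grad = []
    for i in range(len(p)):
        forward = list(p)
        forward[i] += h
        backward = list(p)
        backward[i] -= h
        grad.append((field(forward) - field(backward)) / (2.0 * h))
    return tuple(grad)

P = (4.0, 2.0, 12.5)
g = numerical_gradient(U, P)

OP = tuple(pi - oi for pi, oi in zip(P, O))          # components (3, 4, 12)
unit_away = tuple(c / math.hypot(*OP) for c in OP)   # unit vector from O to P

print("estimated gradient:     ", g)
print("unit vector from O to P:", unit_away)
print("length of the estimate: ", math.hypot(*g))
```

The estimate is not meaningful at `O` itself, where the difference quotients straddle the point at which the field fails to be differentiable.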

For a second example, choose a fixed point

O

along with a ray

l

emanating from

O

, and let

U\left( P\right)

be the angle between the segment

OP

and the ray

l

, subject to the condition that the angle varies between

0

and

2\pi

and is measured in the counterclockwise direction.

(4.81)

It is intuitively clear that the gradient points in the counterclockwise direction orthogonal to the segment

OP

and that its magnitude is inversely proportional to the distance

d

between

O

and

P

. To calculate the precise magnitude of

\mathbf{\nabla}U

, note that a step of

h

in the counterclockwise orthogonal direction from

P

results in the change in

U

that equals

\arctan\left( h/d\right)

.

(4.82)

Thus, the rate of change is given by the limit

\lim_{h\rightarrow0}\frac{\arctan\left( h/d\right) }{h}.\tag{4.83}

It is a matter of ordinary Calculus to show that this limit equals

1/d

, i.e.

\operatorname{len}\mathbf{\nabla}U=\frac{1}{d}.\tag{4.84}
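For completeness, here is one possible way to evaluate the limit. It treats the quotient as the derivative of the arctangent at zero, with the auxiliary variable x = h/d introduced only for this calculation.

```latex
\lim_{h\rightarrow0}\frac{\arctan\left( h/d\right) }{h}
=\frac{1}{d}\lim_{x\rightarrow0}\frac{\arctan x-\arctan0}{x}
=\frac{1}{d}\left. \frac{d}{dx}\arctan x\right\vert _{x=0}
=\frac{1}{d}\cdot\frac{1}{1+0^{2}}
=\frac{1}{d}
```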

The resulting gradient field is illustrated in the following figure.

(4.85)

Once again,

\mathbf{\nabla}U

is discontinuous at

O

.

4.11 The coordinate representation of the gradient

In all likelihood, in your first encounter with the gradient, it was defined as the collection of partial derivatives

\left( \frac{\partial U}{\partial x},\frac{\partial U}{\partial y} ,\frac{\partial U}{\partial z}\right)\tag{4.86}

with respect to Cartesian coordinates

x,y,z

. From this definition, it is then demonstrated that the elements of

\mathbf{\nabla}U

represent the components of a vector that points in the direction of the greatest increase of

U

, while the magnitude of that vector equals the corresponding greatest rate of increase. In other words, what we have adopted as the definition of the gradient appears as a consequence in the conventional approach.

Given that we now have two alternative definitions of the gradient, we must find a way to reconcile them. Furthermore, we ought to generalize the coordinate space expression for the gradient to general non-Cartesian coordinates. Looking ahead, the former task will be accomplished in Chapter 6 on coordinate systems while the latter will be accomplished in Chapter 9 in which some of the most fundamental coordinate-dependent objects will be introduced.
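As a preview of that reconciliation, the two descriptions can be compared numerically for the angle field from the second example of the previous Section. The sketch below rests on several assumptions that the text does not make: Cartesian coordinates with the origin at O, the reference ray along the positive x-axis, and central differences as a stand-in for the partial derivatives. It checks that the collection of partial derivatives has length 1/d, is orthogonal to OP, and points counterclockwise.

```python
import math

def U(x, y):
    """Angle of OP measured counterclockwise from the positive x-axis, in [0, 2*pi)."""
    return math.atan2(y, x) % (2.0 * math.pi)

def partials(field, x, y, h=1e-6):
    """Central-difference estimates of the partial derivatives in x and y."""
    dU_dx = (field(x + h, y) - field(x - h, y)) / (2.0 * h)
    dU_dy = (field(x, y + h) - field(x, y - h)) / (2.0 * h)
    return dU_dx, dU_dy

x, y = 3.0, 4.0                     # sample point P at distance d = 5 from O
d = math.hypot(x, y)
gx, gy = partials(U, x, y)

print("partial derivatives:", (gx, gy))
print("length of the pair: ", math.hypot(gx, gy))
print("1/d:                ", 1.0 / d)
print("dot product with OP:", gx * x + gy * y)
print("counterclockwise if positive:", x * gy - y * gx)
```

The sample point is deliberately kept away from the reference ray, where the angle jumps between 0 and 2π and the difference quotients would be meaningless.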

4.12 Exercises

Exercise 4.1 Show that for a circle of radius

r

described by the vector equation of the curve

\mathbf{R}\left( \gamma\right)

, where

\gamma

is the central angle, the vector

\mathbf{R}^{\prime}\left( \gamma\right)

is tangential to the circle and has magnitude

r

.

Exercise 4.2 For the same equation of the curve

\mathbf{R}\left( \gamma\right)

, describe the vectors

\mathbf{R}^{\prime\prime}\left( \gamma\right)

and

\mathbf{R}^{\prime\prime\prime}\left( \gamma\right)

in similar geometric terms.

4.12.1 Properties of vector differentiation

Exercise 4.3 Show that the derivative of vector-valued functions satisfies the sum rule

\left( \mathbf{U}+\mathbf{V}\right) ^{\prime}=\mathbf{U}^{\prime} +\mathbf{V}^{\prime}. \tag{4.26}

Exercise 4.4 Show that for a constant number

c

, the derivative of vector-valued functions satisfies the rule

\left( c\mathbf{U}\right) ^{\prime}=c\mathbf{U}^{\prime}. \tag{4.27}

Exercise 4.5 Show that for a scalar function

c\left( \gamma\right)

, the derivative of vector-valued functions satisfies the rule

\left( c\mathbf{U}\right) ^{\prime}=c^{\prime}\mathbf{U}+c\mathbf{U} ^{\prime}. \tag{4.28}

Exercise 4.6 Show that for a scalar function

\xi\left( \gamma\right)

, the derivative of the composite vector-valued function

\mathbf{U}\left( \xi\left( \gamma\right) \right)

satisfies the chain rule

\frac{d}{d\gamma}\mathbf{U}\left( \xi\left( \gamma\right) \right) =\mathbf{U}^{\prime}\left( \xi\left( \gamma\right) \right) \xi^{\prime }\left( \gamma\right) . \tag{4.28}

Exercise 4.7 Show that the derivative of vector-valued functions applied to the cross product satisfies the product rule

\left( \mathbf{U}\times\mathbf{V}\right) ^{\prime}=\mathbf{U}^{\prime} \times\mathbf{V}+\mathbf{U}\times\mathbf{V}^{\prime}.\tag{4.87}

4.12.2 Directional derivative and gradient exercises

Exercise 4.8 Given two fixed points

A

and

B

in a three-dimensional space, let

U\left( P\right)

be the area of the triangle

ABP

. Evaluate

dU/dl

for

\mathbf{L}

pointing in the direction parallel to

AB

.

Exercise 4.9 For the same function

U\left( P\right)

, evaluate

dU/dl

for

\mathbf{L}

pointing in the direction orthogonal to and away from

AB

within the plane of the triangle

ABP

.

Exercise 4.10 For the same function

U\left( P\right)

, describe the gradient

\mathbf{\nabla}U

in geometric terms. Show that if the vectors

\mathbf{P}

,

\mathbf{A}

, and

\mathbf{B}

correspond to the points

P

,

A

, and

B

, then

\mathbf{\nabla}U

points in the direction of the vector

\mathbf{P}+\frac{\left( \mathbf{P}-\mathbf{A}\right) \cdot\left( \mathbf{A}-\mathbf{B}\right) }{\left( \mathbf{A}-\mathbf{B}\right) \cdot\left( \mathbf{A}-\mathbf{B}\right) }\mathbf{B}+\frac{\left( \mathbf{P}-\mathbf{B}\right) \cdot\left( \mathbf{B}-\mathbf{A}\right) }{\left( \mathbf{B}-\mathbf{A}\right) \cdot\left( \mathbf{B}-\mathbf{A} \right) }\mathbf{A}\tag{4.88}

and has the same magnitude as the vector

\left( \mathbf{B}-\mathbf{A} \right) /2

.

Exercise 4.11 Given two fixed points

A

and

B

, let

U\left( P\right) =AP+BP

. Describe the direction and the magnitude of

\mathbf{\nabla}U

, illustrated in the following figure, in geometric terms.

(4.89)

Exercise 4.12 Confirm that

dU/dl=\mathbf{\nabla}U\cdot\mathbf{L}

for the first two examples in Section 4.7.

4.12.3 Motion of a material particle

Exercise 4.13 In Section 4.5, we showed that the acceleration of a particle moving with constant speed is orthogonal to its trajectory. Show that the converse is also true: if the acceleration of a particle is orthogonal to its trajectory, then its speed is constant.

Exercise 4.14 Consider a particle moving with constant speed. Show that

\operatorname{len}^{2}\mathbf{A}\left( t\right) =-\mathbf{V}^{\prime\prime }\left( t\right) \cdot\mathbf{V}\left( t\right) ,\tag{4.90}

where

\mathbf{A}\left( t\right) =\mathbf{V}^{\prime}\left( t\right)

is the acceleration vector.

Exercise 4.15 Show that the trajectory of a particle moving in the plane with constant speed and non-zero acceleration of constant magnitude is a circle.

4.12.4 Classical optimization problems

The minimization problems in the exercises below are intended to be solved by vector differentiation and therefore require the following caveats commonly found in optimization problems subject to differential analysis. First, all relevant curves must be sufficiently smooth. Second, the minimum is meant in the local sense. Third, in problems related to curves, the optimal point must lie in the interior of a curve.

Furthermore, most of the minimization problems below, when restated as maximization problems, would yield the same criterion. However, in many common situations -- for instance, when the relevant curves are unbounded -- a maximal solution does not exist while a minimal solution exists under a broader range of conditions. This is one of the reasons why most problems are stated as minimization problems.

Also note that in each of the problems, the criterion for a minimum can be stated in purely geometric terms. In fact, you will observe that such a geometric interpretation will always be an immediate and obvious conclusion of your differential analysis. This is a significant strength of our approach based on working with geometric vector quantities.

Finally, we should note that a geometric optimality criterion typically does not offer an algorithm for constructing the optimal solution. Of course, this is typical of solving optimization problems by differential analysis. Indeed, in ordinary Calculus, the extremal points of a function

f\left( x\right)

are given by the equation

f^{\prime}\left( x\right) =0

-- however, Calculus does not tell us how to solve this equation.

Exercise 4.16 Given a point

A

and a curve

\Gamma

, suppose that a point

B

is closest to

A

among all points on

\Gamma

. Demonstrate that the location of

B

is characterized by the fact that the segment

AB

is orthogonal to

\Gamma

.

(4.91)

Exercise 4.17 Given two non-intersecting curves

\Gamma_{1}

and

\Gamma_{2}

, suppose that the segment

AB

represents the shortest distance between the two curves. Demonstrate that

AB

is orthogonal to each curve.

(4.92)

Exercise 4.18 Heron's problem: given two points

A

and

B

on one side of a straight line

l

, suppose that the point

C

on

l

minimizes

AC+BC

. Show that the location of

C

is characterized by the "angle of incidence equals the angle of reflection" condition. Does your solution generalize from a straight line

l

to a curve

\Gamma

?

Exercise 4.19 The Torricelli point: given a triangle

ABC

, where the largest angle is below

120^{\circ}

, suppose that the point

X

minimizes the sum of the distances from

X

to the vertices. Demonstrate that

X

is "equiangular" with respect to the vertices, i.e.

\angle AXB=\angle BXC=\angle CXA=120^{\circ}.\tag{4.93}
