20211121

 


[begin transmission 4/?]

At long last, we're in a good position to discuss the discipline which has so thoroughly captured my heart.
We're ready to begin talking about optimal control theory.
Again, I offer the disclaimer that several details and subjects will be omitted.
This is simply as a consequence of the discipline being...somewhat involved, as you'll soon be able to tell.

But you'll get the basic idea, I promise.


A Brief History of Optimal Control Theory

Of course, we HAVE to talk about a little bit of the history of optimal control. It's non-negotiable. So, informally, optimal control is said to have its roots in ancient times. According to Virgil's Aeneid, the would-be queen of Carthage--Queen Dido--fled her Phoenician kingdom following the murder of her husband and the usurpation of the throne by her brother. She arrived in Libya, where she bargained w/ the king to be allocated as much land as could be covered by the hide of an ox; as you can imagine, this isn't very much. Being the sly gal that she was, Dido cut the ox hide into thin strips, tied them together into a rope, and enclosed a sizable plot of land along the shoreline. Hence, the birth of the first recorded optimization problem: given two fixed endpoints and a curve of fixed length joining them, construct the curve such that the area between it and the shoreline is maximized.

Figure 1. Queen Dido's optimization problem.

Of course, this is all according to Roman legend; who knows if it actually happened. No, canonically, optimal control has its origin in the 17th century, w/ the brachistochrone problem. Yet again, we return to the ginormous egos and intellects of the Enlightenment era; Johann Bernoulli submitted a challenge to all of his mathematician colleagues in the June 1696 issue of Acta Eruditorum:
Invitation to all mathematicians to solve a new problem.
If in a vertical plane two points A and B are given, then it is required to specify the orbit AMB of the movable point M, along which it, starting from A, and under the influence of its own weight, arrives at B in the shortest possible time. So that those who are keen on such matters will be tempted to solve this problem, it is good to know that it is not, as it may seem, purely speculative and without practical use. Rather it even appears, and this may be hard to believe, that it is very useful also for other branches of science than mechanics. In order to avoid a hasty conclusion, it should be remarked that the straight line is certainly the line of shortest distance between A and B, but it is not the one which is traveled in the shortest time. However, the curve AMB--which I shall divulge if by the end of this year nobody else has found it--is very well known among geometers.
Figure 2. The brachistochrone problem.

Six of the greatest mathematicians of the time offered solutions to the challenge: Johann Bernoulli himself, Bernoulli's elder brother Jakob, Leibniz, Tschirnhaus, l'Hospital, and Newton. The solutions were published a year later in the May 1697 issue. What I personally adore about this story is how Newton apparently published his solution anonymously in the Philosophical Transactions of the same year, and Bernoulli was able to discern that it was Newton's work. How? Why, "ex ungue leonem", of course: you can tell the lion by its claw. It speaks to just how astonishingly intelligent, creative, and unique these intellectuals were in their thinking; most of them were so well-studied that they could readily identify one another through their work alone. Keep in mind this was during an age when gaining access to information was onerous and a privilege for only a few; to be well-studied meant that one really had to have the interest, motivation, and passion to pursue the subject.

Though, I suppose a burning hatred for Newton would also foster the same level of scrutiny of the work. Johann Bernoulli absolutely despised Newton, to the extent that he even sharply chastised his own son, Daniel, for admiring Newton's work. Imagine being so salty towards the brilliance of another that it spills over to affect the successor generation...

Anyway, the brachistochrone problem posited by Bernoulli reflected the kinds of mathematical and physical problems academics were concerned w/ during that period, and his solution would inspire the framework for a mathematical discipline we now know as calculus of variations. This field was largely developed by Euler and Lagrange; Euler was a student of Bernoulli's, and Lagrange became interested in variational problems by reading Euler.

Fast forward to 1958. The Cold War between the U.S. and the Soviet Union is well underway, with each faction trying to one-up the other in terms of scientific and technological superiority. In this year, the Soviets presented to the International Congress of Mathematicians a "new" idea in dynamic programming/optimization (quotes here b/c it wasn't really new, but rather a slight modification to calculus of variations) that became known as Pontryagin's Maximum Principle. This idea proved revolutionary, since prior to the discovery the field of dynamic programming/optimization had been stagnating after the development of the Hamilton-Jacobi-Bellman equation earlier in the 50s by the Americans. We won't get into the weeds re: the HJB equation (its derivation takes up two whiteboards), but I only mention it to lend context to just how much of a breakthrough Pontryagin's Maximum Principle was. The HJB equation essentially takes an optimization problem and yields a necessary and sufficient condition a solution must fulfill in order to be considered optimal...unfortunately, it is in the form of a nonlinear partial differential equation. The problem w/ this is that solving nonlinear partial differential equations is HARD; back in the 50s, the computational resources were simply not available to accomplish this.

Pontryagin's Principle, instead, is far more computationally efficient and yields necessary conditions that a solution must fulfill in order to be considered optimal. "What the hell are you talking about now? Optimization problems? Necessary conditions? Solutions? I thought this was the historical background section?" Yes, yes dear reader. I'm getting a little bit ahead of myself here, aren't I? I'll clarify things very soon. Back to Pontryagin. His discovery, in summary, offered a new methodology that allowed mathematicians to tackle optimization problems that were previously far too unwieldy to undertake, and it is considered pivotal in helping the Soviets launch Sputnik into orbit in 1957. Equally important, it formalized modern optimal control as a discipline in its own right, rather than a mere offshoot of the calculus of variations.


Mathematics: The Optimal Control Problem

Christ, so where to begin...I suppose I should define for you in both mathematical terms and plain English what exactly an "optimal control problem" is. Okay, let's get to it. Consider the following gibberish:

\begin{equation} \textit{State:} \quad \textbf{x} := [x_1, x_2, \dots, x_n], \quad \textbf{x} \in \mathbb{R}^n \tag{1}\end{equation}
\begin{equation} \textit{Control:} \quad \textbf{u} := [u_1, u_2, \dots, u_m], \quad \textbf{u} \in \mathbb{R}^m \tag{2}\end{equation}
\begin{equation} \textit{Minimize:} \quad J[x(\cdot), u(\cdot), t_f] = E[x(t_f), t_f] + \int_{t_0}^{t_f} F[x(t), u(t), t] \, dt \tag{3}\end{equation}
\begin{equation} \textit{Subject to:} \quad \dot{x} = f(x(t), u(t), t) \tag{4}\end{equation}
\begin{equation} x(t_0) = x^0 \tag{5}\end{equation}
\begin{equation} t_0 = t^0 \tag{6}\end{equation}
\begin{equation} e(x(t_f), t_f) = 0 \tag{7}\end{equation}
\begin{equation} h(x(t), u(t), t) \leq 0 \tag{8}\end{equation}

This is the standard, generic form of an optimal control problem. Line-by-line:
  1. This is where you declare your state vector. A state vector is simply a vector (think of it as a list) of variables that describe your system. You can have as many states as you want (hence 'n'), and they assume values that are real numbers (hence the fancy R). Examples of states include x-position, y-position, z-position, velocity, angular rate, mass, voltage, fuel, temperature, pressure, etc. Generically, these are denoted by 'x' and a subscript. In practice, however, we use intuitive denominations for state variables, such as x for x-position, y for y-position, v for velocity, etc.

  2. This is where you declare your control vector. Same rules apply here as your state vector, but the difference is that a control vector describes variables of your system that you're able to manipulate. So this can be something like thrust, commanded voltage, commanded acceleration, jerk command, etc. Generically, these are denoted by 'u' and a subscript. In practice, I like to stick to this convention, but it doesn't have to be that way. In reality, you can label these variables whatever you please, so long as you can keep track of them and the math works out.

  3. Cost function or objective function. This is usually denoted by a J; all the stuff in brackets just means that it is dependent on your states, your controls, and the final time. Notice that there are two parts to the cost function: the E part, which denotes your terminal cost or Mayer cost (notice how it depends on your final state and the final time), and the F part, which denotes your running cost or Lagrangian cost (notice it sits under an integral; think of an integral as a continuous summation over time). The overarching goal of optimal control problems is to minimize/maximize the objective function. A ton of consideration goes towards this, so we'll expand on it later.

  4. Dynamics! f() just means that this is a system of differential equations, usually dependent on, you guessed it, your states, controls, and time. These equations describe the behavior of your system: how it evolves over time. Usually they're presented in their state-space representation. We'll go over this later too.

  5. Initial conditions. As stated before, in post 20210714, you need initial conditions to solve differential equations. Here is where you declare them, one for each of your state variables. We all have to start somewhere, right?

  6. Initial time. We all have to start at some time too.

  7. Endpoint conditions. These are conditions, expressed as mathematical equations, that must be fulfilled in the problem. An example could be something like, "My final position along the x-axis should be 2000 meters," or "My final velocity should be zero." There is a standard way to represent these conditions, which is why they're set to zero, but it isn't terribly important to elaborate on right now.

  8. Path constraints. These constraints, well...they constrain your problem. They're usually expressed as inequalities and are imposed on your state and/or control variables. So, for example, let's say that an automobile you're modeling is capable of a top speed of 217 mph. You might want to impose a path constraint on your velocity, specifying that it is not to exceed 217 mph. Imposing path constraints on your problem is a quick way to introduce complexity.

In plain English, the optimal control problem is as follows: given a set of state and control variables that evolve according to a specified set of dynamics, find the control variable trajectories that minimize/maximize the objective function such that all of the declared conditions are met and none of the constraints are violated. Easy-peasy.
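
To make the bookkeeping concrete, here's a minimal Python sketch of Eqs. 1-8 for a hypothetical toy problem: a double integrator, w/ position and velocity as the states and acceleration as the lone control. The function names and numbers are entirely my own (not any particular solver library's API), and this only spells out the pieces of the problem; it doesn't solve anything.

    import numpy as np

    # Toy double integrator: state x = [position, velocity], control u = [acceleration].
    # This only defines the pieces of Eqs. (1)-(8); it does not solve the problem.

    def dynamics(x, u, t):                    # Eq. (4): x_dot = f(x, u, t)
        position, velocity = x
        return np.array([velocity, u[0]])

    def running_cost(x, u, t):                # F in Eq. (3): minimum control effort
        return u[0] ** 2

    def terminal_cost(x_f, t_f):              # E in Eq. (3): no terminal cost here
        return 0.0

    x0 = np.array([0.0, 0.0])                 # Eq. (5): initial conditions
    t0 = 0.0                                  # Eq. (6): initial time

    def endpoint_conditions(x_f, t_f):        # Eq. (7): e(x(t_f), t_f) = 0
        return np.array([x_f[0] - 1.0,        # end at position 1.0...
                         x_f[1]])             # ...w/ zero velocity

    def path_constraints(x, u, t):            # Eq. (8): h(x, u, t) <= 0
        return np.array([abs(u[0]) - 1.0])    # |acceleration| limited to 1.0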


Kind Words Do Not Cost Much

Okay, I have to admit here that there's a small nagging concern of mine. That J thing I outlined in Eq. 3 above, the cost function? That isn't really a function but a functional. As I explained in post 20210621, a function takes a scalar (just a number) and maps it to another number. A functional, by contrast, takes functions and/or vectors and maps them to a scalar. I like to conceptualize it as 'scoring' a group of functions/vectors. Hence, the cost function is also sometimes referred to as a performance criterion or performance measure. I don't know why it's still referred to as a cost function...actually, now that I think about it, it may be b/c a similar concept is used in machine learning (the loss function), which also crosses over into optimization from time-to-time. Digressions aside, the cost function is arguably the crux of an optimal control problem. All optimal control problems and candidate solutions thereof concern themselves w/ its minimization or maximization. Fundamentally, there are four types of 'basic' optimal control problems w/ their corresponding cost functions (I'll add a small code sketch of these right after the list):

  1. Minimum Time. Bring the system from its starting point to a specified endpoint in minimum time. For example, bring a vehicle from point A to point B in the shortest time possible.
    \begin{equation}J[t_f] = t_f \tag{9}\end{equation}
  2. Minimum Control Effort. Bring the system from its starting point to a specified endpoint w/ the least amount of energy/fuel/control input expenditure. Bring a vehicle from point A to point B burning the least amount of fuel possible. The u here is squared b/c it may be allowed to assume negative values; squaring it allows us to account for those negative values.
    \begin{equation}J[u(\cdot), t_f] = \int_{t_0}^{t_f} u^2(t) \ dt \tag{10}\end{equation}
  3. Terminal State. Bring the system from its starting point to a desired endpoint r(tf) w/ the least amount of deviation possible. Bring a vehicle from point A to point B w/ as little deviation from point B as possible. The || signifies that we're taking the norm (the magnitude) of a vector; x(tf) and r(tf) are vectors. We're squaring the difference too, for the same reasons explained above.
    \begin{equation}J[x(t_f), t_f] = ||x(t_f) - r(t_f)||^2 \tag{11}\end{equation}
  4. Tracking. Maintain the system states as close as possible to the desired state r(t) over the time interval [t0, tf]. Bring a vehicle from point A to point B w/ as little deviation from the specified path as possible.
    \begin{equation}J[x(\cdot), t_f] = \int_{t_0}^{t_f}||x(t) - r(t)||^2 \ dt \tag{12}\end{equation}
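
As promised, here's a minimal numerical sketch of the four cost functionals above (Python; the names and the crude left-Riemann-sum discretization are my own choices, not a standard API). Notice how each one takes entire sampled trajectories as input and returns a single scalar score; that's the 'functional' idea in action.

    import numpy as np

    # Each cost functional maps whole sampled trajectories to one scalar.
    # t is an (N,) array of sample times, u an (N,) array of control values,
    # x an (N, d) array of states, and r / r_tf the reference quantities.

    def minimum_time(t):                       # Eq. (9): J = t_f
        return t[-1]

    def minimum_effort(u, t):                  # Eq. (10): integral of u^2
        return np.sum(u[:-1] ** 2 * np.diff(t))

    def terminal_state(x, r_tf):               # Eq. (11): squared miss distance at t_f
        return np.linalg.norm(x[-1] - r_tf) ** 2

    def tracking(x, r, t):                     # Eq. (12): integral of squared deviation
        deviation = np.linalg.norm(x - r, axis=1) ** 2
        return np.sum(deviation[:-1] * np.diff(t))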
Notice that all of these cost functions concern themselves w/ minimization. To switch it to maximization is easy enough; just add a negative sign. Also, note that there's nothing to stop you from chaining all or a subset of these cost functions together. Consider the following example:

Figure 3. Ho229 tracking problem.

Suppose we want to take the Ho229 unit from point A, set at coordinates (0, 0), to point B, set at coordinates (5, 3). Suppose we want to accomplish this in minimum time, w/ minimal fuel expenditure, all while ensuring we stick to the specified, solid path r(t). Our cost function would look something like this:

\begin{equation}J[x(\cdot), u(\cdot), t_f] = t_f + \int_{t_0}^{t_f} \left[ u^2(t) + ||x(t) - r(t)||^2 \right] dt \tag{13}\end{equation}

Now, think this through term-by-term. The longer we take to get from point A to point B, the larger tf is. The larger tf is, the larger J ultimately is. So to minimize J overall, it is in our interest to get to point B quickly. Similarly, the more fuel we burn, the larger that squared u term becomes, making J larger. Therefore it is in our interest to keep u at a minimum. Lastly, the further we deviate from the solid path (our actual path flown is the dashed line), the greater the difference between the two becomes, and the larger J gets. Therefore it is in our interest to minimize deviation from the solid path. As you can see, cost functions can be as simple or as complex as you'd want them to be, throwing in any combination of state or control variables to minimize/maximize. Thinking about it all...sheesh...I'm so smitten w/ how simply and elegantly cost functions capture notions that might otherwise seem abstract.
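
If you'd like to see that trade-off numerically, here's one more toy sketch (again, my own illustrative code; the reference path and trajectories are made up) that scores candidate trajectories against Eq. 13. The trajectory that hugs the solid path scores lower than the one that drifts off it, exactly as the term-by-term reasoning above suggests.

    import numpy as np

    # Score a candidate trajectory against Eq. (13). x and r are (N, 2) arrays of
    # planar positions sampled at times t; u is an (N,) array of control magnitudes.

    def combined_cost(x, u, r, t):
        dt = np.diff(t)
        integrand = u ** 2 + np.sum((x - r) ** 2, axis=1)  # effort + squared deviation
        return t[-1] + np.sum(integrand[:-1] * dt)         # t_f term + left Riemann sum

    t = np.linspace(0.0, 5.0, 101)
    r = np.column_stack([t, 0.6 * t])            # reference path from (0, 0) toward (5, 3)
    u = np.zeros_like(t)                         # pretend zero control effort for both
    x_on_path = r.copy()                         # hugs the reference path exactly
    x_off_path = r + np.array([0.0, 0.5])        # drifts 0.5 off the path the whole way
    print(combined_cost(x_on_path, u, r, t))     # 5.0  (just the t_f term)
    print(combined_cost(x_off_path, u, r, t))    # 6.25 (t_f term + accumulated deviation)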

I was going to elaborate on another important part of optimal control--arguably the most important: the dynamics. But I think I will save that for another post. This is enough explanation for now, and I'd like to take care not to overwhelm w/ too much information. See you guys in the next one.

[end transmission 4/?]