Notebook

Stabilising an Inverted Pendulum on a Cart¶

Examples - Mechanics¶

By Eilif Sommer Øyre, Niels Henrik Aase and Jon Andreas Støvneng

Last edited: April 20th 2019

Introduction¶

The inverted pendulum is a classic controllability example. Most likely, you have performed this kind of control yourself by balancing a broom or stick on your palm. In that case the pendulum was a physical pendulum and your arm the external driving force. In this notebook we will try to balance an inverted mathematical pendulum on a cart subjected by an external force acting on the cart's center of mass.

Figure 1: MIT Professor Alan V. Oppenheim in his lecture The Inverted Pendulum from the course Signals and Systems. Source: Oppenheim's lecture notes [1].

The figure above shows a common experimental setup for controlling an inverted pendulum. The pendulum is moved by pulling a suspended wire between the cart and an electrical motor. In this particular setup, the controller will balance the pendulum independently of the mass of the pendulum "bead", explaining the confidence when pouring more liquid into the glass.

Initially, we will model our pendulum system using Lagrangian mechanics and later control the pendulum using proportional control and then construct an algorithm to optimise the regulation based on quadratic weighing factors (linear quadratic regulator [LQR])

In [6]:

import numpy as np
import progressbar
import matplotlib.pyplot as plt
from matplotlib import animation
from IPython.display import display, Image
from scipy import signal
import control
%matplotlib inline

Theory Part I¶

Figure 2 below shows a schematic of the system used in this example. The inverted pendulum is fixed in one end on the top of a cart with a mass $M$. The rotation around the fixed point is assumed frictionless. The bead in the end of the pendulum has a mass $m$, while the stiff rod connecting the bead to the cart is massless. This makes the inverted pendulum a mathematical pendulum. The cart is able to move relative to the surface, i.e. only in the $x$-direction. Wheels are only included for illustration. The forces acting on the bead is the conservative gravitational force $\vec{G}$ in the negative $y$-direction, while the force acting on the cart is the external applied force $\vec{u}$ in the $x$-direction. The force $\vec{F}$ is an approximation of the friction on the entire system, also acting in the $x$-direction. The length of the massless rod is assigned $l$, and the angular displacement of the pendulum relative to $\vec{y}$ is given by $\theta$.

Figure 2: A schematic on the inverted pendulum. The bead angle $\theta$ is zero at the unstable (upright) equilibrium. The gravitational force, $\vec{G}$, acting on the bead with mass $m$, the friction on the system $\vec{F}$ and external force $\vec{u}$ is illustrated.

In [7]:

figSizeX = 12
figSizeY = 9

# System parameters
m = 0.1          # mass of bead  [kg]
M = 1            # mass of cart  [kg]
l = 0.2          # length of rod [m]
g = 9.81         # gravitational acceleration [m s^-2]
mu = 10          # friction coefficient [kg s^-1]

# Initial conditions of generalised coordinates
x0_0 = 0         # cart position [m]
x1_0 = 0         # rod angle [radians]
x2_0 = 0         # cart velocity [m s^-1]
x3_0 = 0         # rod angular velocity [radians s^-1]
u_0 = 0          # initial cart force  [kg m s^-2]

The system parameters above can be altered to get different motion and response of the system. As example, a more massive bead would require more external force to be balanced, and a longer rod would reduce the pendulum swing period.

Equations of motion¶

It is possible to derive the equations of motion using Newtonian forces, but the Lagrangian formulation is more elegant. A full derivation of the equations from the Lagrangian $$ L = T - U $$ is found in appendix A.

The system sketched in figure 2 has two holonomic constraints: The rod length $l$ is fixed, and the $y$-coordinate of the cart is always zero. This results in two generalised coordinates: The bead angle $\theta$, the cart position $x$ and their time derivatives $\dot{\theta}$ and $\dot{x}$. The corresponding equations of motion evaluates to

\begin{equation} \ddot{x} = \lambda(\theta)\big[mgl^2\dot{\theta}^2\sin\theta - mgl \sin\theta \cos\theta - l\mu \dot{x} + ul \big] \label{cartacc} \end{equation}\begin{equation} \ddot{\theta} = \lambda(\theta)\big[(m + M)g \sin\theta - mgl\dot{\theta}^2 \sin\theta \cos\theta + \mu\dot{x}\cos\theta - u\cos\theta \big] \label{angelacc} \end{equation}

where

\begin{equation} \lambda(\theta) = \frac{1}{l(M + m\sin^2\theta)} \label{lambda} \text{.} \end{equation}

The friction coefficient $\mu$ originates from the friction force $\vec{F}$, which is set to be proportional to the cart velocity, $F = -\mu\dot{x}$.

By renaming the generalized coordinates we get the state vector:

\begin{equation} \textbf{x} = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x \\ \theta \\ \dot{x} \\ \dot{\theta} \end{bmatrix} \text{.} \end{equation}

Then, by realising that $\dot{x_0} = x_2$ and $\dot{x_1} = x_3$, we get the state equation:

\begin{equation} \dot{\textbf{x}} = \frac{d}{dt} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_3 \\ \lambda(x_1)\big[ml^2{x_3}^2\sin x_1 - mgl \sin x_1 \cos x_1 - l\mu x_2 + ul \big] \\ \lambda(x_1)\big[(m + M)g \sin x_1 - ml{x_3}^2 \sin x_1 \cos x_1 + \mu x_2\cos x_1 - u\cos x_1 \big] \end{bmatrix} \text{.} \label{stateEquation} \end{equation}

The last two equations in the state vector are non-linear. Which means that the motion of the system is very hard to regulate. Consequently, we will have to linearise them. This is done in part II of the notebook.

Implementation - Part I¶

Simulating the system¶

To simulate the inverted pendulum on a cart we have to solve the state equation iteratively. In this notebook we will use the numerical method Runge-Kutta fourth order (RK4) to integrate the system of non-linear differential equations, \eqref{systemEquation}. If you have never heard of this method, or want a recap, check out this notebook for a practical introduction.

In addition to defining the RK4 step function, we define a function that returns the right hand side of \eqref{stateEquation} given a vector $\textbf{x}$ and the system parameters, and a function to calculate the systems potential and kinetic energy.

In [20]:

def RHS(z, RHSargs): 
        """Given a specified state vector and system parameters,
        z and RHSargs respectively, this function returns the 
        state vector element's time derivative, i.e. the right 
        hand side of the state equations.
        Parameters:
            z        4x1 array
            RHSargs  6x1 array
        Returns:
            dxdt     4x1 array
        """
        m, M, l, g, mu, u = RHSargs
        x0, x1, x2, x3 = z
        dx0dt = x2
        dx1dt = x3
        lambdax1 = 1/(l*(M + m*np.sin(x1)**2))
        
        dx2dt = lambdax1*(m*l*l*x3**2*np.sin(x1) - m*g*l*np.sin(x1)*np.cos(x1)
                            - l*mu*x2  + u*l)
        dx3dt = lambdax1*((m + M)*g*np.sin(x1) - m*l*x3*x3*np.sin(x1)*np.cos(x1)
                            + mu*x2*np.cos(x1) - u*np.cos(x1))
        
        dxdt = np.array([dx0dt, dx1dt, dx2dt, dx3dt])
                               
        return dxdt
                               

def RK4Step(z, dt, RHS, RHSargs):
        """ Performs one step of the 4th order Runge-Kutta method
        z is the current state vector, dt is the numerical time
        step, RHS is a function for the right hand side of the
        system of differential equations, and RHSargs is the input
        parameters for that function. RK4step returns the next
        state vector, i.e. the values of z after a time dt.
        Parameters:
            z         4x1 array
            dt        float
            RHS       function
            RHSargs   6x1 array
        Returns:
            The state vactor, z, after a time, dt.
        """
        k1 = np.array(RHS(z, RHSargs))
        k2 = np.array(RHS(z + k1*dt/2, RHSargs))
        k3 = np.array(RHS(z + k2*dt/2, RHSargs))
        k4 = np.array(RHS(z + k3*dt, RHSargs))
        
        return z + dt/6*(k1 + 2*k2 + 2*k3 + k4)

In [21]:

def addEnergy(z):
    """ Given the the state vector, z, calculates and returns the kinetic
    and potential energy of the system in an arrray. See appendix B for the
    energy term formulas.
    """
    x0, x1, x2, x3 = z
    U = m*g*l*np.cos(x1)
    T = 0.5*x2**2*(m + M) + 0.5*m*x3**2*l**2 + m*x2*x3*l*np.cos(x1)
    return np.array([T, U])

Now, we can integrate the state equation after defining a step size, $\Delta t$, and the integration duration $t_{max}$. The matrix Z is initialised to contain the state vector at any time $t$. While the initial angle of the pendulum is zero, it will never fall down because the x-component to the gravitational force is excactly zero. So, to make the situation more realistic, a random noise is added as external force, $u$, in every iteration. The noise could correspond to e.g. a wind gust (isolated on the cart).

In [26]:

duration = 10                # Integration duration [sec]
timeStep = 1e-2              # Time step 
n = int(duration/timeStep)   # Number of datapoints

# Initialising matrices
RHSargs = np.array([m, M, l, g, mu, u_0]) # System parameters
Z = np.zeros((4, n))                      # Z = [[x0_0, ... x0_n], [x1_0, ... x1_n], [x2,...,], [x3,...,]
# Inserting initial state vector into Z
Z[:, 0] = np.array([x0_0, x1_0, x2_0, x3_0])
# Initialising matrix containing the noise and potential force on the cart
extForce = np.zeros((2, n))
extForce[0, 0] = u_0

# Initialising matrix containing the kinetic and potential energy of the system
energy = np.zeros((2, n))    # energy = [[T_0 ... T_n], [U_0, ... U_n]]
energy[:, 0] = addEnergy([x0_0, x1_0, x2_0, x3_0])

bar = progressbar.ProgressBar()
for i in bar(range(n - 1)):
    # Compute and insert the next state vector using RK4
    Z[:, i + 1] = RK4Step(Z[:, i], timeStep, RHS, RHSargs)
    # Generating a random noise value
    noise = np.random.uniform(-0.001, 0.001)
    extForce[1, i +1] = noise
    # Updating the system parameter u
    RHSargs[5] = noise
    # Add the energy terms
    energy[:, i + 1] = addEnergy(Z[:, i + 1])

100% (999 of 999) |######################| Elapsed Time: 0:00:00 Time:  0:00:00

Next, we make a phase-space plot of the generalised coordinates in addition to the evolution to the state vector. The definition of the plotting function is found in Appendix B.

In [27]:

plotData(Z, extForce, energy, appliedForce=False)

The phase-space plot of $\theta$ reveals a movement similar to a simple pendulum with friction. It spirals in towards zero angular velocity and an angle $\theta = \pi$. This point would correspond to the pendulum hanging downward at rest, which is the intuitive result if no force tries to maintain the inverted form. The two lowermost plots shows how the total energy of the system reduces with time. This is due to the friction force ($\vec{F}$ in figure 2) working in the opposite direction of the cart's velocity. The bigger the friction coefficient $\mu$, the higher the energy loss. As seen from the time evolution plots, the pendulum stays inverted some time before noise pushes it over. The phase-space path for $q_2(x, \dot{x})$ however has a not so harmonic motion. It seems to experience a high acceleration for a short time two times each periodic motion. Let's make an animation to see better whats going on. The definition of the animation function is found in Appendix C. The animation below was generated using the code:

generateAnimationWithPhaseSpace(Z, 'falling_phsp.gif')

falling

As seen from the animation, the cart slides back and forth due to the pendulum.

Now, in our state equation \eqref{stateEquation} (equations of motion) we have the variable $u$ which let us exert a external force on the cart. In the next part of the notebook we will utilise this variable to try to stabilise the pendulum. But first, the state equation has to be linearised.

Theory Part II¶

Linearising around equilibrium¶

The non-linear equations of motion can be linearised to the form

\begin{equation} \dot{\textbf{x}} = \mathcal{A}\textbf{x} + \mathcal{B}u \end{equation}

around an equilibrium. The inverted pendulum in our example has two equilibrium points. The stable equilibrium, corresponding to a downward hanging pendulum

\begin{equation} \begin{cases} x_0 &= \mathrm{free} \\ x_1 &= \pi \\ x_2 &= 0 \\ x_3 &= 0 \end{cases} . \end{equation}

And the unstable equilibrium, corresponding to the pendulum being carefully balanced upright. It's unstable in the sense that any perturbation from its point, for example noise in the form of a small wind gust, will cause it to fall down.

\begin{equation} \begin{cases} x_0 &= \mathrm{free} \\ x_1 &= 0 \\ x_2 &= 0 \\ x_3 &= 0 \end{cases} . \end{equation}

The latter is the equilibrium we are interested in.

The matrices $\mathcal{A}$ and $\mathcal{B}$ is given by the Jacobian Matrix,

\begin{equation} \mathcal{J} = \big [ \frac{\partial \textbf{f}}{\partial x_1} \cdots \frac{\partial \textbf{f}}{\partial x_n} \big ] = \begin{bmatrix} \frac{\partial f_0}{\partial x_0} & \cdots & \frac{\partial f_0}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_0} & \cdots & \frac{\partial f_n}{\partial x_n} \end{bmatrix} \end{equation}

differentiated with respect to the variables $x$ and $u$, respectively [2]. The derivatives are then evaluated at an equilibrium point. Here $f_i$ is the components of the state equation $\eqref{stateEquation}$

\begin{equation} \dot{\textbf{x}} = \frac{d}{dt} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_3 \\ \lambda(x_1)\big[ml^2{x_3}^2\sin x_1 - mgl \sin x_1 \cos x_1 - l\mu x_2 + ul \big] \\ \lambda(x_1)\big[(m + M)g \sin x_1 - ml{x_3}^2 \sin x_1 \cos x_1 + \mu x_2\cos x_1 - u\cos x_1 \big] \end{bmatrix} \text{.} \end{equation}

When calculating the Jacobian matrix at the unstable equilibrium, every coordinate except $x_0$ goes to zero after differentiating. This gives

\begin{equation} \mathcal{A} = \begin{bmatrix} \frac{\partial f_0}{\partial x_0} \big |_{0} & \frac{\partial f_0}{\partial x_1} \big |_{0} & \frac{\partial f_0}{\partial x_2} \big |_{0} & \frac{\partial f_0}{\partial x_3} \big |_{0} \\ \frac{\partial f_1}{\partial x_0} \big |_{0} & \frac{\partial f_1}{\partial x_1} \big |_{0} & \frac{\partial f_1}{\partial x_2} \big |_{0} & \frac{\partial f_1}{\partial x_3} \big |_{0} \\ \frac{\partial f_2}{\partial x_0} \big |_{0} & \frac{\partial f_2}{\partial x_1} \big |_{0} & \frac{\partial f_2}{\partial x_2} \big |_{0} & \frac{\partial f_3}{\partial x_3} \big |_{0} \\ \frac{\partial f_3}{\partial x_0} \big |_{0} & \frac{\partial f_3}{\partial x_1} \big |_{0} & \frac{\partial f_3}{\partial x_2} \big |_{0} & \frac{\partial f_3}{\partial x_3} \big |_{0} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & \frac{-mg}{M} & \frac{-\mu}{M} & 0 \\ 0 & \frac{g(m + M)}{lM} & \frac{\mu}{ML} & 0 \end{bmatrix} \end{equation}

and

\begin{equation} \mathcal{B} = \begin{bmatrix} \frac{\partial f_0}{\partial u} \big |_{0} \\ \frac{\partial f_1}{\partial u} \big |_{0} \\ \frac{\partial f_2}{\partial u} \big |_{0} \\ \frac{\partial f_3}{\partial u} \big |_{0} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{M} \\ \frac{-1}{Ml} \end{bmatrix} \text{.} \end{equation}

One could perform a test to show that the system is controllable with the matrices $\mathcal{A}$ and $\mathcal{B}$, but in our case we will assume it is.

Now, by defining the controller

$$ u = -\mathcal{K}\textbf{x}, $$

we get the linearised form

\begin{equation} \dot{\textbf{x}} = (\mathcal{A} - \mathcal{BK})\textbf{x}, \label{controller} \end{equation}

where $\mathcal{K}$ is the gain matrix. It is a proportional gain as the control variable (the external force $u$) is proportional to the state vector $\text{bf}$. This allows us to choose a $\mathcal{K}$ such that the eigenvalues of the matrix in the parenthesis are all stable, in essence negative (which indicate negative feedback) [1, 3].

Implementation Part II¶

Proportional gian controller¶

In this part we do a similar implementation as above, in addition to calculate the gain matrix. Then, by adding the control force in each integration step, a feedback loop is created. To retrieve a $\mathcal{K}$ matrix such that the eigenvalues of $(\mathcal{A} - \mathcal{BK})$ becomes the wanted eigenvalues poles, scipy's signal.place_poles package is utilised. This package simply returns the gain matrix given the poles, $\mathcal{A}$ and $\mathcal{B}$.

In [33]:

# Redefining the initial rod angle in such a way 
# that it starts out by falling
x1_0 = 0.2

# Defining the setpoints. I.e. the values of the 
# state vector needed for stabilasion
# x0 can be choosed freely, we choose -0.2
setpoints = [-0.2,   
                0,
                0,
                0]

# Defining the control matrices A and B
A = np.array([[0, 0, 1, 0],
            [0, 0, 0, 1],
            [0, -m*g/M, -mu/M, 0],
            [0, (m+M)*g/(M*l), mu/(M*l), 0]])
                    
B = np.array([[0], 
            [0], 
            [1/M],
            [-1/(M*l)]])

# Specifying wanted eigenvalues of the matrix (A - B*K)
# Subscript PG stands for Proportional Gain
poles_PG = [-1.3, -1.4, -1.5, -1.6]
# Get the full state feedback bunch object
fsf = signal.place_poles(A, B, poles_PG)
# Extract the gain matrix
K_PG = fsf.gain_matrix

def controller(z, K):
    """ Given the current values of the state vactor, z
    and the gain matrix K, it returns the current control
    input, u.
    """
    offset = z - setpoints
    u = -np.dot(K, offset)
    return u[0]


def integrateControlled(K, pushForce, pushTime):
    """ This is the integration process implemented earlier,
    but with controll input added to the force. The possibility
    for a 'push' on the pendulum is also added.
    """
    # Re-initialising matrices
    RHSargs = np.array([m, M, l, g, mu, u_0])
    # Z = [(x0_0, x0_1, ..., x0_n), (x1_0, x1_1, ..., x1_n), (x2, ...,), (x3, ...,)]
    Z = np.zeros((4, n))
    Z[:, 0] = np.array([x0_0, x1_0, x2_0, x3_0])
    # Array containing the external forces at each time step
    extForce = np.zeros((2, n))
    extForce[0, 0] = u_0

    # Initialising matrix containing the kinetic and potential energy of the system
    energy = np.zeros((2, n))    # energy = [[T_0 ... T_n], [U_0, ... U_n]]
    energy[:, 0] = addEnergy([x0_0, x1_0, x2_0, x3_0])

    bar = progressbar.ProgressBar()
    for i in bar(range(n-1)):
        Z[:, i + 1] = RK4Step(Z[:, i], timeStep, RHS, RHSargs)
        
        # "Push"
        #  Non-continuously increasing/reducing the angular velocity
        if i == pushTime and i>0:
            Z[3, i + 1] += pushForce
            print("Pushed!")
        
        noise = np.random.uniform(-0.01, 0.01)
        extForce[1, i +1] = noise
        # Calculate the control variable
        force = controller(Z[:, i + 1], K)
        # Store the force, u
        extForce[0, i + 1] = force
        # Adding the total force and updating the parameters
        f = noise + force
        RHSargs[5] = f
        # Adding energy terms
        energy[:, i + 1] = addEnergy(Z[:, i + 1])
        
    return Z, extForce, energy

Z_PG, extForce_PG, energy_PG = integrateControlled(K_PG, 0, 0)

100% (999 of 999) |######################| Elapsed Time: 0:00:00 Time:  0:00:00

In [34]:

plotData(Z_PG, extForce_PG, energy_PG, appliedForce=True)

And the pendulum is balanced! The state vector elements oscillates around the setpoints before stabilising. The initial points are on positive slopes in phase-space, which indicates an unstable initial position. At the final point the the slope is negative (at $\theta = 0$) which is made possible by the control force.

Linear Quadratic Regulator (LQR)¶

We have now been able to stabilise the pendulum on its upright equilibrium, but while doing so we choose some random negative eigenvalues (poles) of $(\mathcal{A} - \mathcal{BK})$. The eigenvalues can be adjusted even more negative, resulting in a more powerful regulation. However, in an engineering situation, one will have a finite maximum aplicable force $u$, other physical limitations, and may want some of the coordinates to stabilise quicker than the others. Thus, it would be attractive to optimise the control. The cost function [4]

\begin{equation} J(u) = \int^{\infty}_0 \big(\textbf{x}^{T}\mathcal{Q}\textbf{x} + u^{T}Ru)dt \label{CostFunction} \end{equation}

is a sum of the deviations of the state vector elements from the set points and the applied force. By minimising this function with weighing factors supplied by the designer ($\mathcal{Q}$ and $R$), one can find the poles which minimise the undesired deviations. The value of the function can be regarded as a penalty. If the system is not stabilised quickly, the first term is going to be large. If the controller spends a lot of energy stabilising, the second term will be large. Say e.g. we dislike having a high angular velocity on our system, and have a beefy motor with cheap electricity, the weighing factors may be defined

\begin{equation} \mathcal{Q} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 10 \end{bmatrix} , \qquad R = 0.1 \end{equation}

The python package control contains a function lqr which optimises the cost function for us, and simply return the gain matrix $\mathcal{K}$ corresponding to our weighing factors. It has the same functionality as the MATLAB function lqr [5].

In [47]:

# Weighing factor for the state vector
Q = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 10]])
# Weighing factor for the external force
R = 0.1

# Find the gain matrix and poles using lqr-function
K_LQR, s, poles_LQR = control.lqr(A, B, Q, R)

# Integrate and retrieve the data
Z_LQR, extForce_LQR, energy_LQR = integrateControlled(K_LQR, 0, 0)
plotData(Z_LQR, extForce_LQR, energy_LQR, appliedForce=True)

100% (999 of 999) |######################| Elapsed Time: 0:00:00 Time:  0:00:00

The code

generateAnimation(Z_PG, Z_LQR, "lqr_controlled.gif")

generates the animation below. See appendix C for the definition of the function.

controlled

From the plots and amimation above we see a great improvement in stabilisation time. The angular velocity reaches the set point after two seconds while the controller spends a longer time stabilising the position and velocity of the cart, reflecting the weighing factors. There is also a lot less movement in the system with LQR. This is a direct result of "punishing" deviations.

Let's give the pendulum a little "push" after 4 seconds and see what happens. With a time step of 0.01 seconds, the push is timed at iteration number 400.

In [48]:

Z_LQR_pushed, extForce_LQR_pushed, energy_LQR_pushed = integrateControlled(K_LQR, .5, 400)
plotData(Z_LQR_pushed, extForce_LQR_pushed, energy_LQR_pushed, appliedForce=True)

100% (999 of 999) |######################| Elapsed Time: 0:00:00 Time:  0:00:00

Pushed!

We see that the pendulum is quickly re-stabilised after the push which dramatically (discontinuously) alter the slope of the phase-space path.

Resources and Further Reading¶

[1] Prof. Alan V. Oppenheim, Demonstration from lecture notes 26, Feedback example: The Inverted Pendulum, from the MIT course Signals and Systems. Assessed January 31st 2019. Click here for Lecture notes. Click here for video lecture.

[2] Packard, Polo, Horowitz, 2002, Jacobien Linearizations, equilibrium points, from Dynamic Systems and feedback. Assessed March 8th.

[3] Davis, 9.3. Equilibrium: Stable or Unstable?, Department of mathematics, University of Texas at Austin. Assessed March 8th.

[4] Wikipedia has a nice description of LQR and briefly illuminates its application potentials.

[5] MathWorks, documentation of the function lqr.

[6] Prof. Jerrold D. Marsden, 1 Lecture: Lagrangian Mechanics, Caltech, November 21st 2006. Assessed February 4th 2019.

Appendix A: Derivations of Equations of motion¶

The following equations of motion correspond to the system in figure 2. Firstly, an expression for the kinetic energy and the potential energy, $T$ and $U$, of the system is found. Then, from the Lagrangian $L = T - U$, the equations of motion are computed using the Euler-Lagrange equation

\begin{equation} \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 \text{.} \label{EL} \end{equation}

Since the rod is massless, the kinetic energy of the system is the sum of the kinetic energy of the cart and the bead,

$$ T = \frac{1}{2}M v^{2}_c + \frac{1}{2}m v^{2}_b .$$

The cart is fixed in the y-axis,

$$ v^2_c = \dot{x}_c^2 + \dot{y}_c^2 = \dot{x}_c^2 .$$

Using the angle of the bead, $\theta$ relative to the y-axis,

\begin{equation} \begin{aligned} x_b &= x_c + l \sin \theta \\ y_b &= l\cos \theta \\ \rightarrow \dot{x}_b &= \dot{x}_c +\dot{\theta}l \cos \theta \\ \dot{y}_b &= -\dot{\theta} l \sin \theta \\ \rightarrow v^2_b &= \dot{x}^2_c + \dot{\theta}^2 l^2 + 2\dot{x}_c \dot{\theta}l \cos \theta \end{aligned} \end{equation}

Thus, with the potential energy

$$ U = mgl\cos \theta ,$$

and rewriting $x_c \rightarrow x$, the Lagrangian becomes

\begin{equation} L = \frac{1}{2} \dot{x}^2(m + M) + \frac{1}{2}m\dot{\theta}^2l^2 + m\dot{x}\dot{\theta}l\cos\theta - mgl\cos\theta \label{L} \end{equation}

Applying \eqref{EL} to the generalized coordinates $q_i = \theta$, we get

\begin{equation} \ddot{\theta} = \frac{g}{l}\sin\theta - \frac{\ddot{x}}{l}\cos\theta . \label{I} \end{equation}

The second generalized coordinate $q_i = x$ is involved in the external force $u$ and the non-conservative force friction. Thus, the Euler-Lagrange equation is replaced by the Lagrange-d'Alambert principle, which is equivalent to the Euler-Lagrange equations with external force [6]:

\begin{equation} \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = F_{ext}(q, \dot{q}) \text{.} \label{ELEF} \end{equation}

This yields the equation of motion

\begin{equation} \ddot{x} = \frac{1}{m+M}\big [ ml\big(\dot{\theta}^2\sin\theta - \ddot{\theta}\cos\theta\big) - \mu\dot{x} + u \big ], \label{II} \end{equation}

where $-\mu\dot{x}$ is an approximation of the friction force on the system ($\mu$ is the friction coefficient), and $u$ is the external force applied to the cart.

On matrix form \eqref{I} and \eqref{II} can be written

\begin{equation} \begin{bmatrix} m+M & ml\cos\theta \\ \cos\theta & l \end{bmatrix} \begin{bmatrix} \ddot{x} \\ \ddot{\theta} \end{bmatrix} = \begin{bmatrix} ml\dot{\theta}^2\sin\theta - \mu\dot{x} \\ g\sin\theta \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u . \end{equation}

Finally, by defining

\begin{equation} \Lambda(\theta) = \begin{bmatrix} m+M & ml\cos\theta \\ \cos\theta & l \end{bmatrix} \end{equation}

the resulting equations of motion becomes

\begin{equation} \begin{bmatrix} \ddot{x} \\ \ddot{\theta} \end{bmatrix} = \Lambda(\theta)^{-1} \begin{bmatrix} ml\dot{\theta}^2\sin\theta - \mu\dot{x} \\ g\sin\theta \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u , \label{EQM} \end{equation}

where $\Lambda(\theta)^{-1}$ is the inverse of $\Lambda(\theta)$,

\begin{equation} \Lambda(\theta)^{-1} = \frac{1}{l(M+m\sin^2\theta)} \begin{bmatrix} l & -ml\cos\theta \\ -\cos\theta & m+M \end{bmatrix}. \end{equation}

Appendix B - Definition of `plotData`¶

In [17]:

def relative(data):
    """Returns the values of 'data' relative to the absolute maximum value of 'data'."""
    return data/np.amax(np.absolute(data))
        
        
def angle(data):
    """Maps the values of data on to the the interval [1, -1]"""
    copy = np.zeros(data.shape)
    for i in range(len(data)):
        if data[i] <= np.pi:
            copy[i] = (np.pi - data[i])/np.pi
        else:
            copy[i] = -(data[i] - np.pi)/np.pi
    return copy

In [46]:

def plotData(Z, extForce, energy, appliedForce):
        """ This function plots the results in time-evolution, and phase-space plots.
        It differ between two functionalities, given by the boolean 'appliedForce'.
        Arguments:
                Z              4xN array. The state vector component-values at each timestep
                extForce       2xN array. The value of external force and noise at each timestep
                energy         2xN array. The kinetic and potential energy at each timestep
                appliedForce   boolean.
        """
        # Defining an array of the time in seconds at each step
        time = np.arange(0, n*timeStep, timeStep)
        totalF = extForce[0] + extForce[1]
        
        # Plots time evolution of the external force and every state-vector-component if true
        if appliedForce:
            plt.figure(figsize=(12, 6), dpi=200)
            plt.subplot(2, 2, (1, 3))
            plt.title("Time evolution of general coordinates")
            plt.plot(time, np.zeros(n), 'b--',label='Setpoint', linewidth=1)
            plt.plot(time, [setpoints[0]/np.amax(np.absolute(Z[0])) for i in range(n)], 
                     '--', color="black", linewidth=1, label=r"Setpoint for $x$")
            plt.plot(time, relative(Z[0]), label=r'$x$')
            plt.plot(time, relative(Z[1]), label=r"$\theta$")
            plt.plot(time, relative(Z[2]), label=r"$\dot{x}$")
            plt.plot(time, relative(Z[3]), label=r"$\dot{\theta}$")
            plt.plot(time, relative(extForce[0]), label=r"$u$")
            plt.xlabel('time, sec')
            plt.legend(loc='best')
            plt.grid(linestyle=':')
        # Plots time evolution of the cart position and rod angle only if appliedForce == false.
        else:
            plt.figure(figsize=(12, 6), dpi=200)
            plt.subplot(2, 2, (1, 3))
            plt.title("Time evolution of general coordinates")
            plt.plot(time, relative(Z[0]), label=r'$x$')
            plt.plot(time, angle(Z[1]), label=r"$\theta$")
            plt.xlabel('time, A.U')
            plt.legend(loc='best')
            plt.grid(linestyle=':')
        
        # Independent of appliedForce.
        # Generate phase-space plot of the coordinate theta.
        plt.subplot(222)
        plt.title("Rod angle phase-space plot")
        plt.plot(Z[1], Z[3])
        plt.plot(Z[1][-1], Z[3][-1], 'o', label="Final point")
        plt.plot(Z[1][0], Z[3][0], '*', label="Initial point")
        plt.xlabel(r'$\theta$ [rad]')
        plt.ylabel(r'$\dot{\theta}$ [rad/s]')
        plt.legend(loc='best')
        plt.grid(linestyle=":")
        
        # Generates phase-space plot of the coordinate x.
        plt.subplot(224)
        plt.title("Cart position phase-space plot")
        plt.plot(Z[0], Z[2])
        plt.plot(Z[0][-1], Z[2][-1], 'o', label="Final point")
        plt.plot(Z[0][0], Z[2][0], '*', label="Initial point")
        plt.xlabel(r'$x$ [m]')
        plt.ylabel(r'$\dot{x}$ [m/s]')
        plt.legend(loc='best')
        plt.grid(linestyle=":")
        
        # Adjusts horisontal seperation between subplots.
        plt.subplots_adjust(hspace=0.4)
        
        if appliedForce == False:
            # Plots the loss of total energy as a function of time
            plt.figure(figsize=(12, 6), dpi=200)
            plt.subplot(121)
            plt.title("Loss of total energy due to friction")
            # Minimum total energy of the system
            minima = m*g*l*np.cos(np.pi)
            # The y-axis displays enrgy percentage relative to initial total energy
            plt.plot(time, (np.sum(energy, 0) - minima)/(np.sum(energy[:,0], 0) - minima)*100)
            plt.xlabel("time, sec")
            plt.ylabel('Percentage of initial total energy, %')
            plt.grid(linestyle=":")
            
            # Plots the kinetic and potential energy of the system as a function of time.
            plt.subplot(122)
            plt.title("Kinetic and potential energy of the system")
            plt.xlabel("time, sec")
            plt.ylabel("energy, J")
            plt.plot(time, energy[0], label="T")
            plt.plot(time, energy[1], label="U")
            plt.legend(loc='best')
            plt.grid(linestyle=":")

        plt.show()

Appendix C - Definition of `generateAnimation` and¶

`generateAnimationWithPhaseSpace`¶

In [3]:

def cartX(x_1):
    """ Given the value of x_1, i.e. the position
    of the cart, returns the x-coordinates of the
    cart's corners."""
    return np.array([x_1 - l/8, x_1 + l/8, x_1 + l/8, x_1 - l/8, x_1 - l/8])

In [41]:

def generateAnimation(Z, Z2, name):
    """ This function generates two animations of the system as a function of time,
    from the two results, 'Z' and 'Z2'. The two animations are plottet underneath each other.
    Arguments:
            Z      4xN array. State vector component-values at each timestep
            Z2     4xN array. State vector component-values at each timestep
            name   string. Name of the animation file.
    """
    
    # Defining an array of the time in seconds at each step
    timeValues = np.arange(0, n*timeStep, timeStep)
    
    X = Z[0] + l*np.sin(Z[1])                           # x-coordinates of bead
    Y = l*np.cos(Z[1])                                  # y-coordinates of bead
    cartY = np.array([l/16, l/16, -l/16, -l/16, l/16])  # y-coordinates of cart corners
    
    X2 = Z2[0] + l*np.sin(Z2[1])                           # x-coordinates of bead 2
    Y2 = l*np.cos(Z2[1])                                   # y-coordinates of bead 2
    cartY2 = np.array([l/16, l/16, -l/16, -l/16, l/16])    # y-coordinates of cart corners 2
    
    fig = plt.figure(figsize=(figSizeX, figSizeY), dpi=200)
    
    # Animation/result/pendulum number one
    plt.subplot(211)
    # Axis limitations
    xMin = np.min(X) - 1.5*l
    xMax = np.max(X) + 1.5*l
    # Adjust the y-limitations such that a circle looks like a circle.
    # i.e. no scaling.
    xDomain = xMax - xMin
    yDomain = xDomain * 0.5*figSizeY/figSizeX
    plt.xlim(xMin, xMax)
    plt.ylim(-0.5*yDomain, 0.5*yDomain)

    # Defining the different elements in the animation
    surface, = plt.plot([xMin, xMax], [-l/16, -l/16], color='black', linewidth=1)    # The surface
    tail, = plt.plot(X[0], Y[0], '--', color="blue")            # Previous position of the pendulum bead
    cart, = plt.plot(cartX(Z[0, 0]), cartY, color="red")        # The cart 
    rod, = plt.plot([Z[0, 0], X[0]], [0, Y[0]], color="black")  # The massless pendulum rod of length l
    bead, = plt.plot(X[0], Y[0], 'o', color="black", ms=4)      # The pendulum bead
    text = plt.text(xMax-0.1, 0.5*yDomain - 0.1, r'$t =  %.2f$s'%(timeValues[0]), # Text box with elapsed time
            {'color': 'k', 'fontsize': 10, 'ha': 'center', 'va': 'center',
            'bbox': dict(boxstyle='round', fc='w', ec='k', pad=0.2)})

    plt.xlabel('x, m')
    plt.ylabel('y, m')
    plt.title("Proportional Gain controller")
    
    # Animation/results/pendulum number two
    plt.subplot(212)
    # Axis limitations
    xMin2 = np.min(X2) - 3*l 
    xMax2 = np.max(X2) + 3*l 
    xDomain2 = xMax2 - xMin2
    yDomain2 = xDomain2 * 0.5*figSizeY/figSizeX
    plt.xlim(xMin2, xMax2)
    plt.ylim(-0.5*yDomain2, 0.5*yDomain2)

    # Defining the different elements in the animation
    surface2, = plt.plot([xMin2, xMax2], [-l/16, -l/16], color='black', linewidth=1)    # The surface
    tail2, = plt.plot(X2[0], Y2[0], '--', color="blue")              # Previous position of the pendulum bead
    cart2, = plt.plot(cartX(Z2[0, 0]), cartY2, color="red")         # The cart 
    rod2, = plt.plot([Z2[0, 0], X2[0]], [0, Y2[0]], color="black")   # The massless pendulum rod of length l
    bead2, = plt.plot(X2[0], Y2[0], 'o', color="black", ms=4)        # The pendulum bead
    text2 = plt.text(xMax2-0.1, 0.5*yDomain2 - 0.1, r'$t =  %.2f$s'%(timeValues[0]), # Text box with elapsed time
            {'color': 'k', 'fontsize': 10, 'ha': 'center', 'va': 'center',
            'bbox': dict(boxstyle='round', fc='w', ec='k', pad=0.2)})

    plt.xlabel('x, m')
    plt.ylabel('y, m')
    plt.title("Linear Quadratic Regulator")
    
    plt.subplots_adjust(hspace=0.3)
    
    # Calculate the number of frames
    FPS = 30
    framesNum = int(FPS*duration)
    dataPointsPerFrame = int(n/framesNum)

    # Animation function. This is called sequentially
    def animate(j):
        time = j*dataPointsPerFrame
        # Pendulum 1
        tail.set_xdata(X[:time])
        tail.set_ydata(Y[:time])
        cart.set_xdata(cartX(Z[0, time])) 
        cart.set_ydata(cartY)
        rod.set_xdata([Z[0, time], X[time]])
        rod.set_ydata([0, Y[time]])
        bead.set_xdata(X[time])
        bead.set_ydata(Y[time])
        text.set_text(r'$t =  %.2f$s'%(timeValues[time]))
        
        # Pendulum 2
        tail2.set_xdata(X2[:time])
        tail2.set_ydata(Y2[:time])
        cart2.set_xdata(cartX(Z2[0, time])) 
        cart2.set_ydata(cartY2)
        rod2.set_xdata([Z2[0, time], X2[time]])
        rod2.set_ydata([0, Y2[time]])
        bead2.set_xdata(X2[time])
        bead2.set_ydata(Y2[time])
        text2.set_text(r'$t =  %.2f$s'%(timeValues[time]))

    # Create animation
    anim = animation.FuncAnimation(fig, animate, frames=framesNum)

    # Save animation.
    # If this don't work for you, try using another writer 
    # (ffmpeg, mencoder, imagemagick), or another file extension
    # (.mp4, .gif, .ogg, .ogv, .avi etc.). Make sure that you
    # have the codec and the writer installed on your system.
    anim.save(name, writer='pillow', fps=FPS)

    # Close plot
    plt.close(anim._fig)

    # Display the animation
    with open(name, 'rb') as file:
        display(Image(file.read()))

In [30]:

def generateAnimationWithPhaseSpace(Z, name):
    """ This function generates an animation consisting of a plot of the pendulum
    and phase-space plot of the two generalised coordinates x and theta, as a
    functino of time.
    Arguments:
            Z      4xN array. State vector component-values at each timestep
            name   string. Name of the animation file.
    """
    
    X = Z[0] + l*np.sin(Z[1])                           # x-coordinates of bead
    Y = l*np.cos(Z[1])                                  # y-coordinates of bead
    cartY = np.array([l/16, l/16, -l/16, -l/16, l/16])  # y-coordinates of cart corners

    fig = plt.figure(figsize=(figSizeX, figSizeY), dpi=200)
    # Axis limitations
    plt.subplot(2, 2, (1, 3))
    xMin = np.min(X) - l/2 
    xMax = np.max(X) + l/2
    # Adjust the y-limitations such that a circle looks like a circle.
    # i.e. no scaling.
    xDomain = xMax - xMin
    yDomain = xDomain*2*figSizeY/figSizeX
    plt.xlim(xMin, xMax)
    plt.ylim(-0.5*yDomain, 0.5*yDomain)

    # Defining the different elements in the animation
    surface, = plt.plot([xMin, xMax], [-l/16, -l/16], color='black', linewidth=1)    # The surface
    tail, = plt.plot(X[0], Y[0], '--', color="blue")            # Previous position of the pendulum bead
    cart, = plt.plot(cartX(Z[0, 0]), cartY, color="red")        # The cart 
    rod, = plt.plot([Z[0, 0], X[0]], [0, Y[0]], color="black")  # The massless pendulum rod of length l
    bead, = plt.plot(X[0], Y[0], 'o', color="black", ms=4)      # The pendulum bead

    plt.xlabel('x')
    plt.ylabel('y')
    plt.title("Inverted pendulum on cart")
    
    # Phase-space plot of the rod angle theta
    plt.subplot(222)
    plt.title("Rod angle in phase-space")
    anglePath, = plt.plot(Z[1], Z[3])
    angleFinal, = plt.plot(Z[1, -1], Z[3, -1], 'o', label="Final point")
    angleCurrent, = plt.plot(Z[1, 0], Z[3, 0], '*', label="Current point")
    plt.xlabel(r'$\theta$ [rad]')
    plt.ylabel(r'$\dot{\theta}$ [rad/s]')
    plt.legend(loc='best')
    plt.grid(linestyle=":")
    
    # Phase-space plot of the cart position x.
    plt.subplot(224)
    plt.title("Cart position in phase-space")
    cartPath, = plt.plot(Z[0], Z[2])
    cartFinal, = plt.plot(Z[0, -1], Z[2, -1], 'o', label="Final point")
    cartCurrent, = plt.plot(Z[0, 0], Z[2, 0], '*', label="Current point")
    plt.xlabel(r'$x$ [m]')
    plt.ylabel(r'$\dot{x}$ [m/s]')
    plt.legend(loc='best')
    plt.grid(linestyle=":")    
    
    # Calculate the number of frames
    FPS = 30
    framesNum = int(FPS*duration)
    dataPointsPerFrame = int(n/framesNum)

    # Animation function. This is called sequentially
    def animate(j):
        time = j*dataPointsPerFrame
        
        # Updating each element every time 
        tail.set_xdata(X[:time])
        tail.set_ydata(Y[:time])
        cart.set_xdata(cartX(Z[0, time])) 
        cart.set_ydata(cartY)
        rod.set_xdata([Z[0, time], X[time]])
        rod.set_ydata([0, Y[time]])
        bead.set_xdata(X[time])
        bead.set_ydata(Y[time])
        
        angleCurrent.set_xdata(Z[1, time])
        angleCurrent.set_ydata(Z[3, time])
        anglePath.set_xdata(Z[1, :time])
        anglePath.set_ydata(Z[3, :time])
        cartCurrent.set_xdata(Z[0, time])
        cartCurrent.set_ydata(Z[2, time])
        cartPath.set_xdata(Z[0, :time])
        cartPath.set_ydata(Z[2, :time])

    # Create animation
    anim = animation.FuncAnimation(fig, animate, frames=framesNum)

    # Save animation.
    # If this don't work for you, try using another writer 
    # (ffmpeg, mencoder, imagemagick), or another file extension
    # (.mp4, .gif, .ogg, .ogv, .avi etc.). Make sure that you
    # have the codec and the writer installed on your system.
    anim.save(name, writer='pillow', fps=FPS)

    # Close plot
    plt.close(anim._fig)

    # Display the animation
    with open(name, 'rb') as file:
        display(Image(file.read()))

In [ ]: