Delay differential equations in Python

I wrote a very simple and user-friendly method, which I called ddeint, to solve delay differential equations (DDEs) in Python, using the ODE-solving capabilities of the Scipy package. As usual, the code is available at the end of the post :).

Example 1 : Sine

I start with an example whose exact solution is known so that I can check that the algorithm works as expected. We consider the following equation:

y(t) = sin(t) \ \ \ for\ \ \ t < 0

y'(t) = y(t-3\pi/2) \ \ \ for\ \ \ t \geq 0

The trick here is that sin(t-3\pi/2) = cos(t) = sin'(t), so the exact solution of this equation is the sine function itself.

from pylab import *

model = lambda Y,t : Y(t - 3*pi/2) # Model
tt = linspace(0,50,10000) # Time start, end, and number of points/steps
g=sin # Expression of Y(t) before the integration interval 
yy = ddeint(model,g,tt) # Solving

# PLOTTING
fig,ax=subplots(1)
ax.plot(tt,yy,c='r',label="$y(t)$")
ax.plot(tt,sin(tt),c='b',label="$sin(t)$")
ax.set_ylim(ymax=2) # make room for the legend
ax.legend()
show()

[Figure: the computed solution y(t) (red) plotted against the exact solution sin(t) (blue)]

The resulting plot compares our solution (red) with the exact solution (blue). See how our result eventually drifts away from the actual solution, as a consequence of many successive approximations? As DDEs tend to create chaotic behaviors, you can expect the error to grow very fast. As I am no DDE expert, I would recommend checking for convergence in all cases, i.e. increasing the time resolution and seeing how it affects the result. Keep in mind that the past values of Y(t) are computed by interpolating the values of Y found at the previous integration points, so the more points you ask for, the more precise your result.
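
If you want a rough error estimate, a simple home-made convergence check is to solve the same problem on a grid twice as fine and compare the two solutions at the common time points. Here is a minimal sketch for the sine example, assuming ddeint (defined at the end of the post) is in scope:

from pylab import *

# Re-solve Example 1 on two nested grids and compare the results.
model = lambda Y,t : Y(t - 3*pi/2)
tt1 = linspace(0, 50, 1001)  # coarse grid
tt2 = linspace(0, 50, 2001)  # twice the resolution: tt2[::2] equals tt1
yy1 = ddeint(model, sin, tt1)
yy2 = ddeint(model, sin, tt2)
print max( abs(float(a) - float(b)) for a,b in zip(yy2[::2], yy1) )
# a crude error estimate, which should shrink as the resolution increases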

Example 2 : Delayed negative feedback

You can pass the parameters of your model at integration time, as with Scipy's ode and odeint. As an example, imagine a product with degradation rate r, whose production rate is negatively linked to the quantity of this same product at time (t-d):

y(t) = 0 \ \ \ for\ \ \ t < 0

y'(t) = \dfrac{1}{1+(\dfrac{y(t-d)}{K})^2} -ry(t) \ \ \ for\ \ \ t \geq 0

We have three parameters that we can choose freely. For K = 0.1, d = 5, r = 1, we obtain oscillations!

from pylab import *

# MODEL, WITH UNKNOWN PARAMETERS
model = lambda Y,t,k,d,r :  1/(1+(Y(t-d)/k)**2) - r*Y(t)

# HISTORY
g = lambda t:0 

# SOLVING
tt = linspace(0,50,10000)
yy = ddeint(model,g,tt,fargs=( 0.1 , 5 , 1 )) # K = 0.1, d = 5, r = 1

# PLOTTING
fig,ax=subplots(1)
ax.plot(tt,yy,lw=2)
show()

[Figure: oscillations of the delayed negative feedback model]

Example 3 : Lotka-Volterra system with delay

The variable Y can be a vector, which means that you can solve systems of several DDEs. Here is a version of the famous two-variable Lotka-Volterra system, in which we introduce some delay d. For d=0 we find the solution of a classical Lotka-Volterra system, and for nonzero d, the system undergoes a significant amplification:

\big(x(t), y(t)\big) = (1,2) \ \ \ for\ \ \ t < 0

x'(t) = 0.5x(t)\big(1-y(t-d)\big), \ \ \ y'(t) = -0.5y(t)\big(1-x(t-d)\big) \ \ \ for\ \ \ t \geq 0

from pylab import *

def model(Y,t,d):
    x,y = Y(t)
    xd,yd = Y(t-d)
    return array([0.5*x*(1-yd), -0.5*y*(1-xd)])

g = lambda t : array([1,2])
tt = linspace(2,30,20000)
fig,ax=subplots(1)

for d in [0, 0.2]:
    yy = ddeint(model,g,tt,fargs=(d,))
    # WE PLOT X AGAINST Y
    ax.plot(yy[:,0],yy[:,1],lw=2,label='delay = %.01f'%d)

ax.legend()
show()

[Figure: phase plot of y against x for the delayed Lotka-Volterra system, for delays 0 and 0.2]

Example 4 : A DDE with varying delay

This time the delay depends on the value of Y(t)!

y(t) = 1,\ \ \ t \leq 0

y'(t) = - y\big(t-3\cos(y(t))^2 \big),\ \ \ t > 0

from pylab import *
model = lambda Y,t:  -Y(t-3*cos(Y(t))**2)
tt = linspace(0,30,2000)
yy = ddeint(model, lambda t:1, tt)
fig,ax=subplots(1)
ax.plot(tt,yy,lw=2)
show()

[Figure: solution of the DDE with state-dependent delay]

Code

Explanations

The code is written on top of Scipy's 'ode' (ordinary differential equation) class, which accepts differential equations given in the form

model(Y,t) = "expression of Y'(t)"

where Y and the output Y' must be Numpy arrays (i.e. vectors).

For our needs, the input Y must instead be a function of time, more precisely a function that can compute Y(t) at any past or present time t using the values of Y already computed. We also need Y(t) to return the value of some function g(t) if t is less than some time tc that marks the start of the integration.

To this end, I first implemented a class (ddeVar) of variables/functions which can be called at any time t: for t < tc, it will return the value of g(t), and for t > tc, it will look for two already-computed values Y_a and Y_b at times t_a < t < t_b, from which it will deduce Y(t) using a linear interpolation. Scipy offers many other kinds of interpolation, but these would be slower and wouldn't support vectors for Y.

Such a variable needs to be updated every time a new value of Y(t) is computed, so I created a class 'dde' that inherits from Scipy's 'ode' class but overwrites its integration method, so that our special function Y is updated after each integration step. Since 'ode' would feed the model with a vector Y (a Numpy array, to be precise), which we don't want, we give the integrator an interface function that takes a Numpy array Y as an argument but immediately discards it and calls the model with our special ddeVar variable Y instead (I hope that was clear 🙂 ).

OK, here you go for the code

You will find the code and all the examples as an IPython notebook HERE (if you are a scientific pythonist and you don't know about the IPython notebook, you are really missing something!). Just change the extension to .ipynb to be able to open it. In case you just came for the code:

# REQUIRES PACKAGES Numpy AND Scipy INSTALLED
import numpy as np
import scipy.integrate
import scipy.interpolate
 
class ddeVar:
    """ special function-like variables for the integration of DDEs """
    
    
    def __init__(self,g,tc=0):
        """ g(t) = expression of Y(t) for t<tc """
        
        self.g = g
        self.tc= tc
        # We must fill the interpolator with 2 points minimum
        self.itpr = scipy.interpolate.interp1d(
            np.array([tc-1,tc]), # X
            np.array([self.g(tc),self.g(tc)]).T, # Y
            kind='linear', bounds_error=False,
            fill_value = self.g(tc))
            
            
    def update(self,t,Y):
        """ Add one new (ti,yi) to the interpolator """
        
        self.itpr.x = np.hstack([self.itpr.x, [t]])
        Y2 = Y if (Y.size==1) else np.array([Y]).T
        self.itpr.y = np.hstack([self.itpr.y, Y2])
        self.itpr.fill_value = Y
        
        
    def __call__(self,t=0):
        """ Y(t) will return the instance's value at time t """
        
        return (self.g(t) if (t<=self.tc) else self.itpr(t))
 


class dde(scipy.integrate.ode):
    """ Overwrites a few functions of scipy.integrate.ode"""
    
    
    def __init__(self,f,jac=None):

        def f2(t,y,args):
            return f(self.Y,t,*args)
        scipy.integrate.ode.__init__(self,f2,jac)
        self.set_f_params(None)
        
 
    def integrate(self, t, step=0, relax=0):

        scipy.integrate.ode.integrate(self,t,step,relax)
        self.Y.update(self.t,self.y)
        return self.y
        
 
    def set_initial_value(self,Y):

        self.Y = Y #!!! Y will be modified during integration
        scipy.integrate.ode.set_initial_value(self, Y(Y.tc), Y.tc)
 
 
 
def ddeint(func,g,tt,fargs=None):
    """ similar to scipy.integrate.odeint. Solves the DDE system
        defined by func at the times tt with 'history function' g
        and potential additional arguments for the model, fargs
    """
    
    dde_ = dde(func)
    dde_.set_initial_value(ddeVar(g,tt[0]))
    dde_.set_f_params(fargs if fargs else [])
    return np.array([g(tt[0])]+[dde_.integrate(dde_.t + dt)
                                 for dt in np.diff(tt)])

Other implementations

If you need a faster or more reliable implementation, have a look at the packages pyDDE and pydelay, which both seem very serious but are less friendly in their syntax.


An analytical solution to a Jewish problem

Maybe you have heard about the article by Khovanova and Radul, Jewish Problems, in which they collect tricky mathematical problems that were allegedly designed to prevent Jews (to whom they were specifically given) from passing the oral entrance exams of Moscow State University. With such a story and such a naïvely equivocal title, you can count on all the comments-section jewishologists of the world to promote your article 🙂

What I found interesting about this article is that, of the 21 problems given, I could answer most of the analytical ones, while I was completely unequipped to solve the geometrical problems. Has geometry always been that hard? My impression is that, at least here in France, its teaching is slowly being replaced by linear algebra. Maybe it is simply less needed today. Take civil engineering: a beam used to be some kind of parallelepiped, now it is a grid of finite elements!

As an illustration, let us have a look at the fifth problem: solve the equation

\sin^7(x) + \dfrac{1}{\sin^3x} = \cos^7(x) + \dfrac{1}{\cos^3x}.

What kind of math problem do you see here? (let's just hope it has nothing to do with geometry!) The authors of the article give a solution based on some transformation of the equations and a few trigonometricks (at some point they use the variable t=sin(x)cos(x) and retransform it into sin(2x)/2). What I first saw when I read the question (and I bet that's what you saw too if you have my kind of training) was an analytical problem involving the function

f(t) = t^7 + \dfrac{1}{t^3}

for which we are asked to find all the (u,v) which are the sine and cosine of a same angle and verify f(u)=f(v)! An obvious situation where this works is u=v, which leads us to the solutions x = \pi/4 and x = -3\pi/4, the only angles (modulo 2\pi) whose sine is equal to their cosine. Now, are there any other solutions? In the end, what is asked is whether the function f is injective, or not, or just injective enough! Let us compute its derivative:

f'(t) = 7t^6 - 3 \dfrac{1}{t^4} = \dfrac{7t^{10}-3}{t^4}

This is only positive outside the interval [-\sqrt[10]{\frac{3}{7}} , +\sqrt[10]{\frac{3}{7}}]. We can now sketch the function:
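
Here is a quick pylab sketch of f, in the style of the scripts elsewhere on this blog, if you want to visualize it (the plotting bounds below are my own choice):

from pylab import *

# f(t) = t**7 + 1/t**3 is odd, with a pole at t = 0. The dashed lines mark
# +/- (3/7)**(1/10), the borders of the region where f is decreasing.
f = lambda t: t**7 + 1/t**3
for tt in ( linspace(-1.5, -0.5, 500), linspace(0.5, 1.5, 500) ):
    plot(tt, f(tt), 'b')
axvline(+(3.0/7)**0.1, ls='--')
axvline(-(3.0/7)**0.1, ls='--')
show()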

Since u and v are a sine and a cosine, they belong to the interval [-1,1]. In red I have represented the two non-injectivity zones of the function on this interval: if u and v are two different numbers of [-1,1] verifying f(u)=f(v), then both of them must belong to one of these red zones. Now, notice that
f(\sqrt{\frac{1}{2}}) = (\sqrt{\frac{1}{2}})^7 + \sqrt{2}^3 > \sqrt{2}^3 = 2\sqrt{2} > 2
This shows that \sqrt{\frac{1}{2}} lies to the left of the right-hand red zone (see figure). A consequence is that, if u and v belong to the right-hand red zone, then

u^2 + v^2 > \sqrt{\frac{1}{2}}^2 + \sqrt{\frac{1}{2}}^2 = 1

so u and v cannot be the sine and the cosine of the same angle (or their squares would sum to 1). Since f is odd, the same can be said about the left-hand red zone. As a conclusion, it is not possible that f(u)=f(v) if u and v are the sine and cosine of a same angle and have different values. So x = \pi/4 and x = -3\pi/4 are the only two solutions of the given equation (modulo 2\pi). Did anyone see another solution?
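
For the skeptics, here is a quick numerical check (a sketch in the pylab style used elsewhere on this blog). Multiplying both sides of the equation by \sin^3(x)\cos^3(x) removes the singularities without adding solutions (the function h below does not vanish at the multiples of \pi/2), so the solutions of the original equation are exactly the zeros of h:

from pylab import *

# h(x) = (sin(x)**7 + 1/sin(x)**3 - cos(x)**7 - 1/cos(x)**3) * sin(x)**3 * cos(x)**3
h = lambda x: ( sin(x)**10 * cos(x)**3 + cos(x)**3
              - cos(x)**10 * sin(x)**3 - sin(x)**3 )
xx = linspace(0, 2*pi, 100000)
hh = h(xx)
print xx[:-1][ hh[:-1] * hh[1:] < 0 ] / pi
# -> approximately [0.25, 1.25]: pi/4 and 5*pi/4, i.e. -3*pi/4 modulo 2*pi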

Counting bacteria : confidence intervals from one measurement

Here are two biologically-inspired statistics problems with very simple solutions. They show that, when you have information about the way the observations are collected, one observation may be enough to give a confidence interval.

Question 1

From a bacterial solution I sampled a small volume v in which I found no bacteria. Give an upper bound (with confidence 95%) for the concentration of bacteria in the solution.

Question 2

From a bacterial solution I sampled a small volume v in which I found n bacteria (n > 20). Give a 95% confidence interval for the concentration of bacteria in the solution.

Solutions

Short answers: for the first question, c < 3/v, and for the second, c \in \left[ \dfrac{n - 2\sqrt{n}}{v} ,\ \dfrac{n + 2\sqrt{n}}{v} \right].

Yep, it's that simple! See the solutions.
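
For the curious, here is a sketch of how to check these numbers with Scipy. The count is Poisson-distributed with mean cv, so e^{-cv} = 0.05 gives the first bound, and the exact (Garwood) Poisson interval, obtained below from chi-square quantiles, is indeed close to the n \pm 2\sqrt{n} approximation:

import numpy as np
from scipy.stats import chi2

# Question 1: P(zero bacteria in v) = exp(-c*v) = 0.05  =>  c*v = -log(0.05)
print -np.log(0.05)                 # -> 2.9957..., hence c < 3/v

# Question 2, with for instance n = 25 bacteria counted:
n = 25
print chi2.ppf(0.025, 2*n) / 2      # exact lower bound for c*v, about 16.2
print chi2.ppf(0.975, 2*(n+1)) / 2  # exact upper bound for c*v, about 36.9
# ... to be compared with n -/+ 2*sqrt(n) = 15 and 35 (divide by v to get c).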

A GUI for the exploration of functions with Python / Matplotlib

Sliders can be a great tool for lazy engineers and modelers who want to play around with the parameters of a program without recompiling it a thousand times.
Most scientific languages (python-pylab, matlab, scilab…) support some basic slider methods, but building a GUI can still take quite some time. In this post I present a light, user-friendly function I implemented for my everyday work.

Example with a numerical function

def volume(x,y,z):
	""" Volume of a box with width x, height y, and depth z """
	return x*y*z

intervals = [ { 'label' :  'width',  'valmin': 1 , 'valmax': 5 },
              { 'label' :  'height',  'valmin': 1 , 'valmax': 5 },
              { 'label' :  'depth',  'valmin': 1 , 'valmax': 5 } ] 

inputExplorer(volume,intervals)

The new value of volume is automatically computed each time you move the sliders. It is also computed when you press Enter (which can be useful to get several values when your function is non-deterministic). In case you are studying a function that takes long to evaluate, you can decide that it will only be evaluated when Enter is pressed, by adding wait_for_validation = True to the arguments list.
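
For instance, with the volume example above:

inputExplorer(volume, intervals, wait_for_validation = True)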

Example with a graphical function

You can also use a function that doesn't return any value but updates a plot instead. Let us illustrate this with the Lotka-Volterra model, in which wolves eat rabbits and die, leading to periodical fluctuations of both populations. From Wikipedia:

\frac{dx}{dt} = x(\alpha - \beta y)
\frac{dy}{dt} = - y(\gamma - \delta  x)

This can be simulated easily in Python with Scipy's function odeint:

import numpy as np
from scipy.integrate import odeint

# model

def model(state,t, a,b,c,d):
	x,y = state
	return [ x*(a-b*y) , -y*(c - d*x) ] # dx/dt, dy/dt

# vector of times

ts = np.linspace(0,10,500) # 500 time points between 0 and 10

# initial conditions

x0 = 0.5
y0 = 1

# parameters

a = 1
b = 2
c = 1
d = 3

# simulate and return the values at each time t in t_vec

result = odeint(model, [x0,y0], ts, args = (a,b,c,d) ) 

# plot

import matplotlib.pyplot as plt
plt.plot(ts,result)
plt.show()

However, trying several values of x0, y0, a, b, c, d by hand can be tedious. So let us wrap the solving and the plotting into one function and feed it to inputExplorer. Let me rewrite the whole code from scratch:

import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import odeint

def model(state,t, a,b,c,d):
	x,y = state
	return [ x*(a-b*y) , -y*(c - d*x) ]

ts = np.linspace(0,10,500)

fig,ax = plt.subplots(1)

def plotDynamics(x0,y0,a,b,c,d):
	ax.clear()
	ax.plot(ts, odeint(model, [x0,y0], ts, args = (a,b,c,d)) )
	fig.canvas.draw()

sliders = [ { 'label' :  label,  'valmin': 1 , 'valmax': 5 }
		 for label in [ 'x0','y0','a','b','c','d' ] ]

inputExplorer(plotDynamics,sliders)

Code

Enjoy!

import matplotlib.pyplot as plt
from matplotlib.widgets import Slider, Button

def inputExplorer(f, sliders_properties, wait_for_validation = False):
    """ A light GUI to manually explore and tune the outputs of 
        a function.
        sliders_properties is a list of dicts (arguments for Slider)
        whose keys are in ( label, valmin, valmax, valinit=0.5, 
        valfmt='%1.2f', closedmin=True, closedmax=True, slidermin=None,
        slidermax=None, dragging=True)
        
        def volume(x,y,z):
            return x*y*z
    
        intervals = [ { 'label' :  'width',  'valmin': 1 , 'valmax': 5 },
                  { 'label' :  'height',  'valmin': 1 , 'valmax': 5 },
                  { 'label' :  'depth',  'valmin': 1 , 'valmax': 5 } ]
        inputExplorer(volume,intervals)
    """
        
    nVars = len(sliders_properties)
    slider_width = 1.0/nVars
    
    # CREATE THE CANVAS
    
    figure,ax = plt.subplots(1)
    figure.canvas.set_window_title( "Inputs for '%s'"%(f.func_name) )
    
    # choose an appropriate height
    
    width,height = figure.get_size_inches()
    height = min(0.5*nVars,8)
    figure.set_size_inches(width,height,forward = True)
    
    
    # hide the axis
    ax.set_frame_on(False)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    

    # CREATE THE SLIDERS
    
    sliders = []
    
    for i, properties in enumerate(sliders_properties):
        
        ax = plt.axes([0.1 , 0.95-0.9*(i+1)*slider_width,
                       0.8 , 0.8* slider_width])
        sliders.append( Slider(ax=ax, **properties) )
    
    
    # CREATE THE CALLBACK FUNCTIONS
    
    def on_changed(event) : 
        
        res = f(*(s.val for s in sliders))
        
        if res is not None:
            
            print res
    
    def on_key_press(event):
        
        if event.key == 'enter':
            
            on_changed(event)   
    
    figure.canvas.mpl_connect('key_press_event', on_key_press)
    
    # AUTOMATIC UPDATE ?
    
    if not wait_for_validation:
        
        for s in sliders :
            
            s.on_changed(on_changed)
    
    
    # DISPLAY THE SLIDERS
    
    plt.show()

Interception of a linear trajectory with constant speed

Today I have been asked about a very common interception problem, which was originally about pirates, but let us wrap it in a different story:

Alice just spotted a white rabbit rushing to its rabbit hole! Given the coordinates of the positions A, B, H of Alice, the bunny and the hole, as well as the respective speeds S_a and S_b of Alice and the rabbit, say whether Alice can catch the rabbit before it disappears, and give the time and place of the fastest possible interception!

I guess that I am not the first one to solve this problem; however, I couldn't find any simple solution on the internet. So here is one, proving how helpful trigonometry can be when it comes to catching rabbits. The nice thing about it is that, while it relies on trigonometry, it doesn't actually require computing any trigonometric function!

First, it is obvious that, since we are looking for the fastest catch, Alice's trajectory is a straight line. Let us call C and t_C the location and the time of the catch. We then have BC = S_b t_C and AC = S_a t_C. In our problem, finding the length BC would tell us whether Alice can catch the rabbit before it reaches the rabbit hole (case BC < BH), and would immediately lead to both the location and time of the interception:

C = B + \dfrac{BC}{BH}\overrightarrow{BH}

t_C = BC/S_b

To express BC using the coordinates of the points, let us apply the famous Law of Sines to the triangle ABC, where \alpha, \beta, \gamma denote the angles at the vertices A, B, C:

\dfrac{\sin \alpha}{BC} = \dfrac{\sin \beta}{AC} = \dfrac{\sin \gamma}{AB}

which leads to

BC = \dfrac {\sin \alpha}{\sin \gamma} AB = \dfrac {\sin \alpha}{\sin \gamma} \sqrt{(x_B-x_A)^2+(y_B-y_A)^2}

Now all we have to do is express \sin \alpha and \sin \gamma in terms of the given data. To do so, we will first compute \sin \beta, then express \sin \alpha using \sin \beta, and finally \sin \gamma using the first two sines. The value of \sin \beta can be computed from the points' coordinates as follows:

\sin \beta = \dfrac{det(\overrightarrow{BA},\overrightarrow{BH})}{ BA * BH } = \dfrac{(x_A - x_B)(y_H-y_B) - (y_A - y_B)(x_H-x_B)}{\sqrt{(x_B-x_A)^2+(y_B-y_A)^2} \sqrt{(x_B-x_H)^2+(y_B-y_H)^2}}

Then we use the Law of Sines again, to compute \sin \alpha :

\sin \alpha = \dfrac{BC}{AC} \sin \beta = \dfrac{S_b t_C}{S_a t_C} \sin \beta = \dfrac{S_b}{S_a} \sin \beta

This only makes sense, of course, if \frac{S_b}{S_a} | \sin \beta | \leq 1. If this is not the case, we can deduce that Alice will never catch the rabbit, which solves the problem.
Finally we use the fact that the angles of a triangle sum to \pi to compute \sin \gamma:

\sin \gamma = \sin (\pi - \alpha - \beta) = \sin (\alpha + \beta) = \sin \alpha \cos \beta + \cos \alpha \sin \beta

\sin \gamma = \sin \alpha \sqrt{1 - \sin^2 \beta} + \sin \beta \sqrt{1 - \sin^2 \alpha}

Here we are! Below is a script implementing this technique using Python's pylab module (but any other language would do!).


from pylab import *

def interception(A, B, H, Sa, Sb):
	''' Returns None if A cannot catch B before B reaches H.
		Otherwise it returns the time and position of the interception.
		See attached documentation for justifications. '''
	
	sin_beta = det(array((A-B,H-B))) / ( norm(A-B) * norm(H-B) )
	
	sin_alpha = (Sb / Sa) * sin_beta
	
	if abs(sin_alpha) > 1 :
		
		print "B moves too fast to be ever caught !"
		return None
	
	else:
		
		sin_gamma = ( sin_alpha * sqrt(1 - sin_beta**2)
					+ sin_beta * sqrt(1 - sin_alpha**2) )
		
		BC = norm(B-A) * (sin_alpha / sin_gamma) 
		
		if BC > norm(H-B):
		
			print "B reaches H before being caught by A !"
			return None
		
		else:
			
			t = BC / Sb
			C = B + BC * (H-B)/norm(H-B)
	
	print "A intercepted B !"
	return t,C



##### EXAMPLE

# Define the constants

A = array(( 1.0 , 5.0 ))
B = array(( 4.0 , 1.0 ))
H =  array(( 6.0 , 7.0 ))
Sa = 1.1
Sb = 1.0 

# Find the intersection

t,C = interception(A, B, H, Sa, Sb)

# Plot the results

scatter(*zip(A,B,H,C), s=100, color='r')

for label, point in zip(['A','B','H','C'], [A,B,H,C]):
    annotate( label, xy = point, xytext = (-10, 10),
        textcoords = 'offset points', fontsize = 24)
        
annotate("", xy=H, xytext=B, xycoords='data', textcoords='data',size=20,
         arrowprops=dict(arrowstyle="simple",connectionstyle="arc3"))

annotate("", xy=C, xytext=A, xycoords='data', textcoords='data',size=20,
         arrowprops=dict(arrowstyle="simple",connectionstyle="arc3"))

title("A intercepts B in C", fontsize = 24)

show()

And here is the result:

Symbolic matrix differentiation with Sympy !

After a few days spent computing strange Jacobians for my Ph.D. thesis, I figured out that my computer could actually do most of the computations for me: all I needed was an automatic matrix differentiator, i.e. an algorithm that would tell me how a function of matrices F(X,Y,...) varies when the matrices (X,Y,...) are changed into (X + \partial X, Y + \partial Y,...), where the (\partial X,\partial Y,...) are small. The very useful Matrix Cookbook gives for instance

\partial (X^T) = (\partial X)^T

\partial (XY) = (\partial X)Y + X (\partial Y)

\partial(X^{-1}) = - X^{-1}(\partial X) X^{-1}

… and so on. We can see that it is not completely unlike classic differentiation, but the non-commutativity slightly complicates things, and the computed formulae can be monstrous: imagine differentiating by hand the hat matrix H = X \left(X^\top \Sigma^{-1} X\right)^{-1} X^\top \Sigma^{-1} , where \Sigma is a constant 😯

Maybe I didn't look properly, but I didn't find any program automating the rules described above, so I went on to build something around Sympy, a young yet sophisticated symbolic mathematics module for the Python programming language, whose authors claim that it is easily extendable. I put that claim to the test, and indeed I found it pretty easy to work my way through.

In what follows I explain what I did, in tutorial-like fashion, with the hope that this can help you understand how one can tweak Sympy to meet one's own needs. But I am no Sympy expert, and I welcome any tips and comments in the comments section below! If you are in a hurry, you'll find the complete source code of the solution at the bottom of this post.

Ready? Code!

First, a matrix is a symbol, which is lucky because Sympy already has a Symbol class. You only have to specify that these symbols are not commutative:


def matrices(names):
    ''' Call with  A,B,C = matrices('A B C') '''
    return symbols(names,commutative=False)


Matrix-specific transformations

All the regular operations like addition, subtraction, multiplication… are already implemented in the Symbol class, so we can focus on matrix-specific transformations, like the differential operator \partial and the inverse function X \mapsto X^{-1}:

d = Function("d",commutative=False)
inv = Function("inv",commutative=False)

That was tough! Now, for the transpose function X \mapsto X^T, we want the general rules

(X+Y...)^T = X^T + Y^T ...
(XY...)^T = ...Y^TX^T

to apply automatically when transposing an expression. Here is the code I wrote for that. To understand it, you need to know that when I call some Sympy Function F on a bunch of arguments X,Y,Z, the result is an object of class "F" with a list of arguments (.args) equal to [X,Y,Z]. Moreover, successive additions or multiplications are flattened into one single object: for instance X^2 + 2*X*Y + 3*Z + 5 will be an object of class "Add" with arguments [X^2, 2*X*Y, 3*Z, 5], while X^2*Y*X*Z will be of class "Mul" with arguments [X^2, Y, X, Z].

class t(Function):
    ''' The transposition, with special rules
        t(A+B) = t(A) + t(B) and t(AB) = t(B)t(A) '''
    is_commutative = False
    def __new__(cls,arg):
        if arg.is_Add:
            return Add(*[t(A) for A in arg.args])
        elif arg.is_Mul:
            L = len(arg.args)
            return Mul(*[t(arg.args[L-i-1]) for i in range(L)])
        else:
            return Function.__new__(cls,arg)
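
A quick sanity check of these rules (a hypothetical console session, assuming the definitions above):

A,B,C = matrices("A B C")
print t(A*B + C)   # -> t(B)*t(A) + t(C), up to the ordering of the sum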

That's enough transformations for now; let us explain to Sympy how to differentiate a matrix expression!

Matrix differentiation

We write just enough differentiation rules from the Matrix Cookbook into one dictionary:

MATRIX_DIFF_RULES = {
        # e = expression, s = a list of symbols with respect to which
        # we want to differentiate

        Symbol : lambda e,s : d(e) if (e in s) else 0,
        Add :  lambda e,s : Add(*[matDiff(arg,s) for arg in e.args]),
        Mul :  lambda e,s : Mul(matDiff(e.args[0],s),Mul(*e.args[1:]))
                      +  Mul(e.args[0],matDiff(Mul(*e.args[1:]),s)) ,
        t :   lambda e,s : t( matDiff(e.args[0],s) ),
        inv : lambda e,s : - e * matDiff(e.args[0],s) * e
}

and apply them recursively to the expression we want to treat :

def matDiff(expr,symbols):
    if expr.__class__ in MATRIX_DIFF_RULES.keys():
        return  MATRIX_DIFF_RULES[expr.__class__](expr,symbols)
    else:
        return 0

Easy as pie, and we are done! Let's play around with our new toy:

X,S = matrices("X S")
H= X*inv(t(X)*inv(S)*X)*t(X)*inv(S)
print matDiff(H,X)
>>> X*(inv(t(X)*inv(S)*X)*t(d(X))*inv(S) - inv(t(X)*inv(S)*X)*(t(X)*inv(S)*d(X) + t(d(X))*inv(S)*X)*inv(t(X)*inv(S)*X)*t(X)*inv(S)) + d(X)*inv(t(X)*inv(S)*X)*t(X)*inv(S)

It works! But we must concede that it is barely readable… so let us put some style in these expressions!

Cosmetics

You don't need LaTeX to have nice-looking formulae. Here is what we want to tell Sympy:

  • Don’t write inv(X), write X¯¹
  • Don’t write t(X), write X’
  • Don’t write d(X), write ∂X
  • In a general way, don’t put parentheses if the transformation applies to one symbol only.

The programmers of Sympy have made it easy to customize the default printing method: you just write your own printer class with extra methods for our new functions. First, we need to include this line at the top of our source file, to declare that we are going to use non-ASCII characters:

# -*- coding: utf-8 -*-

Then we write our own printer class, which inherits from the default one but has a few more methods:

class matStrPrinter(StrPrinter):
    ''' Nice printing for console mode : X¯¹, X', ∂X '''

    def _print_inv(self, expr):
        if expr.args[0].is_Symbol:
            return self._print(expr.args[0]) + '¯¹'
        else:
            return '(' + self._print(expr.args[0]) + ')¯¹'

    def _print_t(self, expr):
        return self._print(expr.args[0]) + "'"

    def _print_d(self, expr):
        if expr.args[0].is_Symbol:
            return '∂' + self._print(expr.args[0])
        else:
            return '∂(' + self._print(expr.args[0]) + ')'

At this point, the people who wrote the Sympy documentation want us to replace the default printer outright, with this command:

Basic.__str__ = lambda self: matStrPrinter().doprint(self)

It indeed seems to be the only way to ensure that all the expressions in the recursion print well with the standard print method, but it is a little too definitive for me (what if I have several customized printers to use in the same script?). Moreover, I don't like the '*' between the matrices, and I didn't find any smart way to get rid of them. For all these reasons, I wrote my own matrix-printing command:

def matPrint(m):
    mem = Basic.__str__
    Basic.__str__ = lambda self: matStrPrinter().doprint(self)
    print str(m).replace('*','')
    Basic.__str__ = mem

Now let’s try again:

matPrint(  matDiff(H,X) )
>>> X((X'S¯¹X)¯¹∂X'S¯¹ - (X'S¯¹X)¯¹(X'S¯¹∂X + ∂X'S¯¹X)(X'S¯¹X)¯¹X'S¯¹) + ∂X(X'S¯¹X)¯¹X'S¯¹

Way better! Notice, however, that the first X is a factor of a sum of two terms, one of which is in turn a product involving a sum. We can break these nested expressions into a sum of products (which is simpler to analyze) by calling Sympy's expand function twice:

matPrint(  expand ( expand ( matDiff(H,X) ) ) )
>>> X(X'S¯¹X)¯¹∂X'S¯¹ + ∂X(X'S¯¹X)¯¹X'S¯¹ - X(X'S¯¹X)¯¹X'S¯¹∂X(X'S¯¹X)¯¹X'S¯¹ - X(X'S¯¹X)¯¹∂X'S¯¹X(X'S¯¹X)¯¹X'S¯¹

That's nice enough for the console! Now, what if I want to report this fundamental result to the scientific community? Let's see how to generate the LaTeX code to embed the formula in a document.

Automatic LaTeX code generation

One of the nice features of Sympy is its ability to generate LaTeX code (and even display compiled formulae). For matrix computations we would like Sympy to follow these rules:

  • Don’t write inv(X), write X^{-1}
  • Don’t write t(X), write X^T
  • Don’t write d(X), write \partial X
  • In a general way, don’t put parentheses if the transformation applies to one symbol only.

As in the previous section, we create a new printer class with these features, and then write a function inspired by Sympy's latex method (which is simply called latex()):

class matLatPrinter(LatexPrinter):
    ''' Printing instructions for latex : X^{-1},  X^T, \partial X '''

    def _print_inv(self, expr):
        if expr.args[0].is_Symbol:
            return self._print(expr.args[0]) + '^{-1}'
        else:
            return '(' + self._print(expr.args[0]) + ')^{-1}'

    def _print_t(self, expr):
        return self._print(expr.args[0]) + '^T'

    def _print_d(self, expr):
        if expr.args[0].is_Symbol:
            return '\partial ' + self._print(expr.args[0])
        else:
            return '\partial (' + self._print(expr.args[0]) + ')'

def matLatex(expr, profile=None, **kargs):
    if profile is not None:
        profile.update(kargs)
    else:
        profile = kargs
    return matLatPrinter(profile).doprint(expr)

We now try it on the derivative of H:

print matLatex( matDiff(H,X) )
>>> $X \left((X^T S^{-1} X)^{-1} \partial X^T S^{-1} - (X^T S^{-1} X)^{-1} \left(X^T S^{-1} \partial X + \partial X^T S^{-1} X\right) (X^T S^{-1} X)^{-1} X^T S^{-1}\right) + \partial X (X^T S^{-1} X)^{-1} X^T S^{-1}$

which once compiled yields
X \left((X^T S^{-1} X)^{-1} \partial X^T S^{-1} - (X^T S^{-1} X)^{-1} \left(X^T S^{-1} \partial X + \partial X^T S^{-1} X\right) (X^T S^{-1} X)^{-1} X^T S^{-1}\right) + \partial X (X^T S^{-1} X)^{-1} X^T S^{-1}

Yeah!

Source Code

I hope this helped. If you just came here to get a matrix differentiator, here is the code with the example. Have fun 🙂


# -*- coding: utf-8 -*-

#----------------------------------------------------------------------
#
# FUNCTIONS FOR THE AUTOMATIC DIFFERENTIATION  OF MATRICES WITH SYMPY
# 
#----------------------------------------------------------------------

from sympy import *
from sympy.printing.str import StrPrinter
from sympy.printing.latex import LatexPrinter



#####  M  E  T  H  O  D  S



def matrices(names):
    ''' Call with  A,B,C = matrices('A B C') '''
    return symbols(names,commutative=False)


# Transformations

d = Function("d",commutative=False)
inv = Function("inv",commutative=False)

class t(Function):
    ''' The transposition, with special rules
        t(A+B) = t(A) + t(B) and t(AB) = t(B)t(A) '''
    is_commutative = False
    def __new__(cls,arg):
        if arg.is_Add:
            return Add(*[t(A) for A in arg.args])
        elif arg.is_Mul:
            L = len(arg.args)
            return Mul(*[t(arg.args[L-i-1]) for i in range(L)])
        else:
            return Function.__new__(cls,arg)


# Differentiation

MATRIX_DIFF_RULES = {

        # e = expression, s = a list of symbols with respect to which
        # we want to differentiate

        Symbol : lambda e,s : d(e) if (e in s) else 0,
        Add :  lambda e,s : Add(*[matDiff(arg,s) for arg in e.args]),
        Mul :  lambda e,s : Mul(matDiff(e.args[0],s),Mul(*e.args[1:]))
                      +  Mul(e.args[0],matDiff(Mul(*e.args[1:]),s)) ,
        t :   lambda e,s : t( matDiff(e.args[0],s) ),
        inv : lambda e,s : - e * matDiff(e.args[0],s) * e
}

def matDiff(expr,symbols):
    if expr.__class__ in MATRIX_DIFF_RULES.keys():
        return  MATRIX_DIFF_RULES[expr.__class__](expr,symbols)
    else:
        return 0



#####  C  O  S  M  E  T  I  C  S


# Console mode

class matStrPrinter(StrPrinter):
    ''' Nice printing for console mode : X¯¹, X', ∂X '''

    def _print_inv(self, expr):
        if expr.args[0].is_Symbol:
            return self._print(expr.args[0]) + '¯¹'
        else:
            return '(' + self._print(expr.args[0]) + ')¯¹'

    def _print_t(self, expr):
        return self._print(expr.args[0]) + "'"

    def _print_d(self, expr):
        if expr.args[0].is_Symbol:
            return '∂' + self._print(expr.args[0])
        else:
            return '∂(' + self._print(expr.args[0]) + ')'

def matPrint(m):
    mem = Basic.__str__
    Basic.__str__ = lambda self: matStrPrinter().doprint(self)
    print str(m).replace('*','')
    Basic.__str__ = mem


# Latex mode

class matLatPrinter(LatexPrinter):
    ''' Printing instructions for latex : X^{-1},  X^T, \partial X '''

    def _print_inv(self, expr):
        if expr.args[0].is_Symbol:
            return self._print(expr.args[0]) + '^{-1}'
        else:
            return '(' + self._print(expr.args[0]) + ')^{-1}'

    def _print_t(self, expr):
        return self._print(expr.args[0]) + '^T'

    def _print_d(self, expr):
        if expr.args[0].is_Symbol:
            return '\partial ' + self._print(expr.args[0])
        else:
            return '\partial (' + self._print(expr.args[0]) + ')'

def matLatex(expr, profile=None, **kargs):
    if profile is not None:
        profile.update(kargs)
    else:
        profile = kargs
    return matLatPrinter(profile).doprint(expr)



#####    T  E  S  T  S


X,S = matrices("X S")
H= X*inv(t(X)*inv(S)*X)*t(X)*inv(S)

matPrint(  expand( expand( matDiff(H,X) ) ) )

print matLatex( matDiff(H,X) )

Incommensuration with Platonions

A set of incommensurate (or incommensurable) numbers is a collection of reals such that it is not possible to obtain one of them by multiplying the others by rational coefficients and summing the results. More formally, the family (I_i)_{i \leq N} is incommensurate if, for any set of integer coefficients (\lambda_i)_{i \leq N} such that

\sum_{i \leq N}\lambda_i\,I_i\,=0

it follows that the \lambda_i are all zero (N can be infinite). Such numbers are quite useful: for instance, if (\omega_1,\omega_2,...\omega_N) are incommensurate, then the vector V(t)=\big(\cos(\omega_1 t),\cos(\omega_2 t),...\cos(\omega_N t)\big) can come arbitrarily close to any value of [-1;1]^N, provided we choose the variable t appropriately. I recently found this trick used in Fourier amplitude sensitivity testing, where it makes it possible to collapse multiple integrals into single-dimensional ones.
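
As a quick illustration (a brute-force numerical check under the definition above, not a proof), we can verify that no small nonzero integer combination of \sqrt{2}, \sqrt{3} and \sqrt{5} comes close to vanishing:

import math, itertools

# Scan all integer coefficients in [-20,20]^3 and check that
# l1*sqrt(2) + l2*sqrt(3) + l3*sqrt(5) never vanishes.
roots = [math.sqrt(n) for n in (2,3,5)]
combos = itertools.product(range(-20,21), repeat=3)
print min( abs(sum( l*r for (l,r) in zip(lambdas,roots) ))
           for lambdas in combos if any(lambdas) )
# -> a strictly positive minimum, as incommensurability predicts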

Now that we have an idea of what we might need them for, how do we find incommensurate numbers? I have been surprised by how little one can find on this subject on the internet (links are welcome in the comments section!). It is clear that irrationals are good candidates, and among irrationals, square roots of integers seem likely to provide incommensurate families. But not all of them: for instance \sqrt{2} and \sqrt{8} are not incommensurable (since 2\sqrt{2}=\sqrt{8})! In a very pleasant and erudite text, Kevin Brown focuses on the numbers of the form \sqrt{N} where the integer N is not a multiple of the square of an integer bigger than 1:

P = \left\{ \sqrt{n}\ \big|\ n \in \mathbb{N},\ n \geq 1 \right\} \setminus \left\{ \sqrt{k\,a^2}\ \big|\ (k,a) \in \mathbb{N}^2,\ k \geq 1,\ a \geq 2 \right\}

The first elements of this set are

\sqrt{1},\sqrt{2},\sqrt{3},\sqrt{5},\sqrt{6},\sqrt{7},\sqrt{10},\sqrt{11},\sqrt{13},\sqrt{14},\sqrt{15}, \\ \sqrt{17},\sqrt{19},\sqrt{21},\sqrt{22},\sqrt{23},\sqrt{26},\sqrt{29},...

Any finite subset of P, Brown shows, is incommensurate. The reals that you can produce with elements of P are named Platonions (because of Plato and the quaternions… yep), but for the sake of simplicity I will call Platonions the elements of P themselves (which, in Brown's vocabulary, would be called a Platonic field basis…). These numbers provide a very cheap way of generating incommensurate sets: just take an array of the first N integers, discard the integers of the form k*a^2 (necessarily we will have 2\leq a\leq \sqrt{N} and 1\leq k \leq N/a^2), and return the square roots of the remaining values. Here is the Python code:

import math

def platonions_up_to(N):
    """ Returns numbers of the form sqrt(I) where I is not a
    multiple of a square. Such numbers are incommensurable."""
    squareMultiples = [k*(a**2) for a in range(2,int(math.sqrt(N))+2)
                                for k in range(1,1+int((N+1)/(a**2)))]
    return [math.sqrt(i) for i in range(1,N+1) if not i in squareMultiples]

Now what if we want exactly K incommensurate numbers? How big should N be so that at least K of the integers smaller than N yield Platonions? The good news is that Platonions are plentiful. More precisely, the fraction of integers smaller than N whose square root is a Platonion is pretty much constant:

import matplotlib.pyplot as plt
Nlist = range(50,5051,500) # N = 50,550,1050,1550... 5050
proportionslist = [1.0*len(platonions_up_to(N))/N for N in Nlist]
plt.plot(Nlist,proportionslist);plt.show()
[Figure: fraction of integers smaller than N (x-axis) whose square root is a Platonion]

We see that this fraction is always bigger than 0.60. To be sure to find at least K Platonions, we thus only need to consider the first K/0.60 integers. In Python:

def findKincommensurates(K):
    return platonions_up_to(int(K/0.60))[:K]
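
For example:

print findKincommensurates(5)
# -> [1.0, 1.4142..., 1.7320..., 2.2360..., 2.4494...],
#    the square roots of 1, 2, 3, 5 and 6.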

That's as far as I will go today! You can now reread the title and assess the progress you made in a topic that you had no idea existed a few minutes ago.