Delay differential equations in Python

I wrote a very simple and user-friendly method, that I called ddeint, to solve delay differential equations (DDEs) in Python, using the ODE solving capabilities of the Python package Scipy. As usual the code is available at the end of the post :).

Example 1 : Sine

I start with an example whose exact solution is known so that I can check that the algorithm works as expected. We consider the following equation:

y(t) = sin(t) \ \ \ for\ \ \ t < 0

y'(t) = y(t-3\pi/2) \ \ \ for\ \ \ t \geq 0

The trick here is that sin(t-3\pi/2) = cos(t)=sin'(t) so the exact solution of this equation is actually the sine function.

from pylab import *

model = lambda Y,t : Y(t - 3*pi/2) # Model
tt = linspace(0,50,10000) # Time start, end, and number of points/steps
g=sin # Expression of Y(t) before the integration interval 
yy = ddeint(model,g,tt) # Solving

# PLOTTING
fig,ax=subplots(1)
ax.plot(tt,yy,c='r',label="$y(t)$")
ax.plot(tt,sin(tt),c='b',label="$sin(t)$")
ax.set_ylim(ymax=2) # make room for the legend
ax.legend()
show()

dde_sin

The resulting plot compares our solution (red) with the exact solution (blue). See how our result eventually detaches itself from the actual solution as a consequence of many successive approximations ? As DDEs tend to create chaotic behaviors, you can expect the error to explode very fast. As I am no DDE expert, I would recommend checking for convergence in all cases, i.e. increasing the time resolution and see how it affects the result. Keep in mind that the past values of Y(t) are computed by interpolating the values of Y found at the previous integration points, so the more points you ask for, the more precise your result.

Example 2 : Delayed negative feedback

You can select the parameters of your model at integration time, like in Scipy’s ODE and odeint. As an example, imagine a product with degradation rate r, and whose production rate is negatively linked to the quantity of this same product at the time (t-d):

y(t) = 0 \ \ \ for\ \ \ t < 0

y'(t) = \dfrac{1}{1+(\dfrac{y(t-d)}{K})^2} -ry(t) \ \ \ for\ \ \ t \geq 0

We have three parameters that we can choose freely. For K = 0.1, d = 5, r = 1, we obtain oscillations !

from pylab import *

# MODEL, WITH UNKNOWN PARAMETERS
model = lambda Y,t,k,d,r :  1/(1+(Y(t-d)/k)**2) - r*Y(t)

# HISTORY
g = lambda t:0 

# SOLVING
tt = linspace(0,50,10000)
yy = ddeint(model,g,tt,fargs=( 0.1 , 5 , 1 )) # K = 0.1, d = 5, r = 1

# PLOTTING
fig,ax=subplots(1)
ax.plot(tt,yy,lw=2)
show()

dde_negativefeedback

Example 3 : Lotka-Volterra system with delay

The variable Y can be a vector, which means that you can solve DDE systems of several variables. Here is a version of the famous Lotka-Volterra two-variables system, where we introduce some delay d. For d=0 we find the solution of a classical Lotka-Volterra system, and for d non-nul, the system undergoes an important amplification:

\big(x(t), y(t)\big) = (1,2) \ \ \ for\ \ t < 0, \ \ else

x'(t) = 0.5x(t)\big(1-y(t-d)\big)\\   y'(t) = -0.5y(t)\big(1-x(t-d)\big)

from pylab import *

def model(Y,t,d):
    x,y = Y(t)
    xd,yd = Y(t-d)
    return array([0.5*x*(1-yd), -0.5*y*(1-xd)])

g = lambda t : array([1,2])
tt = linspace(2,30,20000)
fig,ax=subplots(1)

for d in [0, 0.2]:
    yy = ddeint(model,g,tt,fargs=(d,))
    # WE PLOT X AGAINST Y
    ax.plot(yy[:,0],yy[:,1],lw=2,label='delay = %.01f'%d)

ax.legend()
show()

dde_lotka

Example 4 : A DDE with varying delay

This time the delay depends on the value of Y(t) !

y(t) = 1,\ \ \ t \leq 0

y'(t) = - y\big(t-3\cos(y(t))^2 \big),\ \ \ t > 0

from pylab import *
model = lambda Y,t:  -Y(t-3*cos(Y(t))**2)
tt = linspace(0,30,2000)
yy = ddeint(model, lambda t:1, tt)
fig,ax=subplots(1)
ax.plot(tt,yy,lw=2)
show()

dde_Ydependent

Code

Explanations

The code is written on top of Scipy’s ‘ode’ (ordinary differential equation) class, which accepts differential equations under the form

model(Y,t) = ” expression of Y'(t) ”

where Y, and the output Y', must be Numpy arrays (i.e. vectors).

For our needs, we need the input Y to be a function of time, more precisely a function that can compute Y(t) at any past or present t using the values of Y already computed. We also need Y(t) to return the value of some function g(t) if t is inferior to some time tc that marks the start of the integration.

To this end, I first implemented a class (ddeVar) of variables/functions which can be called at any time t: for $t<latex t_c$, it will return the value of g(t), and for t>tc, it will look for two already computed values Y_a and Y_b at times t_a<t<t_b, from which it will deduce Y(t) using a linear interpolation. Scipy offers many other kinds of interpolation, but these will be slower and won't support vectors for Y.

Such variables need to be updated every time a new value of Y(t) is computed, so I created a class 'dde' that inherits from Scipy's 'ode' class but overwrites its integration method so that our special function Y is updated after each integration step. Since 'ode' would feed the model with a vector Y (a Numpy array to be precise), which we don't want, we give to the integrator an interface function that takes a Numpy array Y as an argument, but immediately dumps it and calls the model with our special ddeVar variable Y (I hope that was clear 🙂 ).

Ok, here you are for the code

You will find the code and all the examples as an IPython notebook HERE (if you are a scientific pythonist and you don’t know about the IPython notebook, you are really missing something !). Just change the extension to .ipynb to be able to open it. In case you just asked for the code:

# REQUIRES PACKAGES Numpy AND Scipy INSTALLED
import numpy as np
import scipy.integrate
import scipy.interpolate
 
class ddeVar:
    """ special function-like variables for the integration of DDEs """
    
    
    def __init__(self,g,tc=0):
        """ g(t) = expression of Y(t) for t<tc """
        
        self.g = g
        self.tc= tc
        # We must fill the interpolator with 2 points minimum
        self.itpr = scipy.interpolate.interp1d(
            np.array([tc-1,tc]), # X
            np.array([self.g(tc),self.g(tc)]).T, # Y
            kind='linear', bounds_error=False,
            fill_value = self.g(tc))
            
            
    def update(self,t,Y):
        """ Add one new (ti,yi) to the interpolator """
        
        self.itpr.x = np.hstack([self.itpr.x, [t]])
        Y2 = Y if (Y.size==1) else np.array([Y]).T
        self.itpr.y = np.hstack([self.itpr.y, Y2])
        self.itpr.fill_value = Y
        
        
    def __call__(self,t=0):
        """ Y(t) will return the instance's value at time t """
        
        return (self.g(t) if (t<=self.tc) else self.itpr(t))
 


class dde(scipy.integrate.ode):
    """ Overwrites a few functions of scipy.integrate.ode"""
    
    
    def __init__(self,f,jac=None):
		
        def f2(t,y,args):
            return f(self.Y,t,*args)
        scipy.integrate.ode.__init__(self,f2,jac)
        self.set_f_params(None)
        
 
    def integrate(self, t, step=0, relax=0):
		
        scipy.integrate.ode.integrate(self,t,step,relax)
        self.Y.update(self.t,self.y)
        return self.y
        
 
    def set_initial_value(self,Y):
		
        self.Y = Y #!!! Y will be modified during integration
        scipy.integrate.ode.set_initial_value(self, Y(Y.tc), Y.tc)
 
 
 
def ddeint(func,g,tt,fargs=None):
    """ similar to scipy.integrate.odeint. Solves the DDE system
        defined by func at the times tt with 'history function' g
        and potential additional arguments for the model, fargs
    """
    
    dde_ = dde(func)
    dde_.set_initial_value(ddeVar(g,tt[0]))
    dde_.set_f_params(fargs if fargs else [])
    return np.array([g(tt[0])]+[dde_.integrate(dde_.t + dt)
                                 for dt in np.diff(tt)])

Other implementations

If you need a faster or more reliable implementation, have a look at the packages pyDDE and pydelay, which seem both very serious but are less friendly in their syntax.

Advertisements

Typing keyboard + python = Musical instrument !

In this post I will show you how to do this :

I am not the first to do that, but most people who play their computer on Youtube use either very expensive programs, or programs that won’t run on your computer, or not-so-efficient programs with not-that-much possibilities of extension, or cheap programs with a big lag between pressing the key and actually hearing the note.

So here is a very small Python script which will run fine even on a basic netbook. If you are not familiar with Python, you should take online courses , it is really worth it 🙂 !
If you are faminiliar with Python, then you are welcome to improve the code on its Github page.

Transforming your keyboard into a mixtable

My original idea was to transform my computer keyboard into a mixtable, to make something like in this very awesome video:

So my first move was to make a program that would take a configuration file my_configuration.cf containing this:


q, dog.wav
w, cat.wav
e, tarzan.wav
r, scream.mp3

And then if you hit q you would hear a dog from the dog.wav soundfile, if you hit w you’d hear a cat, etc…
This is pretty easy to do with Python’s pygame package. Here is my code (inspired by similar stuff from the package Mingus):


import pygame as pg
import csv
import time


SAMPLE_WIDTH = 16
FPS = 44100
N_CHANNELS = 2
BUFFER = 2**9
    
def minimix(config_file,mode = 'instrument'):
    """
    Opens an interface that lets you press the keys of your keyboard
    to plays the related soundfiles as sepcified in the provided
    configuration file.
    
    Args:
       config_file (str):  A file associating keyboard keys with
                           file names. Each line must be of the form
                           key , filename.
                       
       mode (str) :
            instrument -- causes autorepeat of samples as long
                            as the key is pressed.
            sustain -- causes autorepeat of the samples until its
                       key is pressed again.
            player -- striking the key causes the sound to play once. 
    
    Returns:
       a list of the (time,events).
    """
    
    repeat = 0 if (mode is 'player') else (-1)
    
    pg.mixer.pre_init(FPS,-SAMPLE_WIDTH,N_CHANNELS,BUFFER)
    pg.init()
    screen = pg.display.set_mode((640,480))
    
    
    
    
    ##### READ CONF FILE
    
    key2sound = {}
    key2file = {}
    config = csv.reader(open(config_file, 'rb'), delimiter=',')
    
    for key, soundfile in config:
        
        key,soundfile = key.strip(' '),soundfile.strip(' ')
        
        if key is not '#':
            
            key2file[key] = soundfile
            key2sound[key] = pg.mixer.Sound(soundfile)
    
    events_list = []
    currently_playing = {k : False for k in key2sound.iterkeys()}
    
    
    ##### MAIN LOOP
    
    while True:
        
        event =  pg.event.wait()
      
        if event.type in (pg.KEYDOWN,pg.KEYUP):
            key = pg.key.name(event.key)
              
            if key in key2sound:
                
                if event.type == pg.KEYDOWN:
                    
                    if (mode == 'sustain') and currently_playing[key]:
                        
                        key2sound[key].fadeout(20)
                        currently_playing[key] = False
                        
                    else:
                        
                        key2sound[key].play(repeat) 
                        currently_playing[key] = True
                    
                    events_list.append((time.time(),key2file[key]))
                    
                elif event.type == pg.KEYUP and (mode == 'instrument'):
                    
                    key2sound[key].stop()
                    currently_playing[key] = False
                    
                    events_list.append((time.time(),key2file[key]))
            
            elif event.key == pg.K_ESCAPE:
                
                break
    
    pg.quit()
    
    return events_list

Transforming your keyboard into a musical instrument

If instead of using various noises like cat and dog you use different notes from the same instrument, then you turned your computer into some kind of piano. The problem is that a set of soundfiles with all the notes of an instrument is difficult to find on the internet, so I wrote a script that makes as many notes as you want from just one sound by shifting its pitch up or down. It uses the audio processing program Soundstretch that you will need to install first :

import os

def shift_wav(wavfile,output,shifts,verbose=False):
    """
    Makes new sounds by shifting the pitch of a sound.
    Requires soundstrech installed.
    
    Args:
        wavfile : Name of the file containing the original sound
        output: name to use as a prefix for the output files and for
                the output folder name
        shifts (list of int): specifies of how many half-tones the pitch
                shifts should be. For instance [-2,-1,1,2] will produce
                4 files containing the sound 2 half-tones lower, one
                halftone lower, one halftone higher and two halftones
                higher.
    """
    
    folder = os.path.dirname(output)
        
    if not os.path.exists(folder):
        
        os.makedirs(folder)
        
    for i,s in enumerate(shifts):
        
        outputfile = '%s%02d.wav'%(output,i)
        
        command = 'soundstretch %s %s -pitch=%d'%(wavfile,outputfile,s)
        if verbose:
            print command
        os.system(command)

Going further

There is so much one could do to improve the program.

On the musical side, for instance, finding configurations of the keyboard that are particularly ergonomic. In the video above I used this configuration:

azerty

I called it typewriter because it enables you to play very fast things while moving your hands a minimum (the video is a bad example :P) . But maybe there is better to find !

Also, one could start listing every cool piece of music that can be played on a typing keyboard. I use 46 keys ( almost 4 octaves !), that makes a lot of possibilities !

On the programming side, there is a lot of little things I can think of, like automatizing scale changes, introducing nuances, designing nice interfaces (why not a guitar-hero-like game where you would actually be playing music on a playback ?), writing a script that would take some sheet music (in a nice format, like ABC, MIDI, lylipond) and return the list of the keys you should strike to play it.

I actually wrote a lot more code, for instance to make it easier to write configuration files, for sound processing, etc., but since it is not strictly necessary I am not reporting it here (I’ll certainly put a working version on GitHub, or such, some day).

Philosophy of the musical keyboard

Your typing keyboard is a real instrument. Of course it is not its primary use, but our voice’s primary purpose was not to sing, either. Now do the math : how many people out there have a piano at home ? And how many have a computer ? That gives you an idea of how many people would like to play the piano, cannot, but could play their computers instead.

So promoting computer-keyboardism is ultimately about bringing music to the masses. It is about providing everyone with an instrument that you can practice at home, in the train, at work, and that will be familiar to anyone everywhere in the world.

There is more : how many of you, pianist readers, have started the piano for seduction purposes ? (yeah, sure, me neither…) But public places with a piano on which you could show off your mad skills are getting pretty rare, aren’t they ? Especially since most bars have traded their good old piano for a TV. But think about all these places with a computer at hand ! Yep, time for you to develop a talent that will be useful in real life !

So practice, get good, be one of the first composers for tomorrow’s instrument, impress your friends and spread the good news ! If you are still reading me after so much gibberish , then do not hesitate : you just proved how little you value your free time, you are the right person for the task !

An analytical solution to a Jewish problem

Maybe you have heard about the article of Khovanova and Radul, Jewish Problems, in which they collect tricky mathemical problems that were alledgedly designed to prevent Jews (to which they were specifically given) from passing the oral entrance exams of Moscow State University . With such a story and such a naïvely equivocal title, you can count on all the comments-section jewishologists of the world to promote your article 🙂

What I found interesting is this article is that, of the 21 problems given, I could answer most of the analytical ones, while I was completely unequipped to solve the geometrical problems. Has geometry always been that hard ? My impression is that, at least here in France, its teaching is slowly being replaced by linear algebra . Maybe it is simply less needed today. Take civil engineering : a beam used to be some kind of parralelepipede, now it is a grid of finite elements !

As an illustration, let us have a look at the fifth problem: solve the equation

\sin^7(x) + \dfrac{1}{\sin^3x} = \cos^7(x) + \dfrac{1}{\cos^3x}.

What kind of math problem do you see here ? (let’s just hope it has nothing to do with geometry !) The authors of the article give a solution based on some transformation of the equations and a few trigonometricks (at some point they use the variable t=sin(x)cos(x) and retransform it into sin(2x)/2). What I first saw when I read the question (and I bet that’s what you saw too if you have my kind of training) was an analytical problem involving the function

f(t) = t^7 + \dfrac{1}{t^3}

for which it is asked to find all the (u,v) which are the sine and cosine of a same angle and verify f(u)=f(v) ! An obvious situation where this will work is when u=v, which leads us to the solutions t = \pi/4 and t = -3\pi/4, which are the only angles whose sine is equal to the cosine. Now, are there any other solutions ? In the end what is asked is whether the function f is bijective, or not, or just enough ! Let us compute its derivative:

f'(t) = 7t^6 - 3 \dfrac{1}{t^4} = \dfrac{7t^{10}-3}{t^4}

This is only positive outside the interval [-\sqrt[10]{\frac{7}{3}} , +\sqrt[10]{\frac{7}{3}}] . We can now sketch the function:

Since u and v are sine and cosine, they will belong to the interval [-1,1]. In red I have represented two non-bijectivity zones of the function on this interval: if u and v are two different numbers of [-1,1] verifying f(u)=f(v), then the two of them must belong to one of these red zones. Now, notice that
f(\sqrt{\frac{1}{2}}) = (\sqrt{\frac{1}{2}})^7 + \sqrt{2}^3 > \sqrt{2}^3 = 2\sqrt{2} > 2
This shows that \sqrt{\frac{1}{2}} is placed left to the red zone in the right (see figure). A consequence is that, if u and v belong to the right red zone, then

u^2 + v^2 > \sqrt{\frac{1}{2}}^2 + \sqrt{\frac{1}{2}}^2 = 1

so u and v cannot be the sine and the cosine of the same angle (or their squares would sum up to 1). By imparity of the function, the same can be said about the left red zone. As a conclusion, it is not possible that f(u)=f(v) if u and v are the sine and cosine of a same angle and have different values. So t = \pi/4 and t = -3\pi/4 are the only two solutions to the given equation. Did anyone see another solution ?

Counting bacteria : confidence intervals from one measurement

Here are two biologically-inspired statistics problem, with very simple solutions. They show that, when you have informations about the way the observations are collected, one observation may be enough to give a confidence interval.

Question 1

From a bacterial solution I sampled a small volume v in which I found no bacteria. Give an upper bound (with confidence 95%) for the concentration of bacteria in the solution.

Question 2

From a bacterial solution I sampled a small volume v in which I found n bacteria (n > 20). Give a 95% confidence interval for the concentration of bacteria in the solution.

Solutions

Short answers : for the first question c < (3/v) and for the second c \in [\frac{ n \pm 2\sqrt{n}}{v}] .

Yep, it’s that simple ! See the solutions.

Finding a subnetwork with a given topology in E. Coli

A few weeks ago I posted about how easy it was, using Python’s Networkx module and the RegulonDB database, to get and navigate through the genetic network of E. Coli. I didn’t mention at the time that you can also find all the subnetworks of E. Coli sharing a given topology with just a few lines of code:

import networkx.algorithms.isomorphism as iso
def find_pattern(graph,pattern,sign_sensitive = True):

    if sign_sensitive:
        edge_match = lambda e1,e2 : (e1['sign']==e2['sign'])
    else:
        edge_match = None
    matcher = iso.DiGraphMatcher(graph,pattern,edge_match=edge_match)
    return list(matcher.subgraph_isomorphisms())

Feedforward loops in E. coli

As an example, let us have a look at the feedforward loops in E. coli. A feedforward loop can be described as a gene having an action on another gene, both directly and through the intermediary of another gene, like in the following sketch, where the arrows can represent activations or repressions

A feedforward !

If you want to print all the feedforward loops in E. coli’s network (as represented by RegulonDB), try this :

import networkx as nx
feedforward = nx.DiGraph()
feedforward.add_edges_from([('A','I'),('I','C'),('A','C')])
ffwd_in_ecoli = find_pattern(ECN,feedforward,sign_sensitive=False)
for subgraph in ffwd_in_ecoli:
    print subgraph

This prints each feedforward loop with the respective roles of the different genes (A,I, or C). As you can see there are 63 feedforward loops in regulonDB’s network, which is many. Let us now get sign-specific :

import itertools as itt
labels = []
frequencies = []
for signs in itt.product(('+','-'),repeat = 3):

    ffwd = nx.DiGraph()
    ffwd.add_edges_from([('A','I',{'sign':signs[0]}),
                         ('I','B',{'sign':signs[1]}),
                         ('A','B',{'sign':signs[2]})])
    frequency = len(list(find_pattern(ECN,ffwd,True)))
    frequencies.append(frequency)
    labels.append(" ".join(signs))

frequencies = array([float(f) for f in frequencies])/sum(frequencies)
fig,ax = subplots(1, figsize=(5,5))
title('feedforward loops in E. coli, \n by signs of A->I,I->B,A->B')
pie(frequencies, labels=labels, autopct='%1.1f%%', shadow=True)
show()

Relative frequencies of the different feedforward loops in E. coli

I’m not the first one to do such reasearch, but that won’t prevent me from commenting 🙂 The most frequent pattern can be seen as a double repression : gene A represses gene B, and, as an additional punishment, also represses the activator I of B. Such feedforwards are called coherent, as the two actions of the gene A have the same effect on the gene B. The second most frequent feedforward is incoherent: gene A activates gene B while activating a repressor of B, which seems a little foolish, but can be a way of creating bumps of activity (B is awaken just a few minutes before its repressor puts it down again).

A word of caution

Note that the algorithm didn’t find any feedforward loop with activations only (+++), while some have been reported in the literature. The same way, I couldn’t find any mutual interaction between two genes (A has an action on B and B has an action on A). So if you are going to use regulonDB’s network of interactions for reasearch purposes, always have in mind :

  • The database is INCOMPLETE, meaning that you can use it (carefully) to deduce the existence of stuff, but not to prove the absence of something. In particular, many gene interactions that occur through the intermediary of metabolites are omited ! For instance, the gene cyaA produces the cyclic AMP that is necessary to activate the genes regulated by crp. However only crp appears as a regulator of these genes.
  • In the context of synthetic biology, the few genes added to the bacteria are often supposed to work independently of the rest of the bacteria (some biologists say that they are orthogonal to the rest of the system). Thus, if you design a feedforward loop, it will be a feedforward loop and nothing else. But keep in mind that when the pattern-matching algorithm finds a feedforward loop in E. coli’s networks, each gene of the triangle could also be under the influence of many other, and maybe the behavior of the circuit is not what could be inferred by looking only at these three genes.
  • The algorithm consider isomorphism between subnetworks, which implies that it won’t mind if the subnetwork is under the influence of external genes, but it will mind if the subnetwork has auto-regulations that do not appear on the provided pattern. For instance, fis regulates crp and vice-versa, but this won’t be reported if you simply look for the pattern A\leftrightarrow B because, in addition, fis and crp regulate themselves !

Overstressed ? Here, take some bacteria !

The European jamboree of the 2012 iGEM competition (a synthetic biology contest for undergraduates) is over. And our team didn’t make it through to the world finals at Boston’s MIT 😦 .
Given that the event took place in Amsterdam, maybe this small project of mine would have given us more chances:

Hallucicoli, symbiosis meets happiness

Not sure, however, that it would be appreciated in Boston, as the FBI is one of iGEM’s main sponsors !

So what do you think… Feasible ? Marketable ?

It is a little akward to put such things on the internet, expecially if you, reader, are my future employer, so here is a disclaimer: this is just for fun (note, however, that right now real people are leading a real project to make bacteria produce THC). I do not support the use of drugs under any form and I never, ever had weed in my life (which may not be the case of many of the European iGEMers, who spent three very happy days in Amsterdam !)

Extract data from graph pictures with Python

If you want to transform a picture of a graph into exploitable data (which is very useful in science if you want to exploit a figure from an article without bothering the authors), here is a minimalistic interface written in python with the following features:

  • Data extraction from picture files or from a picture in the clipboard.
  • Data extraction from rotated graphs or graphs shown with (moderate) perspective.
  • Advanced interface (left-click to select a point, right-click to deselect).
  • Stores the points’ coordinates in a python variable and in the clipboard (for use in another application).

You can launch the interface with

points = pic2data()

This will either start a session using the picture from the clipboard, or , if there is none, wait for the clipboard to contain a picture. Alternatively you can use a picture from a file with

points = pic2data('graph.jpeg')

You will then be asked you to place the origin of the graph, as well as the coordinates of this origin (in case it it not (0,0)), and one reference point for each axis X and Y (i.e. points of these axis whose coordinates you know). Then you can select/deselect as many points of the curve as you want, and exit with the middle button.The list of selected points [(x1,y1),(x2,y2),…] is returned.

By default the program will consider that the graph is rectangular and parralel to the edges of the pictures (wich I will call straight in what follows). This will typically be the case for a graph from a scientific article. As a consequence the algorithm will automatically replace the reference point you chose for the X axis in order to put it at the same height as the origin, and it will replace the reference point for Y exactly above the origin. However if the graph on the picture is not straight, like in a photo, use the argument straight=False.

As an example, let us take a photo with a graph, like this one.

Fig. 1: Young Frederic Chopin disguised as Mozart.

As the graph is not straight we will use

points = pic2data('mozart.jpeg', straight = False)

Which gets you to that:

After placing the points and getting their coordinates one can redraw the plot with

from pylab import *
figure()
x,y = zip(*points)
plot(x,y,'o')
show()

And voilà !

Here is the code. Happy curving !

from urlparse import urlparse

import pygtk
import gtk
import tkSimpleDialog

import matplotlib.image as mpimg
import matplotlib.pyplot as plt

import numpy as np

def tellme(s):
    print s
    plt.title(s,fontsize=16)
    plt.draw()

def pic2data(source='clipboard',straight=True):
    """ GUI to get data from a XY graph image. Either provide the graph
        as a path to an image in 'source' or copy it to the clipboard.
    """
    
    ##### GET THE IMAGE
    
    clipboard = gtk.clipboard_get()
    
    if source=='clipboard':
        
        # This chunk tries the text content of the clipboard
        # and empties it if it is not a file path
        
        print "Waiting for an image in the clipboard..." 
        while not ( clipboard.wait_is_uris_available()
                    or clipboard.wait_is_image_available()):
            pass
            
        if clipboard.wait_is_uris_available(): # it's a path to a file !
            
             source = clipboard.wait_for_uris()[0]
             source = urlparse(source).path
             return pic2data(source)
        
        image = clipboard.wait_for_image().get_pixels_array()
        origin = 'upper'
    
    else: # source is a path to a file !
        
        image = mpimg.imread(source)
        origin = 'lower'

    ###### DISPLAY THE IMAGE
    
    plt.ion() # interactive mode !
    fig, ax = plt.subplots(1)
    imgplot = ax.imshow(image, origin=origin)
    fig.canvas.draw()
    plt.draw()
    
    ##### PROMPT THE AXES
    
    def promptPoint(text=None):
        
        if text is not None: tellme(text)
        return  np.array(plt.ginput(1,timeout=-1)[0])
    
    def askValue(text='',initialvalue=0.0):
        return tkSimpleDialog.askfloat(text, 'Value:',
                                         initialvalue=initialvalue)
    
    origin = promptPoint('Place the origin')
    origin_value = askValue('X origin',0),askValue('Y origin',0)
                                         
    Xref =  promptPoint('Place the X reference')
    Xref_value = askValue('X reference',1.0)
    
    Yref =  promptPoint('Place the Y reference')
    Yref_value = askValue('Y reference',1.0)
    
    if straight :
        
        Xref[1] = origin[1]
        Yref[0] = origin[0]
    
    ##### PROMPT THE POINTS
    
    selected_points = []
    
    tellme("Select your points !")
    print "Right-click or press 's' to select"
    print "Left-click or press 'del' to deselect"
    print "Middle-click or press 'Enter' to confirm"
    print "Note that the keyboard may not work."
    
    selected_points = plt.ginput(-1,timeout=-1)
    
    ##### RETURN THE POINTS COORDINATES
    
    #~ selected_points.sort() # sorts the points in increasing x order
    
    # compute the coordinates of the points in the user-defined system
    
    OXref = Xref - origin
    OYref = Yref - origin
    xScale =  (Xref_value - origin_value[0]) / np.linalg.norm(OXref)
    yScale =  (Yref_value - origin_value[1]) / np.linalg.norm(OYref)
    
    ux = OXref / np.linalg.norm(OXref)
    uy = OYref / np.linalg.norm(OYref)
    
    result = [(ux.dot(pt - origin) * xScale + origin_value[0],
               uy.dot(pt - origin) * yScale + origin_value[1])
               for pt in selected_points ]
    
    # copy the result to the clipboard
    
    
    clipboard.set_text('[' + '\n'.join([str(p) for p in result]) + ']')
    
    clipboard.store() # makes the data available to other applications
    
    plt.ioff()
    
    return result