Extract data from graph pictures with Python

If you want to transform a picture of a graph into exploitable data (which is very useful in science if you want to exploit a figure from an article without bothering the authors), here is a minimalistic interface written in python with the following features:

  • Data extraction from picture files or from a picture in the clipboard.
  • Data extraction from rotated graphs or graphs shown with (moderate) perspective.
  • Advanced interface (left-click to select a point, right-click to deselect).
  • Stores the points’ coordinates in a python variable and in the clipboard (for use in another application).

You can launch the interface with

points = pic2data()

This will either start a session using the picture from the clipboard, or , if there is none, wait for the clipboard to contain a picture. Alternatively you can use a picture from a file with

points = pic2data('graph.jpeg')

You will then be asked you to place the origin of the graph, as well as the coordinates of this origin (in case it it not (0,0)), and one reference point for each axis X and Y (i.e. points of these axis whose coordinates you know). Then you can select/deselect as many points of the curve as you want, and exit with the middle button.The list of selected points [(x1,y1),(x2,y2),…] is returned.

By default the program will consider that the graph is rectangular and parralel to the edges of the pictures (wich I will call straight in what follows). This will typically be the case for a graph from a scientific article. As a consequence the algorithm will automatically replace the reference point you chose for the X axis in order to put it at the same height as the origin, and it will replace the reference point for Y exactly above the origin. However if the graph on the picture is not straight, like in a photo, use the argument straight=False.

As an example, let us take a photo with a graph, like this one.

Fig. 1: Young Frederic Chopin disguised as Mozart.

As the graph is not straight we will use

points = pic2data('mozart.jpeg', straight = False)

Which gets you to that:

After placing the points and getting their coordinates one can redraw the plot with

from pylab import *
figure()
x,y = zip(*points)
plot(x,y,'o')
show()

And voilà !

Here is the code. Happy curving !

from urlparse import urlparse

import pygtk
import gtk
import tkSimpleDialog

import matplotlib.image as mpimg
import matplotlib.pyplot as plt

import numpy as np

def tellme(s):
    print s
    plt.title(s,fontsize=16)
    plt.draw()

def pic2data(source='clipboard',straight=True):
    """ GUI to get data from a XY graph image. Either provide the graph
        as a path to an image in 'source' or copy it to the clipboard.
    """
    
    ##### GET THE IMAGE
    
    clipboard = gtk.clipboard_get()
    
    if source=='clipboard':
        
        # This chunk tries the text content of the clipboard
        # and empties it if it is not a file path
        
        print "Waiting for an image in the clipboard..." 
        while not ( clipboard.wait_is_uris_available()
                    or clipboard.wait_is_image_available()):
            pass
            
        if clipboard.wait_is_uris_available(): # it's a path to a file !
            
             source = clipboard.wait_for_uris()[0]
             source = urlparse(source).path
             return pic2data(source)
        
        image = clipboard.wait_for_image().get_pixels_array()
        origin = 'upper'
    
    else: # source is a path to a file !
        
        image = mpimg.imread(source)
        origin = 'lower'

    ###### DISPLAY THE IMAGE
    
    plt.ion() # interactive mode !
    fig, ax = plt.subplots(1)
    imgplot = ax.imshow(image, origin=origin)
    fig.canvas.draw()
    plt.draw()
    
    ##### PROMPT THE AXES
    
    def promptPoint(text=None):
        
        if text is not None: tellme(text)
        return  np.array(plt.ginput(1,timeout=-1)[0])
    
    def askValue(text='',initialvalue=0.0):
        return tkSimpleDialog.askfloat(text, 'Value:',
                                         initialvalue=initialvalue)
    
    origin = promptPoint('Place the origin')
    origin_value = askValue('X origin',0),askValue('Y origin',0)
                                         
    Xref =  promptPoint('Place the X reference')
    Xref_value = askValue('X reference',1.0)
    
    Yref =  promptPoint('Place the Y reference')
    Yref_value = askValue('Y reference',1.0)
    
    if straight :
        
        Xref[1] = origin[1]
        Yref[0] = origin[0]
    
    ##### PROMPT THE POINTS
    
    selected_points = []
    
    tellme("Select your points !")
    print "Right-click or press 's' to select"
    print "Left-click or press 'del' to deselect"
    print "Middle-click or press 'Enter' to confirm"
    print "Note that the keyboard may not work."
    
    selected_points = plt.ginput(-1,timeout=-1)
    
    ##### RETURN THE POINTS COORDINATES
    
    #~ selected_points.sort() # sorts the points in increasing x order
    
    # compute the coordinates of the points in the user-defined system
    
    OXref = Xref - origin
    OYref = Yref - origin
    xScale =  (Xref_value - origin_value[0]) / np.linalg.norm(OXref)
    yScale =  (Yref_value - origin_value[1]) / np.linalg.norm(OYref)
    
    ux = OXref / np.linalg.norm(OXref)
    uy = OYref / np.linalg.norm(OYref)
    
    result = [(ux.dot(pt - origin) * xScale + origin_value[0],
               uy.dot(pt - origin) * yScale + origin_value[1])
               for pt in selected_points ]
    
    # copy the result to the clipboard
    
    
    clipboard.set_text('[' + '\n'.join([str(p) for p in result]) + ']')
    
    clipboard.store() # makes the data available to other applications
    
    plt.ioff()
    
    return result

Animate your 3D plots with Python’s Matplotlib

When you have a complicated 3D plot to show in a video or slideshow, it can be nice to animate it:

I obtained this surface with

import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import axes3d

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
X, Y, Z = axes3d.get_test_data(0.05)
s = ax.plot_surface(X, Y, Z, cmap=cm.jet)
plt.axis('off') # remove axes for visual appeal

To animate it I created the function rotanimate that you can use like that:

import numpy as np
angles = np.linspace(0,360,21)[:-1] # A list of 20 angles between 0 and 360

# create an animated gif (20ms between frames)
rotanimate(ax, angles,'movie.gif',delay=20) 

# create a movie with 10 frames per seconds and 'quality' 2000
rotanimate(ax, angles,'movie.mp4',fps=10,bitrate=2000)

# create an ogv movie
rotanimate(ax, angles, 'movie.ogv',fps=10) 

Here is the source-code:

import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import axes3d
import os, sys
import numpy as np


##### TO CREATE A SERIES OF PICTURES

def make_views(ax,angles,elevation=None, width=4, height = 3,
                prefix='tmprot_',**kwargs):
    """
    Makes jpeg pictures of the given 3d ax, with different angles.
    Args:
        ax (3D axis): te ax
        angles (list): the list of angles (in degree) under which to
                       take the picture.
        width,height (float): size, in inches, of the output images.
        prefix (str): prefix for the files created. 
    
    Returns: the list of files created (for later removal)
    """
    
    files = []
    ax.figure.set_size_inches(width,height)
    
    for i,angle in enumerate(angles):
    
        ax.view_init(elev = elevation, azim=angle)
        fname = '%s%03d.jpeg'%(prefix,i)
        ax.figure.savefig(fname)
        files.append(fname)
    
    return files



##### TO TRANSFORM THE SERIES OF PICTURE INTO AN ANIMATION

def make_movie(files,output, fps=10,bitrate=1800,**kwargs):
    """
    Uses mencoder, produces a .mp4/.ogv/... movie from a list of
    picture files.
    """
    
    output_name, output_ext = os.path.splitext(output)
    command = { '.mp4' : 'mencoder "mf://%s" -mf fps=%d -o %s.mp4 -ovc lavc\
                         -lavcopts vcodec=msmpeg4v2:vbitrate=%d'
                         %(",".join(files),fps,output_name,bitrate)}
                         
    command['.ogv'] = command['.mp4'] + '; ffmpeg -i %s.mp4 -r %d %s'%(output_name,fps,output)
    
    print command[output_ext]
    output_ext = os.path.splitext(output)[1]
    os.system(command[output_ext])



def make_gif(files,output,delay=100, repeat=True,**kwargs):
    """
    Uses imageMagick to produce an animated .gif from a list of
    picture files.
    """
    
    loop = -1 if repeat else 0
    os.system('convert -delay %d -loop %d %s %s'
              %(delay,loop," ".join(files),output))




def make_strip(files,output,**kwargs):
    """
    Uses imageMagick to produce a .jpeg strip from a list of
    picture files.
    """
    
    os.system('montage -tile 1x -geometry +0+0 %s %s'%(" ".join(files),output))
    
    
    
##### MAIN FUNCTION

def rotanimate(ax, angles, output, **kwargs):
    """
    Produces an animation (.mp4,.ogv,.gif,.jpeg,.png) from a 3D plot on
    a 3D ax
    
    Args:
        ax (3D axis): the ax containing the plot of interest
        angles (list): the list of angles (in degree) under which to
                       show the plot.
        output : name of the output file. The extension determines the
                 kind of animation used.
        **kwargs:
            - width : in inches
            - heigth: in inches
            - framerate : frames per second
            - delay : delay between frames in milliseconds
            - repeat : True or False (.gif only)
    """ 
        
    output_ext = os.path.splitext(output)[1]

    files = make_views(ax,angles, **kwargs)
    
    D = { '.mp4' : make_movie,
          '.ogv' : make_movie,
          '.gif': make_gif ,
          '.jpeg': make_strip,
          '.png':make_strip}
          
    D[output_ext](files,output,**kwargs)
    
    for f in files:
        os.remove(f)
    

##### EXAMPLE

if __name__ == '__main__':

    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    X, Y, Z = axes3d.get_test_data(0.05)
    s = ax.plot_surface(X, Y, Z, cmap=cm.jet)
    plt.axis('off') # remove axes for visual appeal
    
    angles = np.linspace(0,360,21)[:-1] # Take 20 angles between 0 and 360

    # create an animated gif (20ms between frames)
    rotanimate(ax, angles,'movie.gif',delay=20) 

    # create a movie with 10 frames per seconds and 'quality' 2000
    rotanimate(ax, angles,'movie.mp4',fps=10,bitrate=2000)

    # create an ogv movie
    rotanimate(ax, angles, 'movie.ogv',fps=10) 

E. coli versus the sun

I am really happy to see that the Zurich 2012 iGEM team has been designing a bacteria for sun protection. I had a similar idea a few months ago and I made a few fun slides out of it:

Bioshield – Natural Sun Protection

It seems that the team focused on sun detection and sunscreen agent production, while my concern with a bacteria-based sunscreen had been things like how to make a bacterial biofilm skin-adhesive and water-resistant. Has anyone any idea ?

Sound up your Python !

Long simulations and optimisations are a plea to the modern modelist. Worst of all are the programs of average length (say, 5 to 15 minutes) that do not let you the time to start anything else during the runs, and have you lose your day before you know it. I wrote a few Python functions to ease the pain. The first is called ring() and produces… a ring (like an egg timer ! but if you have office-mates you may use a more neutral sound like somenone coughing, or scratching, etc…). You can use it like this:

def myLongFunction():
        for i in range(100000):
            print "I'm the world's most useless function !"

myLongFunction();ring()

In case you want your function to always ring at the end, you can also use my decorator @ringatend when you define the function :

@ringatend
def myLongFunction():
        for i in range(100000):
            print "I'm the world's most useless function !"

myLongFunction() # will ring at the end, no need to call 'ring()'

Now, if you really want to make your colleagues jealous (or crazy) you can ask Python to play some music in the background while the function runs, with my decorator @inmusic

@inmusic
def myLongFunction():
   .................

myLongFunction() # will play music during the function's run !

Believe me, simulations look way shorter with on some Beny Hill (but they are actually slower 🙂 )

You can also combine the decorators to get a function that will play some music, and ring at the end:

@ringatend
@inmusic
def myLongFunction():

Here is the code for all that. You will need to install Pygame to play the sounds. I have also included some functions to splash popups: popup and @popupatend:

rom Tkinter import *
import tkMessageBox
import time
import pygame

# RING

def ring(sound='./cough.wav'):
    
    pygame.mixer.init()
    sound = pygame.mixer.Sound(sound=)
    sound.play()
    time.sleep(sound.get_length())
    pygame.mixer.quit()



def ringatend(target):
    """ a decorator to make your functions rings when they are done """
    
    def f(*args,**kwargs):
        
        result = target(*args,**kwargs)
        ring()
        return result
        
    return f



# IN MUSIC !

def inmusic(target):
    """ a decorator to play your functions in music """
            
    def f(*args,**kwargs):
        
        pygame.mixer.init()
        sound = pygame.mixer.Sound('./music.wav')
        sound.play()
        result = target(*args,**kwargs)
        sound.stop()
        pygame.mixer.quit()
        return result
            
    return f



# POPUPS

def popup(txt='Finished'):
    root = Tk()
    tkMessageBox.showinfo('Finished!',txt)
    root.destroy()
    root.mainloop()



def popupatend(target):
    
    
    def f(*args,**kwargs):
        
        # Run the function
        result = target(*args,**kwargs)
        
        # Make a pretty text
        fname = target.func_name
        pretty_args = ','.join([str(a) for a in args])
        pretty_kwargs = ','.join(['%s=%s'%(str(k),str(v))
                       for k,v in kwargs.iteritems()])  
        descr ='%s(%s,%s)'%(fname,pretty_args, pretty_kwargs)
        descr = descr[:100] # in case it really is too long
        
        # popup description
        popup(txt='%s done !'%(descr))
    
        return result
    
    return f



### EXAMPLES

@ringatend
@inmusic
def myLongFunction():
        for i in range(100000):
            print "I'm the world's most useless function !"

myLongFunction()

@popupatend
def myLongFunction():
        for i in range(100000):
            print "I'm the most useless function !"

myLongFunction()

If you want to go further with the decorators, here are versions of the decorators with options that let you choose the sound or music you want to be played.



def ringatend2(soundFile):
    """ a decorator generator using a specified soundfile """
        
    def decorator(target):
        """ a decorator using the specified soundfile """
        
        def f(*args,**kwargs):
            
            result = target(*args,**kwargs)
            ring(soundFile)
            return result
            
        return f
    
    return decorator

def inmusic2(soundFile):
    """ a decorator generator using a specified soundfile """
        
    def decorator(target):
        """ a decorator to play your functions in music """
                
        def f(*args,**kwargs):
            
            pygame.mixer.init()
            sound = pygame.mixer.Sound(soundFile)
            sound.play()
            result = target(*args,**kwargs)
            sound.stop()
            pygame.mixer.quit()
            return result
                
        return f
    
    return decorator


### EXAMPLES

@ringatend2('./tarzan.wav')
@inmusic2('./diesIrae.wav')
def myLongFunction():
        for i in range(10000):
            print "The world's most useless function !"

Piano variations on Dark Eyes (with sheet music)

A few weeks ago I made a video of swing variations on the famous russian/gipsy tune Dark Eyes.

As some people have been asking for the sheet music, here it is !
Les yeux noirs / Dark Eyes

Like my previous one it is under a Creative Common licence, which means you can do pretty much everything of it but you must cite A. de Lamarmotte as the original author. Don’t hesistate to tell me if you see any obvious mistakes !

And here is the source-code of the file. Enjoy !

% OPEN WITH LILYPOND

\version "2.12.3"

%{
*
*  L E S   Y E U X  N O I R S
*
* Version 0 - 31/03/2012
*
* This arrangement is licensed under a
* Creative Commons Attribution 3.0 Unported License.
*
* The full text of the licence can be found here:
* http://creativecommons.org/licenses/by/3.0/
* 
* In a nutshell, YOU ARE FREE to share (copy, distribute, transmit),
* to remix or adapt the work, and to make commercial use of the work. 
* 
* BUT I ask you to cite "A. de la Marmotte" as the original author
* if you do any of the things above, just so that I can see what this
* little piece is becoming.
*
* Ideas for improvements are most welcome !
*
%}

\header {
  title = " Les yeux noirs - Dark eyes - Очи чёрные"
  subtitle = "Variations by A. de Lamarmotte"
  subsubtitle = "On the famous Russian Gipsy song"
  arranger="Grenoble, France, August 2012"
  composer = "This is a free work ( licence CC BY 3.0)"
  tagline = "Work licenced under licensed a Creative Commons Attribution 3.0 Unported License"
}

% In what follows, one line represents one bar.

global = {
  \key d \minor
  \time 4/4  
}

right = \new Voice \with {
    \remove "Note_heads_engraver"
    \consists "Completion_heads_engraver"
}{
  \global
  \partial 4. {gis'8 a'4}
  <g' bes'>4. <g' a'>4. r4
  r2. \acciaccatura{gis'16}<e' a'>4
  <f' bes'>4. <d' a'>4. r4
  r4 r8 d'8 f' a' d''4
  d''2 \grace{e'16[ f' fis']} g'8 d''4 cis''8~cis''4.
  gis'8 a'8 f''8 cis''4
  e''8-^ f''16 e'' d''8 a' gis' g' f' d'
  f' g' gis' a' f''4 g''4
  f''\prall d''16 f'' g'' a'' bes''4 d''8 d'''4
  \ottava #1
  <f''' bes'''>4.-^ <g''' c''''>4-^ <f''' bes'''>
  <e''' gis'''>8 <f''' a'''>8 a'' <cis''' e'''> <d''' f'''> \ottava #0  f''
 
 <gis'' cis'''> <a'' d'''> d'' <e'' gis''> <f'' a''> a'
 
 <cis'' e''> <d'' f''> f' gis'
 \times 2/3 {a' bis' a'} g' f' e' d' cis' bes
 \change Staff = "left"
 \stemUp
 a  g f e d cis bes, a, 
 <<
    { r4 <f a b>4. <f a b>4 <f a b>8-^~<f a b>4} \\
    {d1 r4}
   >>
 \change Staff = "right"
 \stemNeutral
 a'8 gis' a' c'' b'4
 bes'4 a'8 gis' a' c'' b' bes'4
 a'8 gis' a' f'' e'' cis'' a'
 \times 2/3 {g' a' g'} e' g'  \times 2/3 {f' g' f'} d' f'
 \times 2/3 {e' f' e'} cis' e'  d' a c' b
 bes d' f' a' d' f' fis' g'
 bes' d'' e'' g' bes' d'' e'' g''
 <<
    { a''1 r2} \\
    {r4 c''8 f'' \times 2/3 {e'' f'' e''} a' e''
     \times 2/3 {d'' e'' d''} a' }
   >>
   f''4-^ g''4-^
   \times 2/3 { f''8-^ g'' f''} bes' bes'' \times 2/3 { e''-^ f'' e''} bes' bes''
   \times 2/3 { f''-^ g'' f''} bes' <e'' bes''>4 <e'' bes'' e'''>4.
   \times 2/3 { e''8-^ f'' e''} a' a'' \times 2/3 { d''-^ e'' d''} a' a''
   \times 2/3 { e''-^ f'' e''} a' <d'' a''>4 <d'' a''>4 a'8
   \times 2/3 { d'' e'' d''} cis'' c'' b' bes' a' g'
   e' g' bes' cis'' e'' g'' bes'' cis'''
   d'''4 <d'' f'' a'' b''> r4
   \clef bass
   <d f a b >
   \clef treble
   r4 <d' d''>4 <e' e''> <f' f''>
  <g'' fis'>8 g' bes' d'' e'' d'' bes' g'
  \times 2/3 { f' g' f'} e' dis' e' g' bes' gis'
  a' cis'' f'' cis'' e'' d'' a' f'
  \times 2/3 { e' f' e'} \times 2/3 { d' f' d'}
  \times 2/3 { cis' f' cis'} \times 2/3 { b f' b}
  bes d' f' bes' d'' bes' g' fis'
  f' e' <g' c''>2 <g' bes'>4
  <f' a'>4-.  <e'' gis''>8(<f'' a''>4) a'4.
  \grace{bes'16[ b'16]} c''4 <gis'' b''>8(<c''' a''>4) a'4.
  bes'8 d'' f'' fis'' g'' f'' d'' bes'
  \times 2/3 { a' bes' a'} gis' a' cis'' e'' g'' bes''
  a''4 <d'' f'' d'''>8 <d'' f'' d'''>4 <d'' f'' cis'''>8 <d'' f'' cis'''>4
  <d'' f'' c'''>8 <d'' f'' c'''>4 <d'' f'' b''>4 <d'' f'' b''>8 <d'' f'' bes''>4
  \times 2/3 { a''8 bes'' a''} g'' f'' e'' d'' cis'' bes'
  a' g' f' e' d' cis' bes a
  \time 3/4  
  \acciaccatura{<d' f a>8}<d' f a>4-^
  \acciaccatura{<d' e gis>8}<d' e gis>4-^
  \acciaccatura{<d' e g>8}<d' e g>4-^
  \time 4/4  
   <<
    {\arpeggioArrowDown <d' e f a>4\arpeggio r2.  } \\
    {r4 \times 12/20 {e16[ f a ais b e' f' a' ais' b' e'' f'' a'' ais'' b''
               \ottava #1
               e''' f''' a''' ais''' b'''] }}>>
    <d''' f''' a''' d''''>1
  
  
  \bar "|."

}

left = \new Voice \with {
    \remove "Note_heads_engraver"
    \consists "Completion_heads_engraver"
}{
  \global
  \partial 4. {r4.}
  <e g bes d'>4 <e g bes d'>4 <e g bes d'>4 <e g bes d'>4  
  <e g a cis'>4 <e g a cis'>4 <e g a cis'>4 <e g a cis'>4
  < d f a b>4 < d f a b>4 < d f a b>4 < d f a b>4
  < d f a b>4 < d f a b>4 < d f a b>4 < d f a b>4
  <e g bes d'>4 <e g bes d'>4 <e g bes d'>4 <e g bes d'>4  
  <e g a cis'>4 <e g a cis'>4 <e g a cis'>4 <e g a cis'>4
  < d f a b>4 r2.
  r1
  <g bes d' f'>2 r2
  r2
  \clef treble
  <gis'' cis'''>2
  <a'' d'''>4 r2.
  r1
  <a cis' e'>4 r2.
  \clef bass
  \stemDown
  <a, e,>4 r2.
  d,2 a,,4. d,4.
  \stemNeutral
  r4 f2
  e4 <g bes d'> d <g bes d'>
  cis <e g a> a, <e g a>
  d <f a b> a, <f a b>
  d <f a b> a, <f a b>
  c <g bes c'> d <g bes c'>
  e <g bes c'> c <g bes c'>
  f <a c' d'> e <a c' d'>
  d <a c' d'> c <a c' d'>
  <g, g> <bes d' f'> <bes, bes> <bes d' e'>
  <c c'> <bes d' f'> <cis cis'> <cis' e' g'>
  <d d'> <d' f' a'> <cis cis'> <d' f'>
  <c c'> <d' f'> <b, b> <d' f'>
  <bes, bes> <e g a> a, <e g a>
  b, <e g a> cis <e g a>
  d4.\sustainOn
  \clef treble
  <d' f' a' b'>4 <d' f' a' b'>4.
  \clef bass
  \ottava #-1
  <d,, d,>2  \sustainOff
  \ottava #0
   r2
  <g, g>4 <bes d' e'> d, <bes d' e'>
  a, <g bes d'> cis <e g a>
  d <f a b> e <f a b>
  f <d a b> a, <f a b>
  g, <f bes d'> d <f bes d'>
  c <g bes c'> e <g bes c'>
  <f a c' d'>4 r2 <f a c' d'>4
  <fis a c' d'>4 r2 <d fis a c'>4
  g4 <bes d' e'> d4 <g bes d'>
  cis <e  a g> a, <e  a g>
  <d, d>-^ <f a b> <d, d>-^ <dis, dis>-^
  <e, e>-^ <f, f>-^ <fis, fis>-^ <d, d>-^
  <g g,>4 <bes d' e'> d,4 <g bes d>
a, <e  a g> cis <e  a g>
 
  \time 3/4  
  \acciaccatura{<d f>8}<d f>4-^
  \acciaccatura{<cis e>8}<cis e>4-^
    \acciaccatura{<c ees>8}<c ees>4-^
  \time 4/4  
  <d, a, d>1\arpeggio
  \clef treble
  <d'' a''>2
  \clef bass
  <d d,>2
  \bar "|."
  
}
#(set-global-staff-size 18)

\score {
  \new PianoStaff \with {
    instrumentName = "Piano"
  } <<
    \new Staff = "right" \with {
      midiInstrument = "acoustic grand"
    } \right
    \new Staff = "left" \with {
      midiInstrument = "acoustic grand"
    } { \clef bass \left }
  >>
  \layout { }
  \midi {
    \context {
      \Score
      tempoWholesPerMinute = #(ly:make-moment 100 4)
    }
  }
}

Some like it hot

Here is an example of what I believe to be good communications skills in data representation. The Scientific Visualization Studio of the NASA Goddard Space Flight Center made this video showing 130 years of global warming:

As many of its viewers, this sure frightened me, so I went to look for the source. The map shows differences in temperature with the 1950’s. According to the NASA, the world average temperature has increased of 0.51 celcius degrees between 1951 and 2011. This surprised me a little, I would have thought it was more after watching the video ! Then it struck me that the projection of the world map (known as Mercator) gives an exagerated importance to the poles, which partly explains the amount of red in the picture. You can see how grotesquely big the antartic appears. For the artic (which isn’t reported on the map as it is not a land), it is more complicated to see, but think that Greenland is actually four times smaller than Brasil. If you have a look at the world maps you can find on the net you will see that some other solutions exist, and that this projection was a deliberate choice of the vizualization studio.

Now, how big is the distorsion ? I wrote a small script to analyse the 2011 map and found out that the average temperature increase as it appears on the picture is of +0.81°C, which is about 60% higher than the 0.51°C announced on the NASA’s website !

My methods

In case you are interested, I did the analysis with Python. I first took a screenshot of the 2011 frame in high definition, then I extracted a one-pixel high rectangle of the color bar. Finally I painted in black the zones I didn’t want to be considered (country borders, year, colorbar):

Then I used the following Python script (which requires Scipy and Pylab):

from scipy import ndimage
from scipy.interpolate import NearestNDInterpolator
from pylab import *

# load the color bar and the map
colorScale = ndimage.imread('colorScale.bmp')[0]
map2011 = ndimage.imread('map2011.bmp')


# create a function that will attribute a value to a color by looking
# at the nearest color in the colorbar

scaleValues = linspace(-2,2,len(colorScale)) # scale of the colorbar
colorInterpolator = NearestNDInterpolator(colorScale, scaleValues)
color2value = lambda color :  colorInterpolator([color])[0]

# compute the corresponding value and the norm of the colors of all
# points of the map

values = apply_along_axis(color2value, 2,map2011)
norms = apply_along_axis(norm, 2,map2011)

# compute the mean of the values on the non-black pixels

print values[norms != 0].mean()