Some like it hot

Here is an example of what I believe to be good communication skills in data representation. The Scientific Visualization Studio at NASA’s Goddard Space Flight Center made this video showing 130 years of global warming.

Like many of its viewers, I found this frightening, so I went looking for the source. The map shows temperature differences relative to the 1950s. According to NASA, the world average temperature increased by 0.51 degrees Celsius between 1951 and 2011. This surprised me a little: after watching the video, I would have thought it was more! Then it struck me that the projection of the world map (known as Mercator) gives exaggerated importance to the poles, which partly explains the amount of red in the picture. You can see how grotesquely big the Antarctic appears. For the Arctic (which isn’t shown on the map because it is not land), it is harder to see, but consider that Greenland is actually about four times smaller than Brazil. If you look at the world maps you can find on the net, you will see that other projections exist, and that this one was a deliberate choice of the visualization studio.
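
To get a feel for how strongly such a map inflates high latitudes, here is a minimal sketch. It assumes the map is a true Mercator projection, on which the local scale is 1/cos(latitude) in both directions, so areas are magnified by 1/cos²(latitude) relative to the equator. The function name `mercator_area_inflation` is made up for this illustration.

```python
import math

def mercator_area_inflation(lat_deg):
    """Factor by which a Mercator map magnifies areas at a given latitude,
    relative to the equator (local scale is 1/cos(lat) in both directions)."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

# Print the inflation factor at a few latitudes
for lat in (0, 30, 60, 75, 80):
    print("latitude %2d deg: areas drawn x%.1f" % (lat, mercator_area_inflation(lat)))
```

Under that assumption, a region at 60° of latitude is already drawn four times larger than a region of the same real size at the equator, and the factor keeps growing toward the poles.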

Now, how big is the distortion? I wrote a small script to analyse the 2011 map and found that the average temperature increase as it appears on the picture is +0.81°C, which is about 60% higher than the 0.51°C announced on NASA’s website!

My methods

In case you are interested, I did the analysis with Python. I first took a high-definition screenshot of the 2011 frame, then I extracted a one-pixel-high rectangle of the color bar. Finally, I painted in black the zones I didn’t want to be considered (country borders, year, color bar).
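
The same kind of preparation could also be done in code rather than in an image editor. The sketch below is only an illustration on a synthetic image: the pixel coordinates of the color bar and of the masked regions are hypothetical and would have to be read off the real screenshot.

```python
import numpy as np

# Stand-in for a screenshot: a 100 x 200 RGB image filled with mid-grey.
# With a real file, this array would come from an image loader instead.
frame = np.full((100, 200, 3), 128, dtype=np.uint8)

# Hypothetical position of the color bar inside the screenshot
bar_row, bar_left, bar_right = 90, 50, 150
color_scale = frame[bar_row, bar_left:bar_right].copy()  # one-pixel-high strip

# Paint in black the regions that should be ignored
masked = frame.copy()
masked[85:, :] = 0      # bottom strip holding the color bar (hypothetical)
masked[:15, :40] = 0    # box where the year is printed (hypothetical)

# Pixels that are pure black will be skipped by the analysis
ignored = (masked == 0).all(axis=2)
print(color_scale.shape, ignored.sum(), "pixels ignored")
```

Black is a convenient marker here because it does not appear in the color bar, so it can be filtered out later by testing whether a pixel's color has zero norm.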

Then I used the following Python script (which requires SciPy and Pylab):

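from scipy.interpolate import NearestNDInterpolator
from pylab import *

# load the color bar and the map as RGB arrays
# (scipy.ndimage.imread was removed in SciPy 1.2; pylab's imread, which
# needs Pillow to read BMP files, is used here instead)
colorScale = imread('colorScale.bmp')[0, :, :3]
map2011 = imread('map2011.bmp')[:, :, :3]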

# create a function that will attribute a value to a color by looking
# at the nearest color in the colorbar

scaleValues = linspace(-2,2,len(colorScale)) # scale of the colorbar
colorInterpolator = NearestNDInterpolator(colorScale, scaleValues)
color2value = lambda color :  colorInterpolator([color])[0]

# compute the corresponding value and the norm of the colors of all
# points of the map

values = apply_along_axis(color2value, 2,map2011)
norms = apply_along_axis(norm, 2,map2011)

# compute the mean of the values on the non-black pixels

print(values[norms != 0].mean())
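
The plain mean above gives every pixel the same weight, which is what the eye does when looking at the map. As a rough cross-check, each row of pixels can instead be weighted by the real surface it represents. The sketch below is not part of the original analysis. It assumes a true Mercator projection whose rows are evenly spaced in Mercator y between +lat_max and -lat_max, where lat_max is a value you would have to read off the map (80° is only a placeholder), and the function name `area_weighted_mean` is hypothetical.

```python
import numpy as np

def area_weighted_mean(values, valid, lat_max_deg=80.0):
    """Mean of `values` over pixels where `valid` is True, weighting each
    row by the true surface area it covers on a Mercator map.

    Assumes rows are evenly spaced in Mercator y, from +lat_max_deg (top)
    to -lat_max_deg (bottom).
    """
    n_rows = values.shape[0]
    y_max = np.arcsinh(np.tan(np.radians(lat_max_deg)))  # Mercator y of top row
    y = np.linspace(y_max, -y_max, n_rows)
    lat = np.arctan(np.sinh(y))                          # latitude of each row
    # A pixel at latitude lat covers a true area proportional to cos(lat)**2
    weights = (np.cos(lat) ** 2)[:, None] * valid
    return (values * weights).sum() / weights.sum()

# Synthetic check: a field that warms faster toward the poles
rows, cols = 201, 50
y_max = np.arcsinh(np.tan(np.radians(80.0)))
lat = np.arctan(np.sinh(np.linspace(y_max, -y_max, rows)))
field = np.repeat((0.3 + 1.0 * np.sin(lat) ** 2)[:, None], cols, axis=1)
valid = np.ones_like(field, dtype=bool)

print("plain pixel mean   :", field.mean())
print("area-weighted mean :", area_weighted_mean(field, valid))
```

On the synthetic field, the plain pixel mean comes out higher than the area-weighted one, because the stretched high-latitude rows count for more than they should. With the arrays from the script above, the call would be `area_weighted_mean(values, norms != 0, lat_max_deg=...)`, with the latitude limit filled in from the actual map.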