If you want to turn a picture of a graph into usable data (very handy in science when you want to reuse a figure from an article without bothering the authors), here is a minimalistic interface written in Python with the following features:
- Data extraction from picture files or from a picture in the clipboard.
- Data extraction from rotated graphs or graphs shown with (moderate) perspective.
- Advanced interface (left-click to select a point, right-click to deselect).
- Stores the points’ coordinates in a Python variable and in the clipboard (for use in another application).
You can launch the interface with
points = pic2data()
This will either start a session using the picture from the clipboard or, if there is none, wait until the clipboard contains one. Alternatively, you can use a picture from a file with
points = pic2data('graph.jpeg')
You will then be asked to place the origin of the graph and to enter its coordinates (in case it is not (0,0)), then to place one reference point on each of the X and Y axes (i.e. a point on that axis whose coordinate you know). Then you can select/deselect as many points of the curve as you want, and exit with the middle button. The list of selected points [(x1,y1),(x2,y2),...] is returned.
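To make the role of the origin and the two reference points concrete, here is a minimal sketch of the pixel-to-graph conversion, assuming linear (not logarithmic) axes. It follows the same projection arithmetic as the listing at the end of this post, but the function name `pixels_to_data` and its argument names are mine, chosen for illustration, and it is written for Python 3 with NumPy only.

```python
import numpy as np


def pixels_to_data(points_px, origin_px, xref_px, yref_px,
                   origin_value, xref_value, yref_value):
    """Map pixel positions to graph coordinates (linear axes assumed).

    origin_value is the (x, y) pair entered for the origin; xref_value and
    yref_value are the known coordinates of the two reference points.
    """
    origin_px = np.asarray(origin_px, dtype=float)
    # Vectors from the origin to each reference point, in pixels
    ox = np.asarray(xref_px, dtype=float) - origin_px
    oy = np.asarray(yref_px, dtype=float) - origin_px
    # Unit vectors along each axis as drawn in the picture
    ux = ox / np.linalg.norm(ox)
    uy = oy / np.linalg.norm(oy)
    # Graph units per pixel along each axis
    x_scale = (xref_value - origin_value[0]) / np.linalg.norm(ox)
    y_scale = (yref_value - origin_value[1]) / np.linalg.norm(oy)
    # Project each point (relative to the origin) onto the two axes
    rel = np.asarray(points_px, dtype=float).reshape(-1, 2) - origin_px
    x = rel @ ux * x_scale + origin_value[0]
    y = rel @ uy * y_scale + origin_value[1]
    return list(zip(x.tolist(), y.tolist()))


if __name__ == "__main__":
    # Hypothetical clicks: origin at pixel (100, 400) with value (0, 0),
    # X reference at pixel (300, 400) with x = 10,
    # Y reference at pixel (100, 200) with y = 5.
    print(pixels_to_data([(200, 300)], (100, 400), (300, 400), (100, 200),
                         (0.0, 0.0), 10.0, 5.0))
```

Note that the Y reference sits at a smaller pixel row than the origin when pixel rows are counted from the top of the image; the projection takes care of that sign automatically.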
By default the program assumes that the graph is rectangular and parallel to the edges of the picture (which I will call straight in what follows). This is typically the case for a graph from a scientific article. As a consequence, the algorithm automatically moves the reference point you chose for the X axis to the same height as the origin, and moves the reference point for Y to exactly above the origin. However, if the graph in the picture is not straight, as in a photo, use the argument straight=False.
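The correction applied to a straight graph amounts to something like the following sketch. The helper name `snap_references` is hypothetical (the program does this inline), and points are assumed to be (x, y) pixel pairs.

```python
def snap_references(origin_px, xref_px, yref_px):
    """Align the two reference points with the origin of a straight graph.

    The X reference keeps its horizontal position but takes the origin's
    height; the Y reference keeps its height but takes the origin's
    horizontal position.
    """
    origin_x, origin_y = origin_px
    xref_snapped = (xref_px[0], origin_y)  # same height as the origin
    yref_snapped = (origin_x, yref_px[1])  # same column as the origin
    return xref_snapped, yref_snapped


if __name__ == "__main__":
    # Slightly imprecise clicks get aligned with the origin
    print(snap_references((100, 400), (302, 397), (98, 201)))
```

This makes small clicking errors on the reference points harmless, which is why it is only appropriate when the axes really are horizontal and vertical in the picture.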
As an example, let us take a photo with a graph, like this one.
As the graph is not straight we will use
points = pic2data('mozart.jpeg', straight = False)
Which gets you this:
After placing the points and getting their coordinates one can redraw the plot with
from pylab import *
figure()
x,y = zip(*points)
plot(x,y,'o')
show()
And voilà!
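The points come back in the order you clicked them, so if you want to use the curve numerically it can help to sort them by increasing x first. Here is a small sketch of that, assuming NumPy is available; the function names `sorted_columns` and `resample_curve` are hypothetical and not part of the program, and the linear interpolation is only one reasonable way to resample the curve.

```python
import numpy as np


def sorted_columns(points):
    """Return the x and y arrays of the points, sorted by increasing x."""
    data = np.asarray(points, dtype=float).reshape(-1, 2)
    order = np.argsort(data[:, 0], kind="stable")
    return data[order, 0], data[order, 1]


def resample_curve(points, x_new):
    """Linearly interpolate the extracted curve at the positions x_new."""
    x, y = sorted_columns(points)
    # np.interp expects the x values to be increasing, hence the sort
    return np.interp(x_new, x, y)


if __name__ == "__main__":
    # Hypothetical points, clicked out of order
    points = [(2.0, 4.0), (0.0, 0.0), (1.0, 1.0)]
    print(sorted_columns(points))
    print(resample_curve(points, [0.5, 1.5]))
```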
Here is the code. Happy curving!
# Note: this is Python 2 code (PyGTK, tkSimpleDialog, print statements)
from urlparse import urlparse

import pygtk
import gtk
import tkSimpleDialog
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np


def tellme(s):
    print s
    plt.title(s,fontsize=16)
    plt.draw()


def pic2data(source='clipboard',straight=True):
    """ GUI to get data from a XY graph image. Either provide the graph
        as a path to an image in 'source' or copy it to the clipboard.
    """

    ##### GET THE IMAGE

    clipboard = gtk.clipboard_get()

    if source=='clipboard':

        # Wait until the clipboard contains either an image
        # or the path (URI) of an image file
        print "Waiting for an image in the clipboard..."
        while not ( clipboard.wait_is_uris_available()
                    or clipboard.wait_is_image_available()):
            pass

        if clipboard.wait_is_uris_available(): # it's a path to a file !
            # wait_for_uris() returns a list: use the first URI
            source = clipboard.wait_for_uris()
            source = urlparse(source[0]).path
            return pic2data(source, straight)

        image = clipboard.wait_for_image().get_pixels_array()
        origin = 'upper'

    else: # source is a path to a file !

        image = mpimg.imread(source)
        origin = 'lower'

    ###### DISPLAY THE IMAGE

    plt.ion() # interactive mode !
    fig, ax = plt.subplots(1)
    imgplot = ax.imshow(image, origin=origin)
    fig.canvas.draw()
    plt.draw()

    ##### PROMPT THE AXES

    def promptPoint(text=None):
        if text is not None: tellme(text)
        # ginput returns a list of (x, y) tuples: keep the single point
        return np.array(plt.ginput(1,timeout=-1)[0])

    def askValue(text='',initialvalue=0.0):
        return tkSimpleDialog.askfloat(text, 'Value:',
                                       initialvalue=initialvalue)

    origin = promptPoint('Place the origin')
    origin_value = askValue('X origin',0),askValue('Y origin',0)

    Xref = promptPoint('Place the X reference')
    Xref_value = askValue('X reference',1.0)

    Yref = promptPoint('Place the Y reference')
    Yref_value = askValue('Y reference',1.0)

    if straight :
        # put Xref at the same height as the origin,
        # and Yref exactly above the origin
        Xref[1] = origin[1]
        Yref[0] = origin[0]

    ##### PROMPT THE POINTS

    selected_points = []

    tellme("Select your points !")
    print "Left-click or press 's' to select"
    print "Right-click or press 'del' to deselect"
    print "Middle-click or press 'Enter' to confirm"
    print "Note that the keyboard may not work."

    selected_points = plt.ginput(-1,timeout=-1)

    ##### RETURN THE POINTS COORDINATES

    #~ selected_points.sort() # sorts the points in increasing x order

    # compute the coordinates of the points in the user-defined system
    OXref = Xref - origin
    OYref = Yref - origin
    xScale = (Xref_value - origin_value[0]) / np.linalg.norm(OXref)
    yScale = (Yref_value - origin_value[1]) / np.linalg.norm(OYref)
    ux = OXref / np.linalg.norm(OXref)
    uy = OYref / np.linalg.norm(OYref)

    result = [(ux.dot(pt - origin) * xScale + origin_value[0],
               uy.dot(pt - origin) * yScale + origin_value[1])
               for pt in selected_points ]

    # copy the result to the clipboard
    clipboard.set_text('[' + '\n'.join([str(p) for p in result]) + ']')
    clipboard.store() # makes the data available to other applications

    plt.ioff()

    return result