Finding a subnetwork with a given topology in E. coli

A few weeks ago I posted about how easy it is, using Python’s NetworkX module and the RegulonDB database, to get and navigate through the genetic network of E. coli. I didn’t mention at the time that you can also find all the subnetworks of E. coli sharing a given topology with just a few lines of code:

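import networkx.algorithms.isomorphism as iso

def find_pattern(graph, pattern, sign_sensitive=True):
    """Return every subgraph of `graph` isomorphic to `pattern`, as node mappings."""
    if sign_sensitive:
        # Two edges match only if they carry the same 'sign' attribute
        edge_match = lambda e1, e2: e1['sign'] == e2['sign']
    else:
        edge_match = None
    matcher = iso.DiGraphMatcher(graph, pattern, edge_match=edge_match)
    # Each result is a dict mapping nodes of `graph` to nodes of `pattern`
    return list(matcher.subgraph_isomorphisms_iter())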
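
To get a feel for what the function returns before running it on the real network, here is a minimal sketch on a made-up four-gene network. The gene names x, y, z, w and their signs are purely hypothetical, and the function is restated (using NetworkX's subgraph_isomorphisms_iter) so that the snippet runs on its own:

```python
import networkx as nx
import networkx.algorithms.isomorphism as iso

def find_pattern(graph, pattern, sign_sensitive=True):
    edge_match = (lambda e1, e2: e1['sign'] == e2['sign']) if sign_sensitive else None
    matcher = iso.DiGraphMatcher(graph, pattern, edge_match=edge_match)
    return list(matcher.subgraph_isomorphisms_iter())

# Hypothetical toy network: x -> y -| z, x -> z, z -> w
toy = nx.DiGraph()
toy.add_edges_from([('x', 'y', {'sign': '+'}),
                    ('y', 'z', {'sign': '-'}),
                    ('x', 'z', {'sign': '+'}),
                    ('z', 'w', {'sign': '+'})])

# Pattern without signs: any triangle A -> I -> C with a direct edge A -> C
unsigned = nx.DiGraph()
unsigned.add_edges_from([('A', 'I'), ('I', 'C'), ('A', 'C')])
matches = find_pattern(toy, unsigned, sign_sensitive=False)

# Same triangle, but with the signs of the toy network (+, -, +)
signed = nx.DiGraph()
signed.add_edges_from([('A', 'I', {'sign': '+'}),
                       ('I', 'C', {'sign': '-'}),
                       ('A', 'C', {'sign': '+'})])
signed_matches = find_pattern(toy, signed, sign_sensitive=True)

# An all-activation triangle does not occur in the toy network
all_plus = nx.DiGraph()
all_plus.add_edges_from([('A', 'I', {'sign': '+'}),
                         ('I', 'C', {'sign': '+'}),
                         ('A', 'C', {'sign': '+'})])
all_plus_matches = find_pattern(toy, all_plus, sign_sensitive=True)

# Each match maps genes of the network to their role in the pattern
for match in matches:
    print(match)
```

The keys of each dictionary are nodes of the searched network and the values are the pattern's nodes, so a match tells you which gene plays which role.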

Feedforward loops in E. coli

As an example, let us have a look at the feedforward loops in E. coli. A feedforward loop is a gene acting on another gene both directly and through an intermediary gene, as in the following sketch, where the arrows can represent activations or repressions.

A feedforward loop

If you want to print all the feedforward loops in E. coli’s network (as represented by RegulonDB), try this:

import networkx as nx

# ECN is the E. coli network (a DiGraph built from RegulonDB) from the earlier post
feedforward = nx.DiGraph()
feedforward.add_edges_from([('A','I'),('I','C'),('A','C')])
ffwd_in_ecoli = find_pattern(ECN, feedforward, sign_sensitive=False)
for subgraph in ffwd_in_ecoli:
    print(subgraph)

This prints each feedforward loop with the respective roles of the different genes (A, I, or C). As you can see, there are 63 feedforward loops in RegulonDB’s network, which is a lot. Let us now get sign-specific:

import itertools as itt
import numpy as np
import matplotlib.pyplot as plt

labels = []
frequencies = []
for signs in itt.product(('+','-'), repeat=3):
    # Build the feedforward pattern for this combination of signs
    ffwd = nx.DiGraph()
    ffwd.add_edges_from([('A','I',{'sign':signs[0]}),
                         ('I','B',{'sign':signs[1]}),
                         ('A','B',{'sign':signs[2]})])
    frequency = len(find_pattern(ECN, ffwd, True))
    frequencies.append(frequency)
    labels.append(" ".join(signs))

# Convert the raw counts into relative frequencies
frequencies = np.array(frequencies, dtype=float) / sum(frequencies)
fig, ax = plt.subplots(1, figsize=(5,5))
ax.set_title('feedforward loops in E. coli, \n by signs of A->I,I->B,A->B')
ax.pie(frequencies, labels=labels, autopct='%1.1f%%', shadow=True)
plt.show()

Relative frequencies of the different feedforward loops in E. coli

I’m not the first one to do such research, but that won’t prevent me from commenting 🙂 The most frequent pattern can be seen as a double repression: gene A represses gene B and, as an additional punishment, also represses the activator I of B. Such feedforward loops are called coherent, as the two actions of gene A have the same effect on gene B. The second most frequent feedforward loop is incoherent: gene A activates gene B while also activating a repressor of B, which seems a little foolish, but can be a way of creating bumps of activity (B is woken up just a few minutes before its repressor puts it down again).
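
The coherent/incoherent distinction can be checked mechanically: a loop is coherent when the sign of the indirect path (the product of the signs of A->I and I->B) equals the sign of the direct edge A->B. Here is a small sketch of that rule; the helper names sign_value and is_coherent are mine, not part of NetworkX or RegulonDB:

```python
import itertools as itt

def sign_value(sign):
    """Map '+' (activation) to 1 and '-' (repression) to -1."""
    return 1 if sign == '+' else -1

def is_coherent(a_to_i, i_to_b, a_to_b):
    """True if the indirect path A->I->B has the same net effect as A->B."""
    indirect = sign_value(a_to_i) * sign_value(i_to_b)
    return indirect == sign_value(a_to_b)

# Classify the eight sign combinations, in the same order as the pie chart labels
classification = {}
for signs in itt.product(('+', '-'), repeat=3):
    kind = 'coherent' if is_coherent(*signs) else 'incoherent'
    classification[" ".join(signs)] = kind
    print(" ".join(signs), kind)
```

With the label convention used above (A->I, I->B, A->B), the double repression corresponds to "- + -" and comes out coherent, while "+ - +" (A activates both B and a repressor of B) comes out incoherent.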

A word of caution

Note that the algorithm didn’t find any feedforward loop with activations only (+++), while some have been reported in the literature. Likewise, I couldn’t find any mutual interaction between two genes (A has an action on B and B has an action on A). So if you are going to use RegulonDB’s network of interactions for research purposes, always keep in mind:

  • The database is INCOMPLETE, meaning that you can use it (carefully) to deduce the existence of something, but not to prove its absence. In particular, many gene interactions that occur through the intermediary of metabolites are omitted! For instance, the gene cyaA produces the cyclic AMP that is necessary to activate the genes regulated by crp. However, only crp appears as a regulator of these genes.
  • In the context of synthetic biology, the few genes added to the bacterium are often supposed to work independently of the rest of the cell (some biologists say that they are orthogonal to the rest of the system). Thus, if you design a feedforward loop, it will be a feedforward loop and nothing else. But keep in mind that when the pattern-matching algorithm finds a feedforward loop in E. coli’s network, each gene of the triangle could also be under the influence of many others, and the behavior of the circuit may not be what you would infer by looking only at these three genes.
  • The algorithm considers isomorphism between subnetworks, which implies that it won’t mind if the subnetwork is under the influence of external genes, but it will mind if the subnetwork has auto-regulations that do not appear in the provided pattern. For instance, fis regulates crp and vice versa, but this won’t be reported if you simply look for the pattern A ↔ B because, in addition, fis and crp regulate themselves!
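
The last point is easy to reproduce on a toy graph. The sketch below builds a two-gene network that mirrors the fis/crp situation just described (mutual regulation plus auto-regulation; this is a hand-made graph, not data pulled from RegulonDB) and looks for the pattern A ↔ B, first as is, then on a copy of the network whose self-loops have been removed. Stripping the self-loops is only one possible workaround, and count_matches is a helper of mine:

```python
import networkx as nx
import networkx.algorithms.isomorphism as iso

# Toy network: fis and crp regulate each other and themselves
net = nx.DiGraph()
net.add_edges_from([('fis', 'crp'), ('crp', 'fis'),
                    ('fis', 'fis'), ('crp', 'crp')])

# Pattern: plain mutual regulation, without auto-regulation
mutual = nx.DiGraph()
mutual.add_edges_from([('A', 'B'), ('B', 'A')])

def count_matches(graph, pattern):
    """Number of subgraph isomorphisms between `graph` and `pattern`."""
    matcher = iso.DiGraphMatcher(graph, pattern)
    return len(list(matcher.subgraph_isomorphisms_iter()))

# The self-loops are absent from the pattern, so nothing matches
with_loops = count_matches(net, mutual)

# Work on a copy with the self-loops removed
stripped = net.copy()
stripped.remove_edges_from(list(nx.selfloop_edges(stripped)))
without_loops = count_matches(stripped, mutual)

print(with_loops, without_loops)
```

The first count is zero and the second is two, because the symmetric pattern matches once for each assignment of the roles A and B. The other option is to add the self-loops to the pattern itself, which keeps the network untouched but requires knowing in advance which genes auto-regulate.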