The goal of this exercise is to get familiar with two major metabolic databases, their data structure, level of annotation and querying interfaces.
The goal of this exercise is to test the accuracy of k-shortest (or k-lightest) path finding algorithms to detect linear metabolic pathways.
Open a connection to the EcoCyc database (http://www.ecocyc.org).
Choose a pathway for testing. Since this is an exercise on linear path finding, the selected pathway should include a sufficient number of enzymes connected in a linear way (if you are short of inspiration, you can take lysine biosynthesis).
In the chosen pathway, identify the main substrate of the first reaction, and the main product of the last reaction.
In a separate window, open a connection to the KEGG metabolic pathway database.
Open the reference map corresponding to the pathway you chose above.
Color this reference map with Escherichia coli K12 enzymes.
In a third window, open a connection to the Network Analysis Tools (http://neat.rsat.eu/).
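To make the idea of k-lightest path finding concrete before using the web tool, here is a minimal sketch in Python. It is not the NeAT implementation: it uses a simple best-first enumeration of simple paths (practical only on small graphs), and the toy network, the node names (S, A, B, P, hub) and the edge weights are hypothetical. The weights are assumed to penalize a highly connected compound, so that the path through the hub comes out heavier than the linear route.

```python
import heapq


def k_lightest_paths(graph, source, target, k):
    """Return up to k simple paths from source to target, lightest first.

    graph: dict mapping node -> dict of neighbour -> non-negative weight.
    Each result is a (total_weight, list_of_nodes) tuple.
    """
    # Each heap entry is (cumulative weight, path so far).
    heap = [(0, [source])]
    found = []
    while heap and len(found) < k:
        weight, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            # Paths leave the heap in non-decreasing weight order.
            found.append((weight, path))
            continue
        for neighbour, edge_weight in graph.get(node, {}).items():
            if neighbour not in path:  # keep paths simple (no repeated node)
                heapq.heappush(heap, (weight + edge_weight, path + [neighbour]))
    return found


# Hypothetical toy network: S is the main substrate, P the main product.
# The "hub" node stands for a highly connected compound with heavy edges.
toy_network = {
    "S": {"A": 1, "hub": 5},
    "A": {"B": 1},
    "B": {"P": 1},
    "hub": {"P": 5},
}

paths = k_lightest_paths(toy_network, "S", "P", 2)
for weight, path in paths:
    print(weight, " -> ".join(path))
```

In this toy case the linear route S, A, B, P is returned first, and the shortcut through the hub is ranked second because of its weight, even though it contains fewer steps.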
By applying a very rudimentary distance-based method, we predicted operons in the whole genome of Escherichia coli K12.
It is well known that operons usually group genes involved in the same biological process. Some of the predicted operons are therefore likely to contain sets of enzymes involved in the same metabolic pathway.
The goal of this exercise is to use the metabolic subgraph extraction algorithm developed by Karoline Faust to predict metabolic pathways from predicted operons, and to compare the predicted pathways with those annotated in the KEGG and EcoCyc databases.
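The distance-based operon prediction mentioned above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the RSAT code: consecutive genes are grouped into the same operon when they lie on the same strand and the intergenic distance does not exceed a threshold. The gene names, the coordinates and the 55 bp threshold are hypothetical values chosen for the example.

```python
def infer_operons(genes, max_distance=55):
    """Group consecutive same-strand genes separated by at most max_distance bp.

    genes: list of (name, start, end, strand) tuples sorted by start position.
    Returns a list of operons, each given as a list of gene names.
    """
    operons = []
    current = []
    for gene in genes:
        if current:
            previous = current[-1]
            # Number of bases between the end of the previous gene
            # and the start of this one.
            gap = gene[1] - previous[2] - 1
            if gene[3] == previous[3] and gap <= max_distance:
                current.append(gene)
                continue
            # Strand change or large gap: close the current operon.
            operons.append([g[0] for g in current])
        current = [gene]
    if current:
        operons.append([g[0] for g in current])
    return operons


# Hypothetical genes (name, start, end, strand), sorted by position.
toy_genes = [
    ("g1", 100, 1000, "+"),
    ("g2", 1020, 2000, "+"),
    ("g3", 2030, 2900, "+"),
    ("g4", 3500, 4200, "+"),
    ("g5", 4210, 5000, "-"),
]

predicted_operons = infer_operons(toy_genes)
print(predicted_operons)
```

Here g1, g2 and g3 are grouped because their gaps are short, g4 starts a new operon because of the large gap, and g5 is separated because it lies on the opposite strand.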
Open a connection to the Regulatory Sequence Analysis Tools (RSAT).
Open the tool infer operons, select
Select an operon containing at least 5 genes, and for which at least some gene names correspond to enzymes (you can check this with the RSAT tool gene information).
Open a connection to the Network Analysis Tools and, in the tool list on the left, click on Pathway extraction and select the following parameters
Analyze and interpret the result.
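One simple way to quantify the comparison between a predicted pathway and an annotated one is to treat both as sets of reactions and count what they share. The Python sketch below is one possible scoring scheme, not the evaluation procedure of the tool itself; the reaction identifiers (R1 to R6) are hypothetical placeholders.

```python
def compare_pathways(predicted, annotated):
    """Compare two pathways given as collections of reaction identifiers.

    Returns the shared reactions, the sensitivity (fraction of annotated
    reactions that were recovered) and the positive predictive value
    (fraction of predicted reactions that are annotated).
    """
    predicted = set(predicted)
    annotated = set(annotated)
    shared = predicted & annotated
    sensitivity = len(shared) / len(annotated) if annotated else 0.0
    ppv = len(shared) / len(predicted) if predicted else 0.0
    return {"shared": shared, "sensitivity": sensitivity, "ppv": ppv}


# Hypothetical reaction identifiers, for illustration only.
predicted_pathway = ["R1", "R2", "R3", "R6"]
annotated_pathway = ["R1", "R2", "R3", "R4", "R5"]

scores = compare_pathways(predicted_pathway, annotated_pathway)
print(sorted(scores["shared"]), scores["sensitivity"], scores["ppv"])
```

A high sensitivity with a low positive predictive value would suggest that the extracted subgraph covers the annotated pathway but also includes extra reactions, while the opposite pattern would indicate a partial but specific prediction.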