Practical - Graph-based analysis of metabolism

Introduction

Metabolic pathway databases

Exercise

The goal of this exercise is to get familiar with two major metabolic databases, their data structure, level of annotation and querying interfaces.

  1. Open a connection to the EcoCyc database (http://www.ecocyc.org).
  2. In a separate window, open a connection to the KEGG metabolic pathway database.
  3. Compare the annotations in EcoCyc and KEGG for the following pathways in Escherichia coli K12:
    • Lysine biosynthesis
    • Proline utilization
    • Proline biosynthesis
  4. In KEGG, compare the lysine biosynthesis pathways annotated for Escherichia coli K12 and for Saccharomyces cerevisiae.

Metabolic path finding

Exercise

The goal of this exercise is to test the accuracy of k-shortest (or k-lightest) path finding algorithms to detect linear metabolic pathways.
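
To make the idea of a "lightest" path concrete, here is a minimal Python sketch on a toy network. Everything in it is hypothetical: the compound names (A, B, C, D, H, X1 to X4), the reaction names (R1 to R9), and the weighting rule, which assumes that each compound weighs as much as its number of connections and each reaction weighs 1. The weighting actually applied by the path finding tool may differ. The sketch only illustrates why weighting compounds steers the search away from highly connected compounds.

```python
import heapq
from collections import defaultdict

# Hypothetical toy network: each reaction lists the compounds it connects.
# H plays the role of a highly connected "hub" compound.
REACTIONS = {
    "R1": ["A", "B"], "R2": ["B", "C"], "R3": ["C", "D"],
    "R4": ["A", "H"], "R5": ["H", "D"],
    "R6": ["H", "X1"], "R7": ["H", "X2"], "R8": ["H", "X3"], "R9": ["H", "X4"],
}

def build_graph(reactions):
    """Build an undirected bipartite graph linking reactions and compounds."""
    graph = defaultdict(set)
    for reaction, compounds in reactions.items():
        for compound in compounds:
            graph[reaction].add(compound)
            graph[compound].add(reaction)
    return graph

def lightest_path(graph, node_weight, source, target):
    """Dijkstra search where the cost of a path is the sum of its node weights."""
    best = {source: node_weight[source]}
    previous = {}
    queue = [(node_weight[source], source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best[node]:
            continue  # outdated queue entry
        for neighbour in graph[node]:
            new_cost = cost + node_weight[neighbour]
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]

graph = build_graph(REACTIONS)

# Assumed weighting: compounds weigh their degree, reactions weigh 1.
degree_weight = {n: (1 if n in REACTIONS else len(graph[n])) for n in graph}
# Unweighted search for comparison: every node weighs 1.
unit_weight = {n: 1 for n in graph}

weighted_path, weighted_cost = lightest_path(graph, degree_weight, "A", "D")
unit_path, unit_cost = lightest_path(graph, unit_weight, "A", "D")
print("weighted:", weighted_path, weighted_cost)
print("unweighted:", unit_path, unit_cost)
```

In this toy network the unweighted search takes the shortcut through the hub compound H, whereas the weighted search follows the longer linear chain through B and C.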

  1. Open a connection to the EcoCyc database (http://www.ecocyc.org).

  2. Choose a pathway for testing. Since this is an exercise on linear path finding, the selected pathway should include a sufficient number of enzymes connected in a linear way (if you are short of inspiration, you can take Lysine biosynthesis).

  3. In the chosen pathway, identify the main substrate of the first reaction, and the main product of the last reaction.

  4. In a separate window, open a connection to the KEGG metabolic pathway database.

  5. Open the reference map corresponding to the pathway you chose above.

  6. Color this reference map with Escherichia coli K12 enzymes.

  7. In a third window, open a connection to the Network Analysis Tools (http://neat.rsat.eu/).

  8. In the menu on the left frame, open the Metabolic path finding tool.
  9. Select Reaction network as Graph type.
  10. Make sure that the weighting scheme includes the weight on Compounds.
  11. Find the lightest path between the initial substrate and the last product of your pathway.
  12. Compare the results of path finding with the annotated pathway. Count the numbers of:
    • True positives (TP): reactions in the path finding result that belong to the annotated pathway.
    • False positives (FP): reactions in the path finding result that do not belong to the annotated pathway.
    • False negatives (FN): reactions of the annotated pathway that do not appear in the path finding result.
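
The three counts defined above can be obtained with set operations. The Python sketch below is only an illustration: the reaction identifiers are placeholders, not real database entries, and the two derived ratios (sensitivity and positive predictive value) are an optional addition that the exercise itself does not ask for.

```python
def compare_pathways(predicted, annotated):
    """Count TP, FP and FN between a predicted path and an annotated pathway."""
    predicted, annotated = set(predicted), set(annotated)
    tp = len(predicted & annotated)   # found and annotated
    fp = len(predicted - annotated)   # found but not annotated
    fn = len(annotated - predicted)   # annotated but not found
    return {
        "TP": tp,
        "FP": fp,
        "FN": fn,
        # Optional derived ratios; None when the denominator is zero.
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "ppv": tp / (tp + fp) if tp + fp else None,
    }

# Placeholder reaction identifiers (hypothetical).
annotated_pathway = ["rxn_1", "rxn_2", "rxn_3", "rxn_4"]
path_finding_result = ["rxn_1", "rxn_2", "rxn_x", "rxn_4"]

counts = compare_pathways(path_finding_result, annotated_pathway)
print(counts)
```

Using sets means that a reaction listed twice is counted once, which matches the definitions given in terms of reactions that do or do not belong to each list.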

Metabolic subgraph extraction

By applying a very rudimentary distance-based method, we predicted operons in the whole genome of Escherichia coli K12.
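
As an illustration of what such a distance-based method can look like, here is a minimal Python sketch. It rests on assumptions that are not stated in this practical: adjacent genes are placed in the same operon when they lie on the same strand and are separated by at most `max_distance` base pairs, and the threshold value (55 bp here) is an arbitrary choice for the example. The gene names and coordinates are invented.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"

def infer_operons(genes, max_distance=55):
    """Group adjacent same-strand genes separated by at most max_distance bp."""
    operons = []
    for gene in sorted(genes, key=lambda g: g.start):
        previous = operons[-1][-1] if operons else None
        if (previous is not None
                and gene.strand == previous.strand
                and gene.start - previous.end - 1 <= max_distance):
            operons[-1].append(gene)   # extend the current operon
        else:
            operons.append([gene])     # start a new operon
    return operons

# Invented genes for illustration only.
genes = [
    Gene("g1", 100, 400, "+"),
    Gene("g2", 420, 800, "+"),    # 19 bp after g1, same strand
    Gene("g3", 1000, 1500, "+"),  # 199 bp after g2
    Gene("g4", 1510, 1900, "-"),  # strand change
    Gene("g5", 1920, 2300, "-"),  # 19 bp after g4, same strand
]

predicted = [[g.name for g in operon] for operon in infer_operons(genes)]
print(predicted)
```

With these invented coordinates the genes fall into three groups: g1 with g2, g3 alone, and g4 with g5.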

It is well known that operons usually group genes involved in the same biological process. Some of the predicted operons are therefore likely to contain sets of enzymes involved in the same metabolic pathway.

The goal of this exercise is to use the metabolic subgraph extraction algorithm developed by Karoline Faust to predict metabolic pathways from predicted operons, and to compare the predicted pathways with those annotated in the KEGG and EcoCyc databases.

  1. Open a connection to the Regulatory Sequence Analysis Tools (RSAT).

  2. Open the tool infer operons and select:

    • Organism Escherichia coli K 12 substr MG1655 uid57778.
    • All genes
    • Click GO
    • Sort the result table by gene number (click on the corresponding column in the header of the table).

  3. Select an operon containing at least 5 genes, and for which at least some gene names correspond to enzymes (you can check this with the RSAT tool gene information).

  4. Open a connection to the Network Analysis Tools and, in the tool list on the left, click on Pathway extraction and select the following parameters:

    • Preloaded network: MetaCyc network version 14.1
    • Enter the genes of your operon in the box Seed nodes.
    • Identifier type: Genes/Enzymes
    • Metacyc network Escherichia coli K-12 substr MG1655
    • Click GO

  5. Analyze and interpret the result.