Genomics and proteomics provide many types of information that can help us understand the function of genes and their products, by taking into account not only their own sequence but also their genomic context and their phylogenetic conservation.
Starting from a protein of interest (e.g. an enzyme from the baker’s yeast Saccharomyces cerevisiae), we will combine a variety of bioinformatics tools to understand its function and evolution.
| Name | URL | Description |
|---|---|---|
| UniProt | http://uniprot.org | A database of protein sequences and functional information |
| STRING | http://string-db.org/ | A database of protein interactions based on 6 types of experimental or bioinformatics evidence |
| MetaCyc | http://metacyc.org | A database of metabolic pathways |
| RSAT | http://rsat.eu/ | Regulatory Sequence Analysis Tools |
| NeAT | http://neat.rsat.eu/ | Network Analysis Tools |
What is the function of the gene MET17? Describe in 2 sentences its molecular activity (what its product does) and the context in which this activity takes place (surrounding reactions, substrates, products, …).
Answer briefly (~ one sentence per question).
Keep the STRING result page open: you might need to come back to it in later exercises.
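For readers who prefer scripting to the web interface, the list of functional partners can in principle also be requested programmatically. The sketch below only builds a request URL and does not contact the server. The endpoint layout (`/api/<format>/interaction_partners`), the parameter names (`identifiers`, `species`, `limit`) and the use of NCBI taxon ID 4932 for *Saccharomyces cerevisiae* are assumptions that should be checked against the STRING API documentation. The function name `string_partners_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base address of the STRING REST API (check the STRING help pages).
STRING_API = "https://string-db.org/api"


def string_partners_url(identifier, species, limit=10, output_format="tsv"):
    """Build a URL asking STRING for the interaction partners of one gene.

    identifier: gene or protein name (e.g. "MET17")
    species: NCBI taxonomy identifier (assumed 4932 for S. cerevisiae)
    limit: maximal number of partners to return
    """
    params = {"identifiers": identifier, "species": species, "limit": limit}
    return f"{STRING_API}/{output_format}/interaction_partners?{urlencode(params)}"


url = string_partners_url("MET17", 4932)
print(url)
```

The printed URL can be pasted into a browser or passed to a download tool. The result page obtained via the web interface remains the reference for the exercises.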
The STRING database allowed us to detect a set of genes functionally linked to our query gene, which includes some of its close metabolic neighbours (genes involved in the same pathway) plus some additional genes. To understand the links between these genes, we can map them onto metabolic maps from the KEGG database.
The STRING database includes functional interactions inferred from the fact that two genes show correlated expression profiles in transcriptome analyses (co-expression network).
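To make the notion of correlated expression profiles concrete, here is a minimal sketch that computes the Pearson correlation between pairs of profiles and keeps the pairs above a threshold as co-expression links. The gene names, the expression values and the threshold of 0.8 are invented for illustration. STRING's actual co-expression scoring is more elaborate and is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.8):
    """Return (gene1, gene2, r) for each gene pair with r >= threshold."""
    genes = sorted(profiles)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(profiles[g1], profiles[g2])
            if r >= threshold:
                edges.append((g1, g2, round(r, 3)))
    return edges


# Toy expression profiles (made-up values, one value per condition).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
edges = coexpression_edges(profiles)
print(edges)
```

In this toy example only geneA and geneB are linked, since geneC varies in the opposite direction. A real analysis would use many more conditions and an appropriate treatment of multiple testing.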
In the previous step, we attempted to discover motifs over-represented in the promoters of a set of co-expressed genes. The approach relied on the idea that such motifs are over-represented in the whole set of promoters, because the co-expression of these genes may result from their co-regulation by a common transcription factor.
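The idea of over-representation can be sketched as follows: count each oligonucleotide (k-mer) in the promoter set, estimate how often it would be expected from a background set of sequences, and rank k-mers by the observed/expected ratio. This is a simplified illustration, not the statistics implemented in RSAT. The sequences are invented, only one strand is scanned, and the pseudo-count is an arbitrary choice.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count all overlapping k-mers over a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresentation(promoters, background, k=6, pseudo=1.0):
    """Rank k-mers by observed/expected ratio (highest first)."""
    observed = count_kmers(promoters, k)
    bg_counts = count_kmers(background, k)
    obs_total = sum(observed.values())
    bg_total = sum(bg_counts.values())
    scores = {}
    for kmer, n_obs in observed.items():
        # Background frequency, smoothed with a pseudo-count so that
        # k-mers absent from the background do not give a zero expectation.
        bg_freq = (bg_counts.get(kmer, 0) + pseudo) / (bg_total + pseudo * 4 ** k)
        expected = bg_freq * obs_total
        scores[kmer] = n_obs / expected
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy promoters sharing a planted hexanucleotide (made-up sequences).
promoters = ["TTCACGTGAA", "GGCACGTGTT", "ATCACGTGCC"]
background = ["ATGCATGCATGCATGCATGC", "TTGGAACCTTGGAACCTTGG"]
ranked = overrepresentation(promoters, background, k=6)
print(ranked[0][0])
```

The shared hexanucleotide comes out on top because it occurs in all three toy promoters. Real motif discovery tools additionally assess the statistical significance of each count, which a simple ratio does not provide.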
We now have access to several hundred fungal genomes and tens of thousands of bacterial genomes, which opens the prospect of applying a much more powerful approach to discover cis-regulatory motifs, based on their conservation across the promoters of orthologous genes (phylogenetic footprints).
We will use the RSAT tool *footprint-discovery* to discover conserved motifs separately for each of the functional neighbours of the query gene (irrespective of their co-expression status).
Warning: The footprint discovery approach takes more time than the other RSAT analyses, because it requires collecting the promoters of orthologues in many species. This may be a good moment to take a coffee break, or to start the next steps while the server proceeds.
Contact: Jacques.van-Helden@univ-amu.fr