ESIL :: 1st year :: "Bioinformatics" module :: academic year 2012/2013 :: Jacques van Helden

Session 4: Phylogenetic inference


Introduction


Resources

Name                    Category                Description
Taxonomy browser        Taxonomy                http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi
Phylogeny.fr            Phylogenetic inference  Robust Phylogenetic Analysis For The Non-Specialist.
                                                http://phylogeny.lirmm.fr/
                                                http://www.phylogeny.fr/
Ensembl genome browser  Genome browser          Includes a dynamic tree representation for each gene family.
                                                http://www.ensembl.org/


Inferring the phylogeny of opsins in Mammals

Context

Goals of this exercise

Tips

  1. A protein may be missing from a database for many reasons (genome not completely sequenced, sequencing gaps, incomplete or erroneous annotations). Its absence from the database is thus generally not sufficient to conclude that it is absent from the organism considered.
  2. After you have run a query in UniProt, the top of the result page displays an option that lets you restrict the results to complete proteomes (i.e. full sets of protein sequences obtained by translating all the coding genes identified in completely sequenced genomes).
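
Once a UniProt result has been downloaded in FASTA format, it can help to check which species are actually represented before choosing a subset. The following is a minimal sketch, assuming the headers follow the usual UniProt layout in which the organism name comes after `OS=`. The records, accessions and gene names in `EXAMPLE` are toy placeholders, not real UniProt entries.

```python
import re
from collections import defaultdict

# Organism name in a UniProt FASTA header: the text after "OS=" up to the
# next two-letter field such as " OX=" or " GN=" (or the end of the line).
OS_FIELD = re.compile(r"OS=(.+?)(?= [A-Z]{2}=|$)")


def read_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def organism_of(header):
    """Extract the organism name from a header, or 'unknown' if there is no OS= field."""
    match = OS_FIELD.search(header)
    return match.group(1) if match else "unknown"


def group_by_species(records):
    """Group (header, sequence) pairs by organism name."""
    groups = defaultdict(list)
    for header, sequence in records:
        groups[organism_of(header)].append((header, sequence))
    return dict(groups)


# Toy records (placeholder accessions and sequences, for illustration only).
EXAMPLE = """\
>sp|X00001|TOYA_HUMAN Toy opsin A OS=Homo sapiens OX=9606 GN=TOYA PE=1 SV=1
MAQQWSLQRL
AGRHPQDSYE
>sp|X00002|TOYB_HUMAN Toy opsin B OS=Homo sapiens OX=9606 GN=TOYB PE=1 SV=1
MRKMSEEEFY
>sp|X00003|TOYA_MOUSE Toy opsin A OS=Mus musculus OX=10090 GN=Toya PE=1 SV=1
MAQRLTGEQT
"""

records = read_fasta(EXAMPLE)
for species, members in sorted(group_by_species(records).items()):
    print(species, len(members))
```

A species that does not show up in such a listing is not necessarily devoid of opsins, for the reasons given in the first tip.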

Questions

  1. In UniProt, find all the opsins of mammals whose genome has been completely sequenced.
  2. Select a subset of 5 species, and extract their sequences in FASTA format.
  3. Submit the sequences to the Phylogeny.fr Web server.
  4. How do you interpret the result? Can you infer an evolutionary scenario from the resulting tree? How does it compare with the conclusions drawn from the practical on pairwise sequence comparisons?
  5. Evaluate the robustness of the inferred tree by inspecting the bootstrap values. Determine which branches seem robust and which ones are more questionable.
  6. The "One click" button is very convenient, but it does not guarantee the best choice of options for your particular problem. Explore the options for the successive steps of the inference (alignment, scoring, tree building) by redoing the analysis with the "A la carte" form. Try to identify the parameters that give the most reliable tree, in terms of both bootstrap values and biological consistency.
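
The inspection of branch support in question 5 can also be done programmatically on the tree exported in Newick format. The sketch below assumes that support values are stored as labels of the internal nodes (a common convention, but not the only one) and that they are expressed as percentages. The tree `TOY_TREE`, its leaf names, and the threshold of 70 are illustrative assumptions, not results of the exercise. If the server reports support on a 0 to 1 scale, the threshold has to be adapted accordingly.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts with 'label', 'length', 'children'."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # skip "("
            children.append(parse_node())
            while pos < len(text) and text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses in Newick string")
            pos += 1  # skip ")"
        # Node label: leaf name, or support value for an internal node.
        start = pos
        while pos < len(text) and text[pos] not in ":,()":
            pos += 1
        label = text[start:pos]
        # Optional branch length after ":".
        length = None
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return {"label": label, "length": length, "children": children}

    return parse_node()


def leaves(node):
    """Return the leaf names found below a node."""
    if not node["children"]:
        return [node["label"]]
    return [name for child in node["children"] for name in leaves(child)]


def branch_supports(node):
    """Return (support, sorted leaf names) for each internal node with a numeric label."""
    result = []
    for child in node["children"]:
        result.extend(branch_supports(child))
    if node["children"] and node["label"]:
        try:
            result.append((float(node["label"]), sorted(leaves(node))))
        except ValueError:
            pass  # the internal label is a name, not a support value
    return result


# Hypothetical tree, for illustration only.
TOY_TREE = ("((human_green:0.02,human_red:0.03)98:0.10,"
            "(mouse_green:0.08,cow_green:0.07)45:0.05,human_blue:0.60);")
THRESHOLD = 70  # assumed cut-off (rule of thumb), in percent

supports = branch_supports(parse_newick(TOY_TREE))
for support, clade in supports:
    status = "robust" if support >= THRESHOLD else "questionable"
    print(f"{support:5.1f}  {status:12s}  {', '.join(clade)}")
```

Listing the leaves under each supported branch makes it easier to compare the same grouping across the trees obtained with different "A la carte" settings.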

Exercise 2: getting the gene family tree at Ensembl

Context

Goals of this exercise

The Ensembl database allows you to browse a pre-computed tree for each gene family. The phylogenies were inferred by reconciling the molecular tree with the species tree.
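
To give an intuition of what reconciliation provides, the sketch below labels each internal node of a gene tree as a speciation or a duplication with the simple "species overlap" rule: a node is called a duplication when the same species appears on both sides of it. This is a simplified heuristic chosen for illustration, not the procedure used by Ensembl. The tree, the gene names (`A1`, `A2`, `A`) and the `gene_species` leaf naming are hypothetical.

```python
def species_of(leaf):
    """Leaves are named 'gene_species'; return the species part."""
    return leaf.rsplit("_", 1)[1]


def classify(tree, events):
    """Label internal nodes of a binary gene tree given as nested 2-tuples.

    Returns (set of species, list of leaves) below `tree`, and appends one
    (event, leaves) pair to `events` for each internal node, children first.
    """
    if isinstance(tree, str):
        return {species_of(tree)}, [tree]
    left_species, left_leaves = classify(tree[0], events)
    right_species, right_leaves = classify(tree[1], events)
    # Species overlap rule: shared species on both sides imply a duplication.
    event = "duplication" if left_species & right_species else "speciation"
    all_leaves = left_leaves + right_leaves
    events.append((event, all_leaves))
    return left_species | right_species, all_leaves


# Hypothetical gene tree, for illustration only.
TOY_GENE_TREE = (
    (("A1_human", "A1_macaque"), ("A2_human", "A2_macaque")),
    "A_mouse",
)

events = []
classify(TOY_GENE_TREE, events)
for event, members in events:
    print(f"{event:12s} {', '.join(members)}")
```

In this toy tree the node joining the `A1` and `A2` clades is labelled as a duplication, because human and macaque occur on both sides, whereas the root separates mouse from the other species and is labelled as a speciation.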

Questions

  1. In the Ensembl genome browser, identify the gene coding for the human green-sensitive opsin, and display the gene tree image. Try to infer the evolutionary scenario for the appearance of trichromatic vision.
  2. In the UCSC genome browser, analyze the genomic region containing the genes coding for the human red- and green-sensitive opsins. Try to identify the homologous genomic region in primate and non-primate mammals. In which species do you observe two or more opsin-coding genes in tandem?
  3. In the ECR browser, analyze the region encompassing the genes OPN1MW, OPN1MW and OPN1LW. Does the browser display orthologous regions for non-primate mammals (for example mouse)?
  4. Open a new connection to the ECR browser, and display the alignments of mammalian genomes against the mouse region containing the gene Opn1mw. Do you observe several opsin-coding genes?
  5. Summarize your observations.
Jacques van Helden (TAGC, Aix-Marseille Université).