Pattern discovery in yeast downstream sequences

Supplementary material for the article :

Data

Sequences

Downstream sequences Positions are given relative to the stop codon: position +1 is the first nucleotide after the coding sequence, and negative coordinates fall within the coding sequence.
Intergenic segments All yeast intergenic sequences, divided into three subsets:
  • tandem: intergenic sequences between two genes transcribed in the same direction
  • divergent: intergenic sequences between the start codons of two genes transcribed in opposite, divergent directions
  • convergent: intergenic sequences between the stop codons of two genes transcribed in opposite, convergent directions
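The three subsets above follow from the strands of the two genes flanking each intergenic segment. A minimal sketch in Python, assuming the genes are given in chromosome order with strands encoded as "+" and "-" (the function name and the encoding are illustrative, not taken from the article):

```python
def classify_intergenic(left_strand: str, right_strand: str) -> str:
    """Classify the segment between two neighbouring genes.

    left_strand / right_strand: strand of the gene on the left and on the
    right of the segment, in chromosome order ("+" or "-"; assumed encoding).
    """
    valid = {"+", "-"}
    if left_strand not in valid or right_strand not in valid:
        raise ValueError("strands must be '+' or '-'")
    if left_strand == right_strand:
        # Same direction: one start codon and one stop codon flank the segment.
        return "tandem"
    if left_strand == "-" and right_strand == "+":
        # Both genes are transcribed away from the segment: two start codons.
        return "divergent"
    # left "+", right "-": both genes are transcribed towards the segment,
    # so it lies between two stop codons.
    return "convergent"
```

Under this scheme, only tandem and convergent segments contain sequence downstream of a stop codon.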
Sequences around cleavage sites Cleavage sites were located by mapping EST segments onto the genomic sequences. Sequences kindly provided by Joel Graber; see Graber's paper for details.
    Graber JH, Cantor CR, Mohr SC, Smith TF. Genomic detection of new yeast pre-mRNA 3'-end-processing signals. Nucleic Acids Res. 1999 Feb 1;27(3):888-94. PubMed 9889288
Subsets Selections of sequences according to various criteria (see paper for details)

Results

Discovered oligonucleotides

Analysis of all downstream sequences
Over-represented oligos Statistical estimation of oligo over-representation, using a Markov chain background model.
Positionally biased oligos Detection of oligonucleotides having a positional bias, e.g. a distribution with peaks or valleys.
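The over-representation estimate compares the observed number of occurrences of an oligo with the number expected under a Markov chain background. A minimal sketch in Python, assuming the Markov model is estimated from the analysed sequences themselves and that occurrences are counted on a single strand with overlaps allowed. The Markov order, the function names and the choice of an observed/expected ratio are illustrative assumptions; the article's actual parameters and significance statistic are not stated on this page.

```python
from collections import Counter


def count_words(seqs, k):
    """Count all overlapping words of length k (single strand)."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def markov_expected(word, seqs, order):
    """Expected occurrences of `word` under a Markov chain of the given
    order, with parameters estimated from `seqs` (an assumption)."""
    k = len(word)
    if not 0 <= order < k:
        raise ValueError("order must satisfy 0 <= order < len(word)")
    # (order+1)-mers give the transitions: context followed by one letter.
    trans = count_words(seqs, order + 1)
    context_totals = Counter()
    for w, n in trans.items():
        context_totals[w[:-1]] += n
    # Probability of the first `order` letters of the word.
    p = 1.0
    if order > 0:
        start = count_words(seqs, order)
        total = sum(start.values())
        if total == 0:
            return 0.0
        p = start[word[:order]] / total
    # Multiply by the transition probability of each following letter.
    for i in range(order, k):
        context = word[i - order:i]
        if context_totals[context] == 0:
            return 0.0
        p *= trans[word[i - order:i + 1]] / context_totals[context]
    positions = sum(max(len(s) - k + 1, 0) for s in seqs)
    return positions * p


def obs_exp_ratio(word, seqs, order):
    """Observed count, expected count, and their ratio (None if undefined)."""
    observed = count_words(seqs, len(word))[word]
    expected = markov_expected(word, seqs, order)
    ratio = observed / expected if expected > 0 else None
    return observed, expected, ratio
```

A ratio well above 1 only suggests over-representation; a significance test on the observed count would still be needed to rank oligos.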
Analysis of all sequences around cleavage sites
Over-represented oligos Statistical estimation of oligo over-representation, using a Markov chain background model.
Positionally biased oligos Detection of oligonucleotides having a positional bias, e.g. a distribution with peaks or valleys.
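One way to detect a positional bias is to count the occurrences of an oligo in successive position windows and compare the profile with a flat distribution. A minimal sketch in Python, assuming the sequences are aligned on a common reference (the stop codon or the cleavage site) and using a chi-square statistic against a uniform expectation. The window size, the function name and the choice of statistic are illustrative assumptions, not necessarily the procedure used in the article.

```python
from collections import Counter


def positional_chi2(word, seqs, window):
    """Return (chi2, profile) for `word` over aligned sequences.

    profile[i] is the number of occurrences starting in window i;
    chi2 compares the profile with a uniform distribution, weighting
    each window by the number of start positions it contains.
    """
    k = len(word)
    observed = Counter()
    possible = Counter()
    for s in seqs:
        for start in range(len(s) - k + 1):
            w = start // window
            possible[w] += 1
            if s[start:start + k] == word:
                observed[w] += 1
    total_obs = sum(observed.values())
    total_pos = sum(possible.values())
    if total_obs == 0:
        return 0.0, []
    chi2 = 0.0
    profile = []
    for w in sorted(possible):
        expected = total_obs * possible[w] / total_pos
        chi2 += (observed[w] - expected) ** 2 / expected
        profile.append(observed[w])
    return chi2, profile
```

For example, `positional_chi2("AA", ["AAAACCCC", "AAAACCCC"], 4)` gives the profile `[6, 0]`: all occurrences fall in the first window, which is the kind of peak the analysis looks for. The chi-square approximation is unreliable when expected counts per window are small.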

The main files are listed above. The other files are not indexed, but their names generally indicate their content.

In case of trouble, please contact Jacques van Helden