This directory contains the supplementary material for the article:

van Helden, J., Olmo, M. & Perez-Ortin, J. E. (2000). Statistical analysis of yeast genomic downstream sequences reveals putative polyadenylation signals. Nucleic Acids Res 28(4), 1000-1010. Pubmed 10648794

Data files

data/EST_-150_+80.fasta.gz Dataset from Joel Graber. 230bp sequences around the cleavage site
data/all_down200_noorf.fasta.gz Downstream sequences for all yeast genes, 200bp from the stop codon (not included).
When an ORF is found closer than 200bp, the sequence is clipped to avoid including coding sequences.
data/all_down200_noorf_sizes.jpg Histogram of downstream sequence sizes
data/all_up200_noorf.fasta.gz Upstream sequences for all yeast genes, 200bp from the start codon (not included).
When an ORF is found closer than 200bp, the sequence is clipped to avoid including coding sequences.
data/subsets Subsets of genes (genes without homologs, genes with an identified cleavage site, ...)
data/yeast_non_coding_segments_convergent.fasta.gz Intergenic sequences between a pair of convergently transcribed genes
data/yeast_non_coding_segments_divergent.fasta.gz Intergenic sequences between a pair of divergently transcribed genes
data/yeast_non_coding_segments_parallel.fasta.gz Intergenic sequences between a pair of tandem genes (transcribed in the same direction)
data/all_downstream All yeast downstream sequences, over various ranges from the stop codon
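
Reading the sequence files

The .fasta.gz files listed above can be read without decompressing them first. The following is a minimal Python sketch, assuming the files are ordinary gzip-compressed FASTA text. The function names (read_fasta, load_gzipped_fasta, size_distribution) are illustrative only and are not part of the supplementary material. The size distribution it computes is the kind of summary presumably shown in data/all_down200_noorf_sizes.jpg, although the script used to produce that figure is not described here.

```python
import gzip
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def load_gzipped_fasta(path):
    """Read a gzip-compressed FASTA file into a list of (header, sequence) pairs."""
    with gzip.open(path, "rt") as handle:
        return list(read_fasta(handle))


def size_distribution(records):
    """Count how many sequences have each length (clipped sequences are shorter)."""
    return Counter(len(seq) for _, seq in records)
```

For example, size_distribution(load_gzipped_fasta("data/all_down200_noorf.fasta.gz")) would return a mapping from sequence length to number of genes. Sequences shorter than 200bp would correspond to those clipped because of a neighbouring ORF.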