variation-scan
$program_version
Scan variant sequences with position-specific scoring matrices (PSSMs) and report variations that affect the binding score, in order to predict regulatory variants.
variation-scan [-i sequence_file] -m matrix_file -bg background_file [-calc_distrib] [-o outputfile] [-v #] [...]
variation-scan takes as input a variation file in the format produced by retrieve-variation-seq. For details about this format, see the retrieve-variation-seq output format.
A list of matrices in TRANSFAC format
oligo-analysis format
A tab delimited file with the following column content.
Name of the matrix
Name of the variation
SO term of the variation.
Coordinate of the variation.
Best max weight.
Worst max weight.
Difference between the two max weights.
Variant of the variation in the sequence.
P-value of the best max weight.
P-value of the worst max weight.
item 11. sigma
Log10 difference between the two p-values.
item 12. B_var
Variant(s) in the sequence with the best max weight.
Multiple variants are returned, comma-separated, if several sequences share the highest max weight.
item 13. W_var
Variant(s) in the sequence with the worst max weight.
Multiple variants are returned, comma-separated, if several sequences share the lowest max weight.
item 14. B_offset
item 15. W_offset
item 16. B_seq
Sequence with the highest max weight.
Multiple sequences are returned, comma-separated, if several sequences share the best max weight.
item 17. W_seq
Sequence with the lowest max weight.
Multiple sequences are returned, comma-separated, if several sequences share the worst max weight.
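The relationship between the score columns can be illustrated with a minimal Python sketch. It is not the tool's own code: the function name, the input dictionaries and the result keys are hypothetical, each allele is assumed to be scored in several windows, and the sign of the log10 p-value difference (worst over best) is an assumption.

```python
import math

def summarize_variation(allele_scores, allele_pvalues):
    """Hypothetical helper (not part of variation-scan).

    allele_scores:  allele -> list of weights, one per scanned window
    allele_pvalues: allele -> p-value of that allele's max weight
    """
    # Max weight of each allele over all of its windows.
    max_weight = {a: max(w) for a, w in allele_scores.items()}
    best = max(max_weight.values())
    worst = min(max_weight.values())
    # Alleles tied for the best or worst max weight are all reported.
    best_vars = sorted(a for a, w in max_weight.items() if w == best)
    worst_vars = sorted(a for a, w in max_weight.items() if w == worst)
    pval_best = allele_pvalues[best_vars[0]]
    pval_worst = allele_pvalues[worst_vars[0]]
    return {
        "best_w": best,
        "worst_w": worst,
        "w_diff": best - worst,
        # Assumed convention: log10(p-value of worst / p-value of best).
        "pval_log10_diff": math.log10(pval_worst / pval_best),
        "b_var": ",".join(best_vars),
        "w_var": ",".join(worst_vars),
    }

# Made-up scores for a two-allele variation.
summary = summarize_variation(
    {"A": [1.5, 6.0, -2.0], "G": [0.5, 2.0, -1.0]},
    {"A": 1e-4, "G": 1e-2},
)
print(summary)
```

In this made-up example allele A has the best max weight (6.0) and allele G the worst (2.0), so the weight difference is 4.0.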
retrieve-variation-seq uses the sequences downloaded from Ensembl using the tool download-ensembl-genome.
retrieve-variation-seq uses variation coordinates downloaded from Ensembl using the tool download-ensembl-variations.
Scan variation sequences with one or several position-specific scoring matrices.
Level of verbosity (detail in the warning messages during execution)
Display full help message
Same as -h
Variation file in RSAT format
Matrix file in TRANSFAC format
Background file
Input File
Length of the longest matrix. This value has to be consistent with the one used for retrieving the variant sequences (see <retrieve-variation-seq>).
Only work with the top # matrices
Only work with the top # variations
Only return rvar with type_score > #
Convert the tab-delimited file into an HTML file, which facilitates the inspection of the results with a Web browser. The HTML file has the same name as the output file, but the extension (.tab, .txt) is replaced by the .html extension.
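As an illustration of the naming rule and of the kind of conversion involved, here is a minimal Python sketch. It is not the tool's implementation: html_name and tab_to_html are hypothetical helpers, and the bare table layout is an assumption.

```python
import html
from pathlib import Path

def html_name(output_file):
    """Replace the output file extension (.tab, .txt) with .html."""
    return Path(output_file).with_suffix(".html")

def tab_to_html(tab_text):
    """Render tab-delimited text as a bare HTML table (assumed layout)."""
    rows = []
    for line in tab_text.splitlines():
        # Escape each field so characters such as '<' display literally.
        cells = "".join(f"<td>{html.escape(c)}</td>" for c in line.split("\t"))
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"

# Made-up rows, for illustration only.
page = tab_to_html("matrixA\tvar1\t4.0\nmatrixB\ta<b\t1.5")
print(html_name("results.tab"))
print(page)
```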
Calculate and save distribution of matrices
Directory to store the distribution files. Mandatory if -calc_distrib is being used.
Name of the file containing the list of matrix distribution file names
/!\ This file must be in the same directory as the distribution files
Only return the biggest score difference between two alleles of a variation, regardless of the window. This option is useful for insertions and deletions.
The output file is in fasta format.
If no output file is specified, the standard output is used. This allows the command to be used within a pipe.
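A downstream step in such a pipe could look like the following minimal Python sketch. It rests on several assumptions: the stream is tab-delimited, the weight difference sits in the seventh column (following the order of the column list), and lines starting with ";" or "#" are comments or headers. The function name and the sample rows are hypothetical.

```python
import io

def filter_by_weight_diff(lines, min_diff, diff_col=6):
    """Keep rows whose weight difference (0-based column diff_col) >= min_diff."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        # Assumed convention: ';' marks comments, '#' marks the header line.
        if not line or line.startswith((";", "#")):
            continue
        fields = line.split("\t")
        try:
            diff = float(fields[diff_col])
        except (IndexError, ValueError):
            continue  # skip malformed rows rather than fail
        if diff >= min_diff:
            kept.append(fields)
    return kept

# Made-up rows standing in for the piped output.
sample = io.StringIO(
    "#matrix\tvariation\tso_term\tcoord\tbest_w\tworst_w\tw_diff\n"
    "matrixA\tvar1\tSNV\t100\t6.0\t2.0\t4.0\n"
    "matrixA\tvar2\tSNV\t250\t3.0\t2.5\t0.5\n"
)
kept = filter_by_weight_diff(sample, min_diff=1.0)
print(kept)
```

When used in an actual pipe, the StringIO object would be replaced by sys.stdin.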