Abbrev | Name |
---|---|
DP | Denis Puthier |
JvH | Jacques van Helden |

2017-2018

- Second year of the Master in Bioinformatics, Structural Biochemistry and Genomics (BBSG).
- Doctoral school

Tool | About |
---|---|
R | A free software environment for statistical computing and graphics |
R markdown | Documentation for the R markdown language (used for the practicals) |
Bioconductor | A set of R libraries dedicated to the statistical analysis of genomics data |
MeV: MultiExperiment Viewer | A Java application designed to allow the analysis of expression data |
Cluster 3.0 | Implements the most commonly used clustering methods for gene expression data |
Java TreeView | Java-based tool to visualize trees produced by hierarchical clustering together with a heatmap of expression profiles |

Students are expected to have followed the introduction to statistics in the first year of the master.

- Probability and statistics for biology (Probabilités et statistiques pour la biologie, SBBAU16L - STAT1): https://jvanheld.github.io/stat1/

We assume that you have already acquired the following concepts.

- Discrete distributions (geometric, binomial)
- Sampling and estimation
- Mean comparison tests (Student, Welch)
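
As a quick self-check on the last item, the two mean comparison statistics can be written as follows. The notation is an assumption chosen for this reminder (sample means \(\bar{x}_1, \bar{x}_2\), unbiased variance estimates \(s_1^2, s_2^2\), sample sizes \(n_1, n_2\)) and may differ from the notation used in the course slides.

\[
t_{\text{Student}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
\]

\[
t_{\text{Welch}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
\]

Student's test pools the two variance estimates and therefore assumes equal population variances, whereas Welch's test keeps them separate. When \(n_1 = n_2\) the two statistics take the same value and differ only in their degrees of freedom.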

A basic knowledge of the R language is expected.

- handling variables and data frames (“tables”)
- probability distributions
- plotting (histograms, dot plots)
- hypothesis testing

If you did not receive any such training, note that during the first practical we will briefly review these concepts and practical skills.

Date | From | To | Subject | Teacher | Concepts | Material |
---|---|---|---|---|---|---|
3/11 | 14:00 | 18:00 | Introduction to the course | JvH | | Slides: [html] [Rmd] |
3/11 6/11 | 14:00 | 18:00 | Detecting differentially expressed genes (DEG) with microarrays | DP | Hypothesis testing; Student \(t\) statistics; Unbiased estimation of variance; MA plot; Volcano plots; P-value distribution; E-value | Slides: [html] [Rmd]; Basics about Student and Welch’s t test: [html] [Rmd]; Practical: detecting differentially expressed genes in microarray data: [html] [Rmd]; Practical: generating random control sets following a Normal distribution (incomplete): [html] [Rmd] |
7/11 | 14:00 | 18:00 | About distances and clustering | DP | Distance metrics; Hierarchical clustering | Theory: [html]; Distance metrics and clustering (practical): [html] [Rmd] |
20/11 | 9:00 | 12:30 | The multiple ways to correct for multiple testing | JvH | False positive risk (FPR); Expected number of false positives (E-value); Family-Wise Error Rate (FWER); False Discovery Rate (FDR) | Multiple testing corrections (slides); Multiple testing corrections (practical) |
20/11 | 9:00 | 12:30 | Supervised classification | JvH | Discriminant analysis; Cross-validation (k-fold, LOO); Data dimensionality and overfitting; Variable selection | Introduction to multivariate analysis; Discriminant analysis (slides); Dimension reduction and PCA; Practical: supervised classification |
21/11 | 14:00 | 18:00 | Functional enrichment of DEG | DP | Functional enrichment statistics; The hypergeometric distribution | Theory: on whiteboard; Hypergeometric distribution and enrichment statistics. An example application: DAVID (practical) |
22/11 | 14:00 | 18:00 | Overview of discrete distributions, with applications to NGS | DP | Geometric; Binomial; Poisson; Hypergeometric; Negative binomial | Tutorial [html] [pdf] [Rmd] |
22/11 | 14:00 | 18:00 | Detecting differentially expressed genes (DEG) with RNA-seq | DP/JvH | RNA-seq principles; Normalizing RNA-seq counts; Detecting differentially expressed genes (DEG) | RNA-seq DEG with DESeq2 [html] [pdf] [Rmd] |
22/11 | 14:00 | 18:00 | Descriptive statistics with ggplot2 | DP | ggplot2 principles; Layout, creating diagrams | Introduction to ggplot2 [html] |
 | | | RNA-seq analysis (continued) | JvH | | |