This tutorial is an application of concepts seen in the following
chapters of the course:

This tutorial assumes that you have already executed the
script `config.R`, as described in the configuration page.

## Tutorial

#### Question

Microarrays were used to measure the level of expression of all the
yeast genes in two different culture media: (1) minimal medium
(measured in the green channel of the microarray); (2) minimal medium
+ methionine (measured in the red channel of the microarray). Three
repetitions of the experiment were performed, and the log-ratios
log10(Red/Green) were calculated for each microarray. For a given
gene, we obtain the following log-ratio values: 2.0, 3.1, 0.3.

- Is this gene significantly activated by methionine?
- How many false positives would we expect at this level of
significance if the test were applied to 6200 genes?
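
Both questions can be framed as a one-sided, one-sample Student test. The formulation below is a sketch, assuming the three log-ratios are independent and approximately normally distributed (an assumption the question does not state). Here $\mu$ denotes the mean log-ratio of the gene, $\bar{x}$ and $s$ the sample mean and the estimated population standard deviation, $n$ the number of replicates, and $G$ the number of genes tested.

```latex
H_0: \mu = 0 \qquad \text{versus} \qquad H_1: \mu > 0

t_{obs} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \mu_0 = 0, \qquad \text{df} = n - 1

P = \Pr(T_{n-1} \ge t_{obs}), \qquad E = P \cdot G
```

The E-value $E$ is the number of false positives expected if every one of the $G$ genes were declared positive at the P-value $P$.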

## Descriptive statistics
## The sample is stored in a vector called x
x <- c(2.0, 3.1, 0.3)
print(x) ## Check the sample
## Calculate sample size
n <- length(x)
print(n) ## Check sample size
## Calculate the sample mean
sample.mean <- mean(x)
## Calculate the standard deviation of the sample
## This way of calculating it is inefficient; it is shown for didactic purposes only
sample.var <- mean((x - sample.mean)^2)
sample.sd <- sqrt(sample.var)
print(sample.sd)
## Estimate the standard deviation of the population
## This can be done by applying the sqrt(n/(n-1)) correction to the sample standard deviation
print(sample.sd*sqrt(n/(n-1)))
## Faster way: use the R function sd(), which divides by n-1 instead of n
## and thus returns the corrected estimate directly
sd.est <- sd(x)
print(sd.est)
## Calculate the standard error of the mean
standard.error <- sd.est/sqrt(n)
print(standard.error)
## Calculate the observed Student's t statistic, t.obs
ref.mean <- 0
t.obs <- (sample.mean - ref.mean )/standard.error
print(t.obs)
## Draw the density curves of Student's t distributions, and compare them to the normal distribution
y <- seq(from=-5, to=5, by=0.1)
## Draw the normal distribution
plot(y, dnorm(y), type="l", col="darkblue", panel.first=grid(col="black"))
## help(dt)
## Overlay one t density per number of degrees of freedom d;
## the counter i is only used to pick a different color for each curve
i <- 0
for (d in c(1,2,3,4,5,10,100,1000)) {
  i <- i+1
  lines(y, dt(y, df=d), type="l", col=i)
}
## Calculate the one-sided (upper-tail) P-value of t.obs
P.value <- pt(t.obs, df=n-1, lower.tail=FALSE)
print(P.value)
## E-value: number of false positives expected if the test is applied to G genes
G <- 6200
E.value <- P.value*G
print(E.value)
## Same one-sided test with the built-in R function t.test() (reference mean mu=0 by default)
t.test(x,alternative="greater")
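
As a sanity check, the manual computation can be compared with the values returned by `t.test()`. The sketch below is self-contained and repeats the sample definition. The names `manual.t`, `manual.p` and `res` are introduced here for illustration only and are not part of the original script.

```r
## Sample of log-ratios from the question
x <- c(2.0, 3.1, 0.3)
n <- length(x)

## Manual computation of the t statistic and of the one-sided P-value
manual.t <- (mean(x) - 0) / (sd(x) / sqrt(n))
manual.p <- pt(manual.t, df = n - 1, lower.tail = FALSE)

## Same test with the built-in function
res <- t.test(x, mu = 0, alternative = "greater")

## Both approaches should agree up to numerical precision
print(all.equal(manual.t, unname(res$statistic)))
print(all.equal(manual.p, res$p.value))

## Expected number of false positives among G genes at this P-value
G <- 6200
print(manual.p * G)
```

If either comparison prints anything other than `TRUE`, the manual steps and the built-in test disagree, which usually points to a wrong tail or a wrong number of degrees of freedom.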