# 21 Statistics 2

Candidates will be expected to be familiar with the knowledge, skills and understanding implicit in the module Statistics 1 and Core 1.

Candidates may use relevant formulae included in the formulae booklet without proof.

Candidates should learn the following formulae, which are not included in the formulae booklet, but which may be required to answer questions.

$\mbox{E}(aX + b) = a\mbox{E}(X) + b \space \space$ and $\space \space \text{Var}(aX) = a^2\text{Var}(X)$

$\displaystyle \mbox{P}(a < X < b) = \int_a^b \mbox{f}(x) \mbox{d}x$

$\mbox{P}(\text{Type I error}) = \mbox{P}(\text{reject}\space\mbox{H}_0\space|\space\mbox{H}_0\space\text{true})\space$ and

$\mbox{P}(\text{Type II error}) = \mbox{P}(\text{accept}\space\mbox{H}_0\space|\space\mbox{H}_0\space\text{false})$

$\displaystyle E_{ij} = \frac{R_i \times C_j}{T}$ and   $\nu = (\text{rows} - 1)(\text{columns} - 1)$

Yates’ correction (for a $2 \times 2$ table) is $\space \displaystyle \chi^2 = \sum\frac{(|O_i - E_i| - 0.5)^2}{E_i}$

 Discrete random variables and their associated probability distributions. The number of possible outcomes will be finite. Distributions will be given or easily determined in the form of a table or simple function. Mean, variance and standard deviation. Knowledge of the formulae $\mbox{E}(X) = \displaystyle\sum x_i p_i$, $\mbox{E}\big(\mbox{g}(X)\big) = \displaystyle\sum \mbox{g}(x_i)p_i$, $\text{Var}(X) = \mbox{E}(X^2) - (\mbox{E}(X))^2$, $\mbox{E}(aX+b) = a\mbox{E}(X) + b$ and $\text{Var}(aX+b) = a^2 \text{Var}(X)$ will be expected. Mean, variance and standard deviation of a simple function of a discrete random variable. Eg $\mbox{E}(2X + 3)$, $\mbox{E}(5X^2)$, $\mbox{E}(10X^{-1})$, $\mbox{E}(100X^{-2})$. Eg $\mathrm{Var}(3X)$, $\mathrm{Var}(4X-5)$, $\mathrm{Var}(6X^{-1})$.
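
As an illustration of the discrete formulae above, the following sketch evaluates $\mbox{E}(X)$, $\mbox{E}\big(\mbox{g}(X)\big)$ and $\text{Var}(X)$ from a probability table. The table `dist` and the helper names `expectation` and `variance` are invented for this example and are not part of the specification.

```python
from fractions import Fraction

# Hypothetical distribution table (an assumption): value x_i -> probability p_i
dist = {1: Fraction(1, 10), 2: Fraction(2, 10), 3: Fraction(3, 10), 4: Fraction(4, 10)}

def expectation(dist, g=lambda x: x):
    """E(g(X)) = sum of g(x_i) * p_i over the table."""
    return sum(g(x) * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - (E(X))^2."""
    return expectation(dist, lambda x: x * x) - expectation(dist) ** 2

# A valid table must have probabilities summing to 1.
assert sum(dist.values()) == 1

mean_x = expectation(dist)                               # E(X)
var_x = variance(dist)                                   # Var(X)
mean_linear = expectation(dist, lambda x: 2 * x + 3)     # E(2X + 3)
mean_recip = expectation(dist, lambda x: Fraction(10, x))  # E(10 X^-1)

print(mean_x, var_x, mean_linear, mean_recip)
```

For this table $\mbox{E}(X) = 3$ and $\text{Var}(X) = 1$, so $\mbox{E}(2X+3) = 2 \times 3 + 3 = 9$ and $\text{Var}(2X+3) = 2^2 \times 1 = 4$, in agreement with the linear-transformation formulae.
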
 Conditions for application of a Poisson distribution. Calculation of probabilities using formula. To include calculation of values of $\space \mathrm{e}^{- \lambda}$ from a calculator. Use of Tables. Mean, variance and standard deviation of a Poisson distribution. Knowledge, but not derivations, will be required. Distribution of sum of independent Poisson distributions. Result, not proof.
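
A minimal sketch of the Poisson calculations described above, using the formula $\mbox{P}(X = k) = \dfrac{\mathrm{e}^{-\lambda}\lambda^k}{k!}$. The function names and the parameter values $\lambda = 1.5$ and $\lambda = 1.0$ are assumptions chosen for illustration. The last step checks numerically, for one value only, the result that the sum of independent Poisson variables is Poisson with the means added.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam), straight from the formula."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k), the quantity tabulated in cumulative Poisson tables."""
    return sum(poisson_pmf(r, lam) for r in range(k + 1))

lam = 2.5
p_exactly_3 = poisson_pmf(3, lam)
p_at_most_2 = poisson_cdf(2, lam)
print(round(p_exactly_3, 4), round(p_at_most_2, 4))

# Mean and variance of Po(lam) are both lam; standard deviation is sqrt(lam).
sd = math.sqrt(lam)

# Sum of independent Poissons: if X ~ Po(1.5) and Y ~ Po(1.0),
# then X + Y ~ Po(2.5). Check P(X + Y = 3) by summing over the ways to make 3.
p_sum_3 = sum(poisson_pmf(r, 1.5) * poisson_pmf(3 - r, 1.0) for r in range(4))
print(math.isclose(p_sum_3, p_exactly_3))
```
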
 Differences from discrete random variables. Probability density functions, cumulative distribution functions and their relationship. $\displaystyle \mbox{F}(x) = \int_{- \infty}^x \mathrm{f}(t)\mbox{d}t \space\space$ and $\space \space \displaystyle \space\mathrm{f}(x) = \frac{\mbox{d}}{\mbox{d}\!x}\mbox{F}(x)$. Polynomial integration only. The probability of an observation lying in a specified interval. $\displaystyle \mbox{P}(a < X < b) = \int_a^b \mbox{f}(x) \mbox{d}x \space \space$ and $\space \space \mbox{P}(X = a) = 0$. Median, quartiles and percentiles. Mean, variance and standard deviation. Knowledge of the formulae $\text{Var}(X) = \mbox{E}(X^2)-\big(\mbox{E}(X)\big)^2\space, \space \mbox{E}(aX + b) = a \mbox{E}(X) + b\space$ and $\text{Var}(aX + b) = a^2\text{Var}(X)$ will be expected. Mean, variance and standard deviation of a simple function of a continuous random variable. Eg $\space \mbox{E}(2X+3)\space ,\space \mbox{E}(5X^2) \space , \space \mbox{E}(10X^{-1}) \space, \space \mbox{E}(100X^{-2})$. Eg $\space \mathrm{Var}(3X)\space,\space \mathrm{Var}(4X-5)\space,\space \mathrm{Var}(6X^{-1})$. Rectangular distribution. Calculation of probabilities, proofs of mean, variance and standard deviation.
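
The polynomial integrations above can be sketched with exact fractions. The density $\mbox{f}(x) = \tfrac{3}{8}x^2$ on $0 \le x \le 2$ and the rectangular distribution on $[2, 8]$ are hypothetical examples, and `integrate_poly` and `times_x` are helper names made up for this sketch. A polynomial is stored as a coefficient list, with `coeffs[k]` multiplying $x^k$.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Definite integral over [a, b] of sum(coeffs[k] * x**k)."""
    def antiderivative(x):
        return sum(c * Fraction(x) ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)

def times_x(coeffs, power=1):
    """Coefficients of x**power multiplied by the polynomial."""
    return [Fraction(0)] * power + list(coeffs)

# Assumed density: f(x) = 3x^2/8 on 0 <= x <= 2, zero elsewhere.
pdf = [0, 0, Fraction(3, 8)]
lower, upper = 0, 2

total = integrate_poly(pdf, lower, upper)             # must equal 1
mean = integrate_poly(times_x(pdf), lower, upper)     # E(X)
e_x2 = integrate_poly(times_x(pdf, 2), lower, upper)  # E(X^2)
var = e_x2 - mean ** 2                                # Var(X)
prob = integrate_poly(pdf, Fraction(1, 2), 1)         # P(0.5 < X < 1)

# Here F(x) = x^3/8, so the median m solves m^3/8 = 0.5.
median = (8 * 0.5) ** (1 / 3)

# Rectangular distribution on [a, b]: f(x) = 1/(b - a).
a, b = 2, 8
rect = [Fraction(1, b - a)]
rect_mean = integrate_poly(times_x(rect), a, b)
rect_var = integrate_poly(times_x(rect, 2), a, b) - rect_mean ** 2

print(total, mean, var, prob, rect_mean, rect_var)
```

The rectangular results agree with the standard formulae $\mbox{E}(X) = \tfrac{1}{2}(a+b)$ and $\text{Var}(X) = \tfrac{1}{12}(b-a)^2$.
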
 Confidence intervals for the mean of a normal distribution with unknown variance. Using a $t$-distribution. Only confidence intervals symmetrical about the mean will be required. Questions may involve a knowledge of confidence intervals from the module Statistics 1.
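
A sketch of a symmetric confidence interval $\bar{x} \pm t \dfrac{s}{\sqrt{n}}$ for a normal mean with unknown variance. The sample values are invented for illustration, and the critical value is passed in as read from $t$ tables (here 2.262, the 97.5th percentile for $\nu = 9$) because the Python standard library has no $t$ distribution. The function name `t_confidence_interval` is an assumption.

```python
import math
import statistics

def t_confidence_interval(sample, t_critical):
    """Symmetric interval: mean +/- t * s / sqrt(n), with s using divisor n - 1."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # unbiased estimate, divisor n - 1
    half_width = t_critical * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical sample of n = 10 observations, assumed to come from a normal population.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]

T_CRITICAL = 2.262  # from t tables: nu = n - 1 = 9, 95% two-sided
ci_lower, ci_upper = t_confidence_interval(sample, T_CRITICAL)
print(f"95% CI: ({ci_lower:.3f}, {ci_upper:.3f})")
```
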
 Null and alternative hypotheses. The null hypothesis to be of the form that a parameter takes a specified value. One tailed and two tailed tests, significance level, critical value, critical region, acceptance region, test statistic, $\space \text{Type I}$ and $\text{Type II}\space$errors. The concepts of $\space \text{Type I errors}\space(\text{reject}\space\mbox{H}_0\space|\space\mbox{H}_0\space\text{true})$ and $\text{Type II errors}\space(\text{accept}\space\mbox{H}_0\space|\space\mbox{H}_0\space\text{false})$ should be understood but questions which require the calculation of the risk of a $\space \text{Type II error}$ will not be set. The significance level to be used in a hypothesis test will usually be given. Tests for the mean of a normal distribution with known variance. Using a   $z$-statistic. Tests for the mean of a normal distribution with unknown variance. Using a   $t$-statistic. Tests for the mean of a distribution using a normal approximation. Large samples only. Known and unknown variance.
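
The test for a normal mean with known variance can be sketched as follows, using $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$. The function name `z_test`, its `tail` argument and the numbers in the example are assumptions for illustration. The significance level `alpha` is the probability of a Type I error, that is, of rejecting $\mbox{H}_0$ when it is true.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha, tail="two"):
    """Return (z, critical value, reject H0?) for H0: mu = mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()  # N(0, 1)
    if tail == "two":          # H1: mu != mu0
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "upper":      # H1: mu > mu0
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    elif tail == "lower":      # H1: mu < mu0
        critical = std_normal.inv_cdf(alpha)
        reject = z < critical
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, critical, reject

# Hypothetical data: H0: mu = 50, H1: mu != 50, sigma = 4, n = 25, sample mean 51.8.
z, critical, reject = z_test(51.8, 50, 4, 25, alpha=0.05)
print(round(z, 2), round(critical, 2), reject)
```

With these figures $z = 2.25$ lies in the critical region $|z| > 1.96$, so $\mbox{H}_0$ is rejected at the 5% level. At the 1% level the critical value rises to about 2.576 and $\mbox{H}_0$ would not be rejected.
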
 Introduction to $\chi^2$ distribution. To include use of the supplied tables. Use of $\displaystyle \sum \frac{(O_i - E_i)^2}{E_i}$ as an approximate $\chi^2$-statistic. Conditions for approximation to be valid. The convention that all $E_i$ should be greater than 5 will be expected. Test for independence in contingency tables. Use of Yates' correction for $2 \times 2$ tables will be required.
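
A sketch of the contingency-table calculation: expected frequencies $E_{ij} = \dfrac{R_i \times C_j}{T}$, degrees of freedom $\nu = (\text{rows} - 1)(\text{columns} - 1)$, and Yates' correction applied only when the table is $2 \times 2$. The observed table and the function names are invented for this example. The 5% critical value 3.841 for $\nu = 1$ is taken from $\chi^2$ tables rather than computed.

```python
def expected_frequencies(observed):
    """E_ij = (row total) * (column total) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_squared_statistic(observed):
    """Return (statistic, degrees of freedom, expected table)."""
    expected = expected_frequencies(observed)
    rows, cols = len(observed), len(observed[0])
    yates = rows == 2 and cols == 2  # correction is used for 2 x 2 tables only
    stat = 0.0
    for o_row, e_row in zip(observed, expected):
        for o, e in zip(o_row, e_row):
            diff = abs(o - e)
            if yates:
                diff -= 0.5
            stat += diff ** 2 / e
    df = (rows - 1) * (cols - 1)
    return stat, df, expected

# Hypothetical 2 x 2 table of observed frequencies.
observed = [[30, 20],
            [20, 30]]

stat, df, expected = chi_squared_statistic(observed)

# Convention from the text: every expected frequency should exceed 5.
assert all(e > 5 for row in expected for e in row)

CRITICAL_5_PERCENT = 3.841  # from chi-squared tables, nu = 1
print(round(stat, 2), df, stat > CRITICAL_5_PERCENT)
```

Every expected frequency here is 25, so the corrected statistic is $4 \times \dfrac{(5 - 0.5)^2}{25} = 3.24$, which is below 3.841, and independence is not rejected at the 5% level. Without the correction the statistic would be 4, which shows why the correction matters for $2 \times 2$ tables.
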