ST107 Notes


An extract from the Quantitative Methods - Statistics (ST107) Notes collection, written by London School of Economics students.



SECTION A: QUESTION 1 - SHORT QUESTIONS

Axioms of Probability:
1) For any event A, 0 ≤ P(A) ≤ 1
2) For the sample space S, P(S) = 1
3) If {Ai}, i = 1, ..., n, are mutually exclusive events, then the probability of their union is the sum of their respective probabilities

Relationship Between SD & Variance: SD = √Variance (SD² = Variance)

Expectations of Linear Combinations of Random Variables:
E(T) = E(aX + bY) = aE(X) + bE(Y)
Var(T) = Var(aX + bY) = a²Var(X) + b²Var(Y) (for independent X and Y)

Finding Mean & Variance of Transformed Distributions:

* Mean: Affected by all operations (addition/subtraction/multiplication/division)

* Variance: Only affected by changes in scale (multiplication/division), and scale changes must be squared ---> i.e. if the original variance is 6 and all values are divided by 2, the new variance = 6/2² = 1.5

Conditions for a PDF:
1) fX(x) ≥ 0 for all values of x
2) ∫ fX(x) dx = 1

Transformation Formula For Standardisation

* Z = (X − μ)/σ when X ~ N(μ, σ²) ---> remember the second parameter is the variance, so square-root it to find the SD!
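A minimal sketch of the standardisation step (the distribution parameters are made up for illustration; only the Python standard library is used):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical example: X ~ N(mu = 100, sigma^2 = 25)
mu, var = 100, 25
sigma = sqrt(var)              # second parameter is the variance, so square-root it

x = 110
z = (x - mu) / sigma           # standardise: Z = (X - mu) / sigma
p = phi(z)                     # P(X < 110) = P(Z < 2)
print(z, round(p, 4))          # 2.0 0.9772
```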
Poisson Approximation to the Binomial

* Approximate X ~ Bin(n, p) with Y ~ Pois(np)

Normal Approximation to Binomial/Poisson:

* Approximate X ~ Bin(n, p) with Y ~ N(np, np(1-p))

* Approximate X ~ Pois(λ) with Y ~ N(λ, λ)

* Continuity Correction: Used when transitioning from a discrete to a continuous distribution ---> either add or subtract 0.5 of a unit from each discrete x-value
- If P(X = n) use P(n - 0.5 < X < n + 0.5)
- If P(X > n) use P(X > n + 0.5)
- If P(X ≥ n) use P(X > n - 0.5)
- If P (X < n) use P(X < n - 0.5)
- If P(X ≤ n) use P(X < n + 0.5)

Normal Approximation of Student T Distribution:

* Only used for large samples ---> works because, as n → ∞, tn → N(0, 1)

Central Limit Theorem: When sampling from almost any non-normal population, as n → ∞, X̄ ~ N(μ, σ²/n) approximately

Mean Square Error: Provides a mechanism for evaluating the relative performance of different estimators, taking into account both bias and variance

Ways of Reducing Confidence Interval Width:
1) Reduce the level of confidence
2) Increase the sample size
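The binomial-to-normal approximation and the continuity correction can be checked numerically; a sketch with made-up values n = 50, p = 0.4 (standard library only):

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# X ~ Bin(n, p), approximated by Y ~ N(np, np(1 - p))
n, p = 50, 0.4
mu, var = n * p, n * p * (1 - p)

# Exact P(X <= 22) from the binomial pmf
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(23))

# Continuity correction: P(X <= 22) is approximated by P(Y < 22.5)
approx = phi((22.5 - mu) / sqrt(var))

print(round(exact, 4), round(approx, 4))
```

The two values agree to roughly two decimal places; dropping the 0.5 correction noticeably worsens the approximation.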


P-Value: The probability of observing the test statistic value or a more extreme value, conditional on H0 being true

Estimators

* Estimator: Statistic used to estimate an unknown population parameter

* Estimate: The numerical realisation of the estimator using observed data

Degrees of Freedom (chi-squared test on an r × c contingency table): (r - 1)(c - 1)

Reject H0 when:

* P-Value < Significance Level ---> More informative measure of statistical significance

* |Test Statistic| > Critical Value

When choosing the significance level, start with α = 0.05; if H0 is rejected, try α = 0.01, if not, try α = 0.10 ---> significance at 5% automatically means significance at 10%
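The decision rule can be sketched for a hypothetical two-sided z-test (the observed statistic value is invented for illustration):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

z_obs = 2.2                               # hypothetical observed test statistic
p_value = 2 * (1 - phi(abs(z_obs)))       # P(a value at least this extreme | H0 true)

# Compare against the usual ladder of significance levels
for alpha in (0.05, 0.01, 0.10):
    print(alpha, "reject H0" if p_value < alpha else "do not reject H0")
```

Here p ≈ 0.028, so H0 is rejected at the 5% (and therefore 10%) level but not at 1%.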
Types of Error:

* Type 1 Error: Rejecting H0 when it is true (false positive)

* Type 2 Error: Failing to reject H0 when it is false (false negative)

Linear Regression Model

* Correlation: Measures strength of a linear relationship
- Absence of correlation does not imply independence, as it only means that no linear relationship is present; independence indicates that there is also no non-linear relationship

* Regression: Way of representing that linear relationship

* Coefficient of Determination (R²): R² = r² (for the simple linear model), where r = sample correlation coefficient

* Covariance: σXY = (r)(σX)(σY) = (Correlation between X & Y)(SD of X)(SD of Y)
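These relationships can be verified on a small made-up data set: the slope estimate, the sample correlation, and Sxy all share the same sign, and R² = r²:

```python
from math import sqrt

# Hypothetical data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

# Corrected sums of squares and the corrected sum of cross-products
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

b = sxy / sxx                    # slope estimate: sign comes from Sxy
r = sxy / sqrt(sxx * syy)        # sample correlation: sign also comes from Sxy
r_squared = r ** 2               # equals the coefficient of determination R^2

print(round(b, 3), round(r, 3), round(r_squared, 3))   # 0.6 0.775 0.6
```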

* The estimate of the slope parameter (b) cannot be negative if the sample correlation coefficient is positive, as the sign of each statistic is determined by the sign of Sxy, the corrected sum of cross-products

QUESTION 2 - PDFs & CDFs

Integrate the PDF to get the CDF: FX(x) = ∫ fX(x) dx

Conditions to verify a PDF:
1) fX(x) ≥ 0 for all values of x
2) ∫ fX(x) dx = 1

To find:

* Value of Constant: Use condition 2 and solve for constant (use range of fx(x) as limits)

* Mean of X: E[X] = ∫ x fX(x) dx

* Variance of X: σX² = ∫ x² fX(x) dx - μ²

* Standard Deviation of X: σX = √Variance

* CDF from PDF: FX(x) = ∫ fX(x) dx
- Defined as an integral over the range of the PDF

* Median (m): solve FX(m) = ∫ fX(x) dx = 0.5 for m

* Mode: Value at which fx(x) achieves a maximum

* Conditional Probability: Use Bayes' formula ---> the intersection is generally the probability of the larger event; calculate 1 - FX(given value) for the numerator and denominator, then simplify

Make sure to check limits!!!
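The whole checklist can be run on one hypothetical density, f(x) = cx on [0, 2] (the constant, mean, variance, and median below follow from conditions 1 and 2; standard library only):

```python
from math import sqrt

# Condition 2: integral of c*x over [0, 2] = 2c = 1, so c = 1/2
c = 0.5

def f(x):
    return c * x if 0 <= x <= 2 else 0.0

def integrate(g, a, b, n=100_000):
    """Plain midpoint-rule integration, good enough for a sanity check."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0, 2)                     # condition 2: should be ~1
mean = integrate(lambda x: x * f(x), 0, 2)     # E[X] = 4/3
ex2 = integrate(lambda x: x * x * f(x), 0, 2)  # E[X^2] = 2
var = ex2 - mean ** 2                          # Var(X) = 2/9
median = sqrt(2)                               # solve F(m) = m^2 / 4 = 0.5

print(round(total, 3), round(mean, 3), round(var, 3), round(median, 3))
```

The mode is at x = 2, where f attains its maximum on the support; note the limits of integration are exactly the range of the PDF, as the notes warn.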
