
Statistics Primer - Recalling all the frequently used terms

This post is a ready reference for some statistical terms that come up a lot in machine learning. It will be expanded from time to time to keep the material relevant.

Random Variable – These are not the traditional variables that you were first exposed to in your algebra class. A random variable assigns a numerical value to each outcome of a random process. For example, the number of visitors to a restaurant on a particular day.

Expected Value – The expected value is the probability-weighted average of a random variable. Intuitively, it is the long-run average of the outcomes of the experiment it represents. For example, the expected value of a single throw of a fair six-sided die is (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5.
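To make the probability-weighted average concrete, here is a minimal sketch for a discrete random variable. The helper name `expected_value` and the dictionary representation of the distribution are illustrative choices, and the die is assumed to be fair.

```python
from fractions import Fraction


def expected_value(distribution):
    """Probability-weighted average of a discrete random variable.

    `distribution` maps each outcome to its probability.
    """
    return sum(outcome * prob for outcome, prob in distribution.items())


# Assumption: a fair six-sided die, so each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

die_mean = expected_value(fair_die)
print(float(die_mean))  # 3.5
```

Using `Fraction` keeps the arithmetic exact, so the result is exactly 7/2 rather than a rounded float.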

Variance – The measure of spread of the probability distribution of a random variable. It determines the degree to which the values of a random variable differ from the expected value. The square root of variance is standard deviation.

$Var(X) = E[(X - E[X])^2]$, where X is a random variable and E[X] is its expectation.
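The variance formula can be applied directly to a discrete distribution. The sketch below reuses the fair-die example; the function name `variance` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction
import math


def variance(distribution):
    """Var(X) = E[(X - E[X])^2] for a discrete random variable.

    `distribution` maps each outcome to its probability.
    """
    mean = sum(x * p for x, p in distribution.items())
    return sum((x - mean) ** 2 * p for x, p in distribution.items())


# Assumption: a fair six-sided die, so each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

die_var = variance(fair_die)   # exactly 35/12
die_sd = math.sqrt(die_var)    # standard deviation, about 1.708
print(die_var, round(die_sd, 3))
```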

Covariance – The measure of the joint variability of two random variables. Its value depends on the scale (units) of the variables, so covariances of different pairs of variables are not directly comparable.

$Cov(X, Y) = E[(X - E[X])(Y - E[Y])]$, where X and Y are random variables.
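A small sketch of the covariance formula on paired observations, treating every pair as equally likely (the population form, dividing by n). The data values and the names `hours`, `scores` and `minutes` are made up for illustration. Re-expressing one variable in different units changes the covariance by the same factor.

```python
def covariance(xs, ys):
    """Population covariance of paired observations (each pair equally likely)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Made-up illustrative data.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

cov_hours = covariance(hours, scores)

# The same quantity measured in minutes instead of hours.
minutes = [60 * h for h in hours]
cov_minutes = covariance(minutes, scores)

# cov_minutes is 60 times cov_hours: covariance is sensitive to units.
print(cov_hours, cov_minutes)
```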

Autocovariance – The covariance of a time series (a sequence of random variables indexed by time) with itself at different points in time. For example, $Cov(X_t, X_{t-h})$, where h is the lag. Please note that here, $X_t$ and $X_{t-h}$ are also random variables.
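In practice the autocovariance is estimated from a single observed series. The sketch below uses one common sample estimator (dividing by n and using the overall mean), which assumes the mean of the series does not change over time. The function name and the toy series are illustrative.

```python
def autocovariance(series, lag):
    """Sample autocovariance at a given lag (divides by n, uses the overall mean)."""
    n = len(series)
    mean = sum(series) / n
    return sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    ) / n


# Toy series that repeats every 4 steps.
series = [2, 4, 6, 4, 2, 4, 6, 4]

gamma_0 = autocovariance(series, 0)  # lag 0 is just the (population) variance
gamma_2 = autocovariance(series, 2)  # half a cycle apart: values move oppositely
gamma_4 = autocovariance(series, 4)  # a full cycle apart: values move together
print(gamma_0, gamma_2, gamma_4)
```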

Correlation – It is the scaled form of covariance. It is dimensionless and always lies between -1 and 1.

$Corr(X, Y) = \frac{Cov(X, Y)}{sd(X) \cdot sd(Y)}$,

where sd(X) is the square root of variance (standard deviation) of X.
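The following sketch computes the correlation from paired observations using the formula above (population form, dividing by n). The data and variable names are made up for illustration. Unlike covariance, the result does not change when one variable is rescaled.

```python
import math


def correlation(xs, ys):
    """Corr(X, Y) = Cov(X, Y) / (sd(X) * sd(Y)), population form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)


# Made-up illustrative data.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
minutes = [60 * h for h in hours]  # same variable, different units

corr_hours = correlation(hours, scores)
corr_minutes = correlation(minutes, scores)

# Both values agree (about 0.775): correlation is dimensionless.
print(round(corr_hours, 3), round(corr_minutes, 3))
```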

Autocorrelation – Similar to autocovariance, it is the correlation of a random variable with itself at different points of time. For example, $Corr(X_t, X_{t-h})$.
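One common way to estimate the autocorrelation from an observed series is to divide the sample autocovariance at lag h by the sample autocovariance at lag 0. The sketch below assumes the mean and variance of the series do not change over time; the function names and the toy series are illustrative.

```python
def autocovariance(series, lag):
    """Sample autocovariance at a given lag (divides by n, uses the overall mean)."""
    n = len(series)
    mean = sum(series) / n
    return sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    ) / n


def autocorrelation(series, lag):
    """Sample autocorrelation: autocovariance at `lag` scaled by the lag-0 value."""
    return autocovariance(series, lag) / autocovariance(series, 0)


# Toy series that flips sign at every step.
alternating = [1, -1, 1, -1, 1, -1]

rho_0 = autocorrelation(alternating, 0)  # always 1: a series matches itself
rho_1 = autocorrelation(alternating, 1)  # strongly negative: neighbours disagree
rho_2 = autocorrelation(alternating, 2)  # positive: two steps apart agree
print(rho_0, round(rho_1, 3), round(rho_2, 3))
```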

White Noise – A collection of uncorrelated random variables, with mean 0 and a finite variance.
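As a rough illustration, the sketch below simulates one particular kind of white noise: independent Gaussian draws with mean 0 and variance 1. The Gaussian choice, the seed and the sample size are assumptions; the definition only requires the variables to be uncorrelated with mean 0 and finite variance.

```python
import random


def autocorrelation(series, lag):
    """Sample autocorrelation at a given lag (overall mean, lag-0 normalisation)."""
    n = len(series)
    mean = sum(series) / n
    gamma_0 = sum((x - mean) ** 2 for x in series) / n
    gamma_h = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    ) / n
    return gamma_h / gamma_0


random.seed(0)  # fixed seed so the run is repeatable
n = 10_000
noise = [random.gauss(0.0, 1.0) for _ in range(n)]

sample_mean = sum(noise) / n
sample_var = sum((x - sample_mean) ** 2 for x in noise) / n
lag1_corr = autocorrelation(noise, 1)

# Expect a mean near 0, a variance near 1, and a lag-1 autocorrelation near 0.
print(sample_mean, sample_var, lag1_corr)
```

The sample values will not be exactly 0 or 1, but they should get closer as `n` grows.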

Khan Academy Reference Material

Written on February 23, 2018