Variance vs Covariance
Variance and covariance are two measures used in statistics. Variance is a measure of the scatter of the data, and covariance indicates the degree to which two random variables change together. Variance is a rather intuitive concept, but the mathematical definition of covariance is not that intuitive at first.
More about Variance
Variance is a measure of the dispersion of the data around the mean value of the distribution. It tells how far the data points lie from the mean of the distribution. It is one of the primary descriptors of a probability distribution and one of its moments. Variance is also a parameter of the population, and the variance of a sample drawn from the population acts as an estimator of the population variance. From one perspective, it is defined as the square of the standard deviation.
In plain language, it can be described as the average of the squared distances between each data point and the mean of the distribution. The following formulas are used to calculate the variance.
$\mathrm{Var}(X) = E[(X - \mu)^2]$ for a population, and
$\mathrm{Var}(X) = E[(X - \bar{x})^2]$ for a sample.
It can be simplified further to give $\mathrm{Var}(X) = E[X^2] - (E[X])^2$.
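The two forms of the population formula can be checked against each other numerically. The following is a minimal sketch in plain Python; the function names and the data values are made up for illustration and do not come from any library.

```python
def population_variance(data):
    """Average squared distance from the mean: E[(X - mu)^2]."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


def variance_shortcut(data):
    """The same quantity computed as E[X^2] - (E[X])^2."""
    n = len(data)
    mean_of_squares = sum(x * x for x in data) / n
    square_of_mean = (sum(data) / n) ** 2
    return mean_of_squares - square_of_mean


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean is 5

print(population_variance(data))  # 4.0
print(variance_shortcut(data))    # 4.0
```

Both functions divide by the number of data points, so they treat the data as a whole population. On real floating-point data the shortcut form can lose precision when the mean is large compared with the spread, so the first form is usually the safer one to compute.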
Variance has some signature properties that are often used in statistics to simplify calculations. Variance is non-negative because it is an average of squared distances. However, it has no fixed upper bound, and its range depends on the particular distribution. The variance of a constant random variable is zero, and the variance does not change when a constant is added to every value, that is, it does not depend on a location parameter.
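These properties are easy to observe on a small data set. The sketch below assumes a population variance (division by the number of points) and uses made-up values; `population_variance` is a helper written for this illustration.

```python
def population_variance(data):
    """Average squared distance of the data points from their mean."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


data = [1, 3, 5, 7]
shifted = [x + 100 for x in data]  # same data moved by a constant
constant = [6, 6, 6, 6]            # a variable that never changes

print(population_variance(data))      # 5.0
print(population_variance(shifted))   # 5.0, the shift has no effect
print(population_variance(constant))  # 0.0
```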
More about Covariance
In statistical theory, covariance is a measure of how much two random variables change together. In other words, its sign shows whether the two variables tend to move in the same direction or in opposite directions, and, once normalized, it gives the strength of the linear correlation between them. It can also be considered a generalization of the concept of variance to two random variables.
The covariance of two random variables X and Y that are jointly distributed with finite second moments is defined as $\sigma_{XY} = E[(X - E[X])(Y - E[Y])]$. From this, variance can be seen as a special case of covariance in which the two variables are the same: $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$.
By normalizing the covariance, the linear correlation coefficient, or Pearson’s correlation coefficient, can be obtained. It is defined as $\rho = \frac{E[(X - E[X])(Y - E[Y])]}{\sigma_X \sigma_Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$.
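The definitions of covariance and of the correlation coefficient can be sketched in a few lines of plain Python. The helper names and the data are illustrative assumptions, and the population form (division by the number of pairs) is used throughout.

```python
import math


def mean(values):
    return sum(values) / len(values)


def population_covariance(xs, ys):
    """E[(X - E[X]) (Y - E[Y])] over paired observations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def pearson_correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    sigma_x = math.sqrt(population_covariance(xs, xs))  # Cov(X, X) = Var(X)
    sigma_y = math.sqrt(population_covariance(ys, ys))
    return population_covariance(xs, ys) / (sigma_x * sigma_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]  # made-up data, exactly 2 * xs

print(population_covariance(xs, ys))  # 4.0
print(population_covariance(xs, xs))  # 2.0, the variance of xs
print(pearson_correlation(xs, ys))    # very close to 1 (perfect linear relation)
```

Note that `pearson_correlation` is undefined when either variable is constant, because its standard deviation is then zero; this sketch does not guard against that case.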
Graphically, the contribution of a pair of data points to the covariance can be seen as the area of the rectangle that has the two data points at opposite vertices. This area can be interpreted as a measure of the magnitude of the separation between the two data points. Considering the rectangles for the whole population, the overlap of the rectangles corresponding to all the data points can be taken as the overall strength of the separation, that is, the covariance of the two variables. Covariance is two-dimensional because it involves two variables; reducing it to a single variable gives the variance, which is the separation in one dimension.
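One way to make the rectangle picture concrete is to give each rectangle a signed area, positive when the two points run from lower left to upper right and negative otherwise. Under that assumption, the sum of the signed areas over all distinct pairs, divided by the square of the number of points, equals the population covariance. The sketch below checks this on made-up points; the function names are hypothetical.

```python
from itertools import combinations


def covariance_from_rectangles(points):
    """Sum of signed rectangle areas over all distinct pairs, divided by n^2."""
    n = len(points)
    signed_area_sum = sum(
        (x1 - x2) * (y1 - y2)
        for (x1, y1), (x2, y2) in combinations(points, 2)
    )
    return signed_area_sum / (n * n)


def population_covariance(points):
    """Usual definition: average product of deviations from the means."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return sum((x - mx) * (y - my) for x, y in points) / n


points = [(1, 2), (2, 1), (3, 5), (4, 3)]  # illustrative (x, y) pairs

print(covariance_from_rectangles(points))  # 0.875
print(population_covariance(points))       # 0.875
```

Applying the same idea to a single variable, with `(x1 - x2) ** 2` in place of the rectangle area, gives the variance in the same way.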
What is the difference between Variance and Covariance?
• Variance is a measure of the spread, or dispersion, in a population, while covariance is a measure of the joint variation of two random variables, which, once normalized, gives the strength of their correlation.
• Variance can be considered as a special case of covariance.
• Variance and covariance depend on the magnitude of the data values and cannot be compared directly across data sets; therefore, they are normalized. Covariance is normalized into the correlation coefficient (by dividing by the product of the standard deviations of the two random variables), and variance is converted into the standard deviation (by taking the square root), which is expressed in the same units as the data.
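The dependence on the magnitude of the data can be seen by changing the unit of one variable. In the sketch below, the heights and weights are hypothetical values and the helpers are written only for this illustration; converting heights from metres to centimetres multiplies the covariance by 100 but leaves the correlation coefficient essentially unchanged.

```python
import math


def population_covariance(xs, ys):
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    var_x = population_covariance(xs, xs)
    var_y = population_covariance(ys, ys)
    return population_covariance(xs, ys) / math.sqrt(var_x * var_y)


heights_m = [1.5, 1.6, 1.7, 1.8]            # hypothetical heights in metres
weights_kg = [50, 58, 63, 75]               # hypothetical weights in kilograms
heights_cm = [h * 100 for h in heights_m]   # same heights, different unit

cov_m = population_covariance(heights_m, weights_kg)
cov_cm = population_covariance(heights_cm, weights_kg)

print(cov_cm / cov_m)                       # about 100: covariance follows the unit
print(correlation(heights_m, weights_kg))   # the two correlations agree
print(correlation(heights_cm, weights_kg))  # up to floating-point rounding
```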