Table of Contents
Last modified on January 2nd, 2025
The coefficient of variation (CV) is a measurement in statistics that shows the relative dispersion of data points with respect to its mean.
Unlike standard deviation, which measures absolute variability, CV measures variability in decimal form or as a percentage. It is used to compare the variability of datasets with different units or scales, such as comparing financial returns or experimental results.
Mathematically, it is the ratio of the standard deviation to the mean.
The formula to determine the coefficient of variation depends on whether the dataset represents an entire group or just a subset (sample coefficient of variation).
If the dataset represents an entire group, it is called the population coefficient of variation. The formula to find the coefficient of variation for a population is:
CV = ${\dfrac{\sigma }{\mu }\times 100\%}$
Here,
If the dataset represents only a subset of an entire group, it is called the sample coefficient of variation. The formula to find it is:
CV = ${\dfrac{s}{\overline{x}}\times 100\%}$
Here,
Let us find the coefficient of variation for the dataset {4.5, 5.0, 4.8, 5.1, 4.9}
Finding the Mean (μ)
μ = ${\dfrac{4.5+5.0+4.8+5.1+4.9}{5}}$ = ${\dfrac{24.3}{5}}$ = 4.86
Finding the Population Standard Deviation (σ)
As we know, σ = ${\sqrt{\dfrac{\sum \left( x_{i}-\mu \right) ^{2}}{N}}}$
Here, N = 5
Now, calculating the squared deviations, we get
(4.5 – 4.86)2 = 0.1296
(5.0 – 4.86)2 = 0.0196
(4.8 – 4.86)2 = 0.0036
(5.1 – 4.86)2 = 0.0576
(4.9 – 4.86)2 = 0.0016
Adding the squared deviations, we get
${\sum \left( x_{i}-\mu \right) ^{2}}$ = 0.1296 + 0.0196 + 0.0036 + 0.0576 + 0.0016 = 0.212
Thus, σ = ${\sqrt{\dfrac{0.212}{5}}}$ = 0.206
Calculating the Population Coefficient of Variation
CV = ${\dfrac{\sigma }{\mu }\times 100\%}$
= ${\dfrac{0.206}{4.86}\times 100\%}$
= 4.24%
Thus, the Coefficient of Variation is 4.24%
Let us find the coefficient of variation for the sample {12, 14, 15, 11, 13, 12}
Finding the Mean (${\overline{x}}$)
${\overline{x}}$ = ${\dfrac{12+14+15+11+13+12}{6}}$ = ${\dfrac{77}{6}}$ = 12.83
Finding the Sample Standard Deviation (s)
As we know, s = ${\sqrt{\dfrac{\sum \left( x_{i}-\overline{x}\right) ^{2}}{n-1}}}$
Here, n = 6
Now, calculating the squared deviations, we get
(12 – 12.83)2 = 0.6889
(14 – 12.83)2 = 1.3689
(15 – 12.83)2 = 4.6889
(11 – 12.83)2 = 3.3489
(13 – 12.83)2 = 0.0289
(12 – 12.83)2 = 0.6889
Adding the squared deviations, we get
${\sum \left( x_{i}-\overline{x}\right) ^{2}}$ = 0.6889 + 1.3689 + 4.6889 + 3.3489 + 0.0289 + 0.6889 = 10.8134
Thus, s = ${\sqrt{\dfrac{10.8134}{6-1}}}$ = ${\sqrt{\dfrac{10.8134}{5}}}$ = 1.47
Calculating the Sample Coefficient of Variation
CV = ${\dfrac{s}{\overline{x}}\times 100\%}$
= ${\dfrac{1.47}{12.83}\times 100\%}$
= 11.45%
Thus, the Coefficient of Variation is 11.45%
Both the coefficient of variation and the standard deviation are commonly used to measure the dispersion of values in a dataset. Although both metrics measure variability, their purposes and applications differ significantly.
Basis | Coefficient of Variation (CV) | Standard Deviation (SD) |
---|---|---|
Type of Dispersion | Measures the relative variability as a percentage of the mean | Measures the absolute variability in a dataset |
Units | It is unitless | It has the same unit as the data |
Impact of Small or Zero Means | Can get affected | Does not get affected |
Uses | It is used to compare the variation of different datasets with different units/scales | It is used to measure the dispersion of data around the mean within a single dataset |
Find the population coefficient of variation for the dataset: {8, 10, 12, 14, 16}
As we know,
CV = ${\dfrac{\sigma }{\mu }\times 100\%}$
Here,
N = 5
μ = ${\dfrac{8+10+12+14+16}{5}}$ = ${\dfrac{60}{5}}$ = 12
Now, ${\sum \left( x_{i}-\mu \right) ^{2}}$
= (8 – 12)2 + (10 – 12)2 + (12 – 12)2 + (14 – 12)2 + (16 – 12)2
= 16 + 4 + 0 + 4 + 16
= 40
Thus, σ = ${\sqrt{\dfrac{40}{5}}}$ = ${\sqrt{8}}$ = 2.83
Here, CV = ${\dfrac{2.83}{12}\times 100\%}$ ≈ 23.58%
Thus, the coefficient of variation for the dataset is approximately 23.58%
If two samples have the following data:
Sample 1: {5, 10, 15, 20, 25}
Sample 2: {10, 20, 30, 40, 50}
Which sample has greater relative variability?
As we know,
CV = ${\dfrac{s}{\overline{x}}\times 100\%}$
For sample 1,
${\overline{x}}$ = ${\dfrac{5+10+15+20+25}{5}}$ = ${\dfrac{75}{5}}$ = 15
${\sum \left( x_{i}-\overline{x}\right) ^{2}}$ = (5 – 15)2 + (10 – 15)2 + (15 – 15)2 + (20 – 15)2 + (25 – 15)2 = 250
s = ${\sqrt{\dfrac{250}{5-1}}}$ = ${\sqrt{62.5}}$ = 7.91
Thus, CV1 = ${\dfrac{7.91}{15}\times 100\%}$ ≈ 52.73% …..(i)
For sample 2,
${\overline{x}}$ = ${\dfrac{10+20+30+40+50}{5}}$ = ${\dfrac{150}{5}}$ = 30
${\sum \left( x_{i}-\overline{x}\right) ^{2}}$ = (10 – 30)2 + (20 – 30)2 + (30 – 30)2 + (40 – 30)2 + (50 – 30)2 = 1000
s = ${\sqrt{\dfrac{1000}{5-1}}}$ = ${\sqrt{250}}$ = 15.81
Thus, CV2 = ${\dfrac{15.81}{30}\times 100\%}$ ≈ 52.7 …..(ii)
From (i) and (ii), we get
CV1 and CV2 are approximately equal
Thus, both samples have the same relative variability, with a CV of approximately 52.7%
Last modified on January 2nd, 2025