
Last modified on January 2nd, 2025


Population and Sample Variance

Like standard deviation, variance comes in two forms, depending on whether the data covers an entire population or only a sample drawn from it. They are called population variance and sample variance.

We use population variance when we take all of the data in the dataset under consideration, whereas we use sample variance when we consider only a subset of the total population.

Let us now discuss them in detail.

Population Variance

Population variance measures the dispersion of data points across an entire population. It is represented by the Greek letter sigma squared (σ²).

In statistics, data can be ungrouped (raw) or grouped (organized into class intervals with frequencies). We can calculate the variance for each type of data.

For Ungrouped Data

Mathematically, the formula to find the population variance is:

${\sigma ^{2}=\dfrac{\sum \left( x_{i}-\mu \right) ^{2}}{N}}$

Here,

N = Total number of observations
x_i = Individual data point
μ = Population mean

Steps To Find

Let us find the population variance for the data points X = {2, 4, 6, 8}

To find the population variance of this ungrouped data, we use the following steps:

Finding the Mean

${\mu =\dfrac{2+4+6+8}{4}}$ = 5

Calculating (x_i – μ)²

(2 – 5)² = 9

(4 – 5)² = 1

(6 – 5)² = 1

(8 – 5)² = 9

Finding the Sum of Squares

${\sum \left( x_{i}-\mu \right) ^{2}}$ = 9 + 1 + 1 + 9 = 20

Dividing by N = 4

${\sigma ^{2}=\dfrac{\sum \left( x_{i}-\mu \right) ^{2}}{N}}$ = ${\dfrac{20}{4}}$ = 5

Thus, the population variance is 5.
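The four steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation; the function name `population_variance` is a hypothetical helper introduced here, and `statistics.pvariance` from the standard library is used only as a cross-check.

```python
import statistics

def population_variance(data):
    """Population variance: sum of squared deviations divided by N."""
    n = len(data)
    mu = sum(data) / n                            # step 1: the mean
    squared_devs = [(x - mu) ** 2 for x in data]  # step 2: (x_i - mu)^2
    total = sum(squared_devs)                     # step 3: sum of squares
    return total / n                              # step 4: divide by N

data = [2, 4, 6, 8]
print(population_variance(data))   # 5.0
print(statistics.pvariance(data))  # standard-library cross-check
```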

For Grouped Data

For grouped data, the population variance is calculated using the formula:

${\sigma ^{2}=\dfrac{\sum f\left( x_{i}-\mu \right) ^{2}}{n}}$

Here,

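f = frequency of each interval
x_i = midpoint of the i^th interval
μ = population mean of the grouped data
n = total number of observations, that is, the sum of the frequencies ∑f (written N in the ungrouped formula)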

Steps To Find

Let us calculate the population variance of the survey that records the ages of 35 individuals in a community.

Age Group (Years)	Frequency (f)
10 – 20	5
20 – 30	8
30 – 40	12
40 – 50	7
50 – 60	3

Finding the Midpoints

Age Group (Years)	Frequency (f)	Midpoint (x)
10 – 20	5	${\dfrac{10+20}{2}}$ = 15
20 – 30	8	${\dfrac{20+30}{2}}$ = 25
30 – 40	12	${\dfrac{30+40}{2}}$ = 35
40 – 50	7	${\dfrac{40+50}{2}}$ = 45
50 – 60	3	${\dfrac{50+60}{2}}$ = 55

Finding the Mean

The mean for grouped data is calculated as: ${\mu =\dfrac{\sum fx}{\sum f}}$

${\sum fx}$

= (5 × 15) + (8 × 25) + (12 × 35) + (7 × 45) + (3 × 55)

= 75 + 200 + 420 + 315 + 165

= 1175

${\sum f}$

= 5 + 8 + 12 + 7 + 3

= 35

Thus, μ = ${\dfrac{1175}{35}}$ ≈ 33.57

Finding (x – μ)² and f(x – μ)²

Each squared deviation below is computed with the unrounded mean μ = 1175/35 and then rounded to two decimal places.

x	f	x – μ	(x – μ)²	f(x – μ)²
15	5	15 – 33.57 = -18.57	344.90	5 × 344.90 ≈ 1724.49
25	8	25 – 33.57 = -8.57	73.47	8 × 73.47 ≈ 587.76
35	12	35 – 33.57 = 1.43	2.04	12 × 2.04 ≈ 24.49
45	7	45 – 33.57 = 11.43	130.61	7 × 130.61 ≈ 914.29
55	3	55 – 33.57 = 21.43	459.18	3 × 459.18 ≈ 1377.55

Thus, ${\sum f\left( x-\mu \right) ^{2}}$ ≈ 1724.49 + 587.76 + 24.49 + 914.29 + 1377.55 ≈ 4628.57

Calculating Population Variance

${\sigma ^{2}=\dfrac{\sum f\left( x_{i}-\mu \right) ^{2}}{n}}$

= ${\dfrac{4628.57}{35}}$ ≈ 132.24

Thus, the population variance of the grouped data is approximately 132.24.
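The grouped calculation can be checked with a short Python sketch. It assumes, as the worked example does, that every observation in an interval sits at the interval's midpoint. The function name `grouped_population_variance` is a hypothetical helper, and `fractions.Fraction` is used so that no intermediate value is rounded.

```python
from fractions import Fraction

def grouped_population_variance(midpoints, freqs):
    """Population variance of grouped data: sum f*(x - mu)^2 / sum f."""
    n_total = sum(freqs)                                   # sum of frequencies
    mu = Fraction(sum(f * x for x, f in zip(midpoints, freqs)), n_total)
    ss = sum(f * (x - mu) ** 2 for x, f in zip(midpoints, freqs))
    return ss / n_total

midpoints = [15, 25, 35, 45, 55]   # interval midpoints from the table
freqs = [5, 8, 12, 7, 3]           # frequencies from the table

var = grouped_population_variance(midpoints, freqs)
print(round(float(var), 2))        # 132.24
```

Working with exact fractions avoids the small drift that appears when the mean is rounded to 33.57 before squaring.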

Sample Variance

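Sample variance measures variability when the data represents a subset (sample) of the total population. Because the deviations are measured from the sample mean rather than the unknown population mean, dividing the sum of squared deviations by n tends to underestimate the population variance. To correct this bias, we divide by n – 1 instead, an adjustment known as Bessel’s correction.

The sample variance is represented by s².

Note: The correction matters most for small samples. With it, the sample variance is an unbiased estimator of the population variance; as n grows, the difference between dividing by n and dividing by n – 1 becomes negligible.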
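The effect of Bessel’s correction can be illustrated with a small exhaustive experiment. The sketch below makes assumptions that are not in the text above: it treats {2, 4, 6, 8} as the whole population and enumerates every ordered sample of size 2 drawn with replacement. The helper `variance` and its `ddof` parameter are hypothetical names chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

def variance(values, ddof):
    """Sum of squared deviations from the mean, divided by (n - ddof)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / (n - ddof)

population = [2, 4, 6, 8]               # assumed to be the full population
sigma2 = variance(population, ddof=0)   # population variance

# All 16 ordered samples of size 2, drawn with replacement
samples = list(product(population, repeat=2))

avg_div_n = sum(variance(s, ddof=0) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)

print(sigma2)             # 5
print(avg_div_n)          # 5/2
print(avg_div_n_minus_1)  # 5
```

Averaged over all possible samples, dividing by n gives only half of the population variance here, while dividing by n – 1 recovers it exactly.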

For Ungrouped Data

Mathematically, sample variance can be obtained using the formula:

${s^{2}=\dfrac{\sum \left( x_{i}-\overline{x}\right) ^{2}}{n-1}}$

Here,

n = Total number of observations
x_i = Individual data point
${\overline{x}}$ = Sample mean

Steps To Find

Let us consider the data points X = {2, 4, 6, 8}

To find the sample variance, we use the following steps:

Finding the Mean

${\overline{x}}$ = ${\dfrac{2+4+6+8}{4}}$ = 5

Calculating ${\left( x_{i}-\overline{x}\right) ^{2}}$

(2 – 5)² = 9

(4 – 5)² = 1

(6 – 5)² = 1

(8 – 5)² = 9

Finding the Sum of Squares

${\sum \left( x_{i}-\overline{x}\right) ^{2}}$ = 9 + 1 + 1 + 9 = 20

Dividing by n – 1

Here, n = 4 ⇒ n – 1 = 4 – 1 = 3

Now,

${s^{2}=\dfrac{\sum \left( x_{i}-\overline{x}\right) ^{2}}{n-1}}$ = ${\dfrac{20}{3}}$ ≈ 6.67

Thus, the sample variance is approximately 6.67.
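The same steps, with n – 1 as the divisor, can be sketched in Python. The function name `sample_variance` is a hypothetical helper; `statistics.variance` from the standard library also divides by n – 1 and serves as a cross-check.

```python
import statistics

def sample_variance(data):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n                             # sample mean
    total = sum((x - x_bar) ** 2 for x in data)       # sum of squares
    return total / (n - 1)                            # Bessel's correction

data = [2, 4, 6, 8]
print(round(sample_variance(data), 2))       # 6.67
print(round(statistics.variance(data), 2))   # 6.67
```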

For Grouped Data

Similarly, for grouped data, it is calculated by the formula:

${s^{2}=\dfrac{\sum f\left( x_{i}-\overline{x}\right) ^{2}}{n-1}}$

Here,

f = frequency of each interval
x_i = midpoint of the i^th interval
${\overline{x}}$ = sample mean of the grouped data
n = total number of observations, that is, the sum of the frequencies ∑f

Steps To Find

Now, let us calculate the sample variance from the survey of the ages of 35 individuals in a community.

Age Group (Years)	Frequency (f)
10 – 20	5
20 – 30	8
30 – 40	12
40 – 50	7
50 – 60	3

Finding the Midpoints

We have:

Age Group (Years)	Frequency (f)	Midpoint (x)
10 – 20	5	15
20 – 30	8	25
30 – 40	12	35
40 – 50	7	45
50 – 60	3	55

Finding the Mean

${\sum fx}$

= (5 × 15) + (8 × 25) + (12 × 35) + (7 × 45) + (3 × 55)

= 1175

${\sum f}$

= 5 + 8 + 12 + 7 + 3

= 35

Thus, ${\overline{x}}$ = ${\dfrac{1175}{35}}$ ≈ 33.57

Finding ${\left( x-\overline{x}\right) ^{2}}$ and ${f\left( x-\overline{x} \right) ^{2}}$

Using the unrounded mean 1175/35 and rounding each entry to two decimal places, we have:

x	f	${x-\overline{x}}$	${\left( x-\overline{x}\right) ^{2}}$	${f\left( x-\overline{x} \right) ^{2}}$
15	5	-18.57	344.90	1724.49
25	8	-8.57	73.47	587.76
35	12	1.43	2.04	24.49
45	7	11.43	130.61	914.29
55	3	21.43	459.18	1377.55

Thus, ${\sum f\left( x-\overline{x} \right) ^{2}}$ ≈ 4628.57

Calculating Sample Variance

${s^{2}=\dfrac{\sum f\left( x_{i}-\overline{x}\right) ^{2}}{n-1}}$

= ${\dfrac{4628.57}{35-1}}$

= ${\dfrac{4628.57}{34}}$ ≈ 136.13

Thus, the sample variance of the grouped data is approximately 136.13.
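One way to check the grouped sample variance is to compute it twice: once from the formula, and once by expanding the frequency table into 35 individual values and passing them to `statistics.variance`. The sketch below assumes every observation sits at its interval midpoint; `grouped_sample_variance` is a hypothetical helper name.

```python
import statistics
from fractions import Fraction

def grouped_sample_variance(midpoints, freqs):
    """Sample variance of grouped data: sum f*(x - x_bar)^2 / (n - 1)."""
    n = sum(freqs)                                          # n = sum of frequencies
    x_bar = Fraction(sum(f * x for x, f in zip(midpoints, freqs)), n)
    ss = sum(f * (x - x_bar) ** 2 for x, f in zip(midpoints, freqs))
    return ss / (n - 1)

midpoints = [15, 25, 35, 45, 55]   # interval midpoints from the table
freqs = [5, 8, 12, 7, 3]           # frequencies from the table

s2 = grouped_sample_variance(midpoints, freqs)

# Cross-check: repeat each midpoint f times, then use the standard library
expanded = [x for x, f in zip(midpoints, freqs) for _ in range(f)]
check = statistics.variance(expanded)

print(float(s2), check)
```

Both routes should print the same value, since the expanded list is just the frequency table written out one observation at a time.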

Note: For the same dataset, the sample variance is greater than the population variance in both the ungrouped and grouped examples, because the same sum of squared deviations is divided by n – 1 instead of N.

Properties

Non-Negativity

Variance is always non-negative because it is the average of squared deviations. Mathematically:

σ² ≥ 0
s² ≥ 0

Zero Variance

If all data points in a population or a sample are identical, the variance equals 0. That is,

σ² = 0 when x_i = μ, ∀ i

s² = 0 when x_i = ${\overline{x}}$, ∀ i

Units

Variance is measured in squared units of the data. For example, if data is measured in meters, the population and sample variances are in square meters.

Adding a Constant

If a constant c is added to all data points, the population and sample variances remain unchanged. That is,

Var(x_i + c) = Var(x_i)

Multiplying by a Constant

If all data points are multiplied by a constant c, the population and sample variances are scaled by c²:

Var(c ⋅ x_i) = c² ⋅ Var(x_i)
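The shift and scale properties can be checked numerically. The sketch below reuses the data points {2, 4, 6, 8} with an arbitrarily chosen constant c = 3 (an assumption made for illustration); the values are stored as `Fraction` objects so that the equality checks are exact.

```python
import statistics
from fractions import Fraction

data = [Fraction(x) for x in (2, 4, 6, 8)]
c = 3                                  # arbitrary constant for the demo

shifted = [x + c for x in data]        # add c to every data point
scaled = [c * x for x in data]         # multiply every data point by c

# Adding a constant leaves both variances unchanged
shift_ok = (statistics.pvariance(shifted) == statistics.pvariance(data)
            and statistics.variance(shifted) == statistics.variance(data))

# Multiplying by c scales both variances by c squared
scale_ok = (statistics.pvariance(scaled) == c**2 * statistics.pvariance(data)
            and statistics.variance(scaled) == c**2 * statistics.variance(data))

print(shift_ok, scale_ok)   # True True
```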

Additivity

For independent random variables X and Y, the variance of their sum is:

Var(X + Y) = Var(X) + Var(Y)
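Additivity can be verified exactly for small discrete distributions. In the sketch below, X and Y are hypothetical variables, uniform on {1, 2, 3} and {10, 20} respectively. Independence is modeled by treating every (x, y) pair as equally likely. The helper `pvar` is an illustrative name, not a library function.

```python
from fractions import Fraction
from itertools import product

def pvar(values):
    """Variance of a variable that takes each listed value with equal probability."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n

x_values = [1, 2, 3]    # hypothetical X
y_values = [10, 20]     # hypothetical Y

# Independent case: all 6 (x, y) pairs are equally likely
sums = [x + y for x, y in product(x_values, y_values)]
print(pvar(sums) == pvar(x_values) + pvar(y_values))   # True

# Dependent case (Y = X): the variances no longer simply add
doubled = [x + x for x in x_values]
print(pvar(doubled) == 2 * pvar(x_values))             # False
```

The second check shows why the independence condition matters: when Y is a copy of X, Var(X + Y) = Var(2X) = 4 ⋅ Var(X), not 2 ⋅ Var(X).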

Solved Examples

A company measures the heights (in cm) of 5 employees in a department. The data is as follows: 150, 160, 170, 180, 190. Find the population variance.

Solution:

Here,
Mean = μ = ${\dfrac{150+160+170+180+190}{5}}$ = 170
The squared differences = (x_i – μ)²
(150 – 170)² = (-20)² = 400
(160 – 170)² = (-10)² = 100
(170 – 170)² = (0)² = 0
(180 – 170)² = (10)² = 100
(190 – 170)² = (20)² = 400
The sum of the squared differences = ${\sum \left( x_{i}-\mu \right) ^{2}}$ = 400 + 100 + 0 + 100 + 400 = 1000
As we know, the population variance is
${\sigma ^{2}=\dfrac{\sum \left( x_{i}-\mu \right) ^{2}}{N}}$
Now, by using the formula with N = 5, we get
σ² = ${\dfrac{1000}{5}}$ = 200
Thus, the population variance is 200 cm².

A researcher randomly selects 4 students’ scores from a class: 12, 14, 16, 18. Find the sample variance.

Solution:

Here,
Mean = ${\overline{x}}$ = ${\dfrac{12+14+16+18}{4}}$ = 15
The squared differences = ${\left( x_{i}-\overline{x}\right) ^{2}}$
(12 – 15)² = (-3)² = 9
(14 – 15)² = (-1)² = 1
(16 – 15)² = (1)² = 1
(18 – 15)² = (3)² = 9
The sum of the squared differences = ${\sum \left( x_{i}-\overline{x}\right) ^{2}}$ = 9 + 1 + 1 + 9 = 20
As we know, the sample variance is
${s^{2}=\dfrac{\sum \left( x_{i}-\overline{x}\right) ^{2}}{n-1}}$
Now, by using the formula with n = 4, we get
s² = ${\dfrac{20}{4-1}}$ ≈ 6.67
Thus, the sample variance is approximately 6.67.
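Both solved examples can be cross-checked with Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by n – 1. The variable names below are chosen here for illustration.

```python
import statistics

heights_cm = [150, 160, 170, 180, 190]   # first example: a full population
scores = [12, 14, 16, 18]                # second example: a sample

pop_var = statistics.pvariance(heights_cm)   # population variance, in cm²
samp_var = statistics.variance(scores)       # sample variance

print(pop_var)
print(round(samp_var, 2))   # 6.67
```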
