Last modified on December 20th, 2024

chapter outline

 

Population and Sample Standard Deviation

While finding the standard deviation of a data set, we may encounter two different sets of data based on its scope and size. The two types are population standard deviation and sample standard deviation.

Let us now discuss them in detail.  

Population Standard Deviation 

The population standard deviation represents the entire population of an area under consideration, such as a national census or during a financial report. Thus, it includes all individuals in a population.

The population standard deviation is generally represented by the Greek letter ‘σ

In statistics, data can be ungrouped (raw) or grouped data (well-organized). We can calculate the standard deviation for each type of data.

For Ungrouped Data

We calculate the population standard deviation of a dataset using the formula:  

${\sigma =\sqrt{\dfrac{\sum \left( x_{i}-\mu \right) ^{2}}{N}}}$ 

Here,

  • N = Total number of data points in the population
  • xi = Individual data point
  • μ = Population mean

Steps To Find

Let us find the population standard deviation for the dataset 5, 8, 10, 12, and 15

Finding the mean (μ)

As we know, 

Mean = (Sum of data points) ÷ (Total data points)

= ${\dfrac{5+8+10+12+15}{5}}$ = 10

Calculating the Squared Differences ((xi – μ)2)

(5 – 10)2 = 25

(8 – 10)2 = 4

(10 – 10)2 = 0

(12 – 10)2 = 4

(15 – 10)2 = 25

Here, the sum of the squared differences is: 

${\sum \left( x_{i}-\mu \right) ^{2}}$ = 25 + 4 + 0 + 4 + 25 = 58

Using the Formula 

Thus, the population standard deviation is ${\sigma =\sqrt{\dfrac{\sum \left( x_{i}-\mu \right) ^{2}}{N}}}$

= ${\sqrt{\dfrac{58}{5}}}$ = ${\sqrt{11.6}}$ ≈ 3.41

Thus, the population standard deviation of the ungrouped data is 3.41

For Grouped Data

The population standard deviation (σ) formula for grouped data is calculated as:

${\sigma =\sqrt{\dfrac{\sum f\left( x_{i}-\mu \right) ^{2}}{N}}}$

Here,

  • f = frequency of each interval
  • xi = midpoint of the ith interval
  • μ = population mean of the grouped data

Steps To Find

Let us calculate the population standard deviation of the survey that records the test scores of 50 students.

Score IntervalFrequency (f)
0 – 105
10 – 208
20 – 3012
30 – 4015
40 – 5010

Finding the Midpoints

Score IntervalFrequency (f)Midpoint (x)
0 – 105${\dfrac{0+10}{2}}$ = 5
10 – 208${\dfrac{10+20}{2}}$ = 15
20 – 3012${\dfrac{20+30}{2}}$ = 25
30 – 4015${\dfrac{30+40}{2}}$ = 35
40 – 5010${\dfrac{40+50}{2}}$ = 45

Finding the Mean

The mean for grouped data is calculated as: ${\mu =\dfrac{\sum fx}{\sum f}}$

${\sum fx}$ 

= (5 × 5) + (8 × 15) + (12 × 25) + (15 × 35) + (10 × 45) 

= 25 + 120 + 300 + 525 + 450 

= 1420

${\sum f}$

= 5 + 8 + 12 + 15 + 10

= 50

Thus, μ = ${\dfrac{1420}{50}}$ = 28.4

Finding (x – μ)2 and f(x – μ)2

xfx – μ(x – μ)2f(x – μ)2
555 – 28.4 = -23.4547.565 × 547.56 = 2737.8
15815 – 28.4 = -13.4179.568 × 179.56 = 1436.48
251225 – 28.4 = -3.411.5612 × 11.56 = 138.72
351535 – 28.4 = 6.643.5615 × 43.56 = 653.4
451045 – 28.4 = 16.6275.5610 × 275.56 = 2755.6

Thus, ${\sum f\left( x-\mu \right) ^{2}}$ = 2737.8 + 1436.48 + 138.72 + 653.4 + 2755.6 = 7721.6

Calculating Population Standard Deviation

${\sigma =\sqrt{\dfrac{\sum f\left( x_{i}-\mu \right) ^{2}}{N}}}$

= ${\sqrt{\dfrac{7721.6}{50}}}$

= ${\sqrt{154.43}}$ ≈ 12.43

Thus, the population standard deviation of the grouped data is 12.43

Sample Standard Deviation

The sample standard deviation estimates the standard deviation of a dataset, which is a subset of the population. It accounts for the potential bias in smaller datasets by using n – 1 (Bessel’s correction) instead of the total number n

It is commonly used in research and surveys, such as medical trials and market research, where only a subset of the population is analyzed.

The sample standard deviation is generally represented by the letter ‘s

For Ungrouped Data

We calculate the sample standard deviation of a population data set using the formula: 

${s=\sqrt{\dfrac{\sum \left( x_{i}-\overline{x}\right) ^{2}}{n-1}}}$

Here,

  • n = Total number of data points in the sample.
  • xi = Individual data point.
  • ${\overline{x}}$ = Sample mean

Note: Bessel’s correction ensures the sample standard deviation is an unbiased approximation of the population standard deviation. Without this correction, the variability in smaller samples cannot be estimated properly.

Steps To Find 

Let us find the population standard deviation for the dataset 7, 9, 12, 13, and 16 

Finding the mean (${\overline{x}}$)

As we know, 

Mean = (Sum of data points) ÷ (Total data points)

= ${\dfrac{7+9+12+13+16}{5}}$ = 11.4

Calculating the Squared Differences (${\left( x_{i}-\overline{x}\right) ^{2}}$)

(7 – 11.4)2 = (-4.4)2 = 19.36

(9 – 11.4)2 = (-2.4)2 = 5.76

(12 – 11.4)2 = (0.6)2 = 0.36

(13 – 11.4)2 = (1.6)2 = 2.56

(16 – 11.4)2 = (4.6)2 = 21.16

Here, the sum of the squared differences is: 

${\sum \left( x_{i}-\overline{x}\right) ^{2}}$ = 19.36 + 5.76 + 0.36 + 2.56 + 21.16 = 49.2

Using the Formula 

As we know, the sample standard deviation is given by the formula:

 ${s=\sqrt{\dfrac{\sum \left( x_{i}-\overline{x}\right) ^{2}}{n-1}}}$

= ${\sqrt{\dfrac{49.2}{5-1}}}$ = ${\sqrt{\dfrac{49.2}{4}}}$ = ${\sqrt{12.3}}$ ≈ 3.51

Thus, the sample standard deviation of the ungrouped data is 3.51

For Grouped Data

Similarly, for grouped data, it is calculated by using the formula:

${s=\sqrt{\dfrac{\sum f\left( x_{i}-\overline{x}\right) ^{2}}{n-1}}}$

Here

  • f = frequency of each interval
  • xi = midpoint of the ith interval
  • ${\overline{x}}$ = sample mean of the grouped data

Steps To Find

Now, let us calculate the sample standard deviation from the survey recording the test scores of 50 students.

Score IntervalFrequency (f)
0 – 105
10 – 208
20 – 3012
30 – 4015
40 – 5010

Finding the Midpoints

We have:

Score IntervalFrequency (f)Midpoint (x)
0 – 1055
10 – 20815
20 – 301225
30 – 401535
40 – 501045

Finding the Mean

${\sum fx}$ 

= (5 × 5) + (8 × 15) + (12 × 25) + (15 × 35) + (10 × 45) 

= 1420

${\sum f}$

= 5 + 8 + 12 + 15 + 10

= 50

Thus, ${\overline{x}}$ = ${\dfrac{1420}{50}}$ = 28.4

Finding ${\left( x-\overline{x}\right) ^{2}}$ and ${f\left( x-\overline{x} \right) ^{2}}$ 

We have:

xf${x-\overline{x}}$${\left( x-\overline{x}\right) ^{2}}$${f\left( x-\overline{x} \right) ^{2}}$
55-23.4547.562737.8
158-13.4179.561436.48
2512-3.411.56138.72
35156.643.56653.4
451016.6275.562755.6

Thus, ${\sum f\left( x-\overline{x} \right) ^{2}}$ = 7721.6

Calculating Sample Standard Deviation

${s=\sqrt{\dfrac{\sum f\left( x_{i}-\overline{x}\right) ^{2}}{n-1}}}$

= ${\sqrt{\dfrac{7721.6}{50-1}}}$

= ${\sqrt{\dfrac{7721.6}{49}}}$

= ${\sqrt{157.58}}$ ≈ 12.55

Thus, the sample standard deviation of the grouped data is 12.55

Below is a summary of the formulas for finding the population and sample standard deviation of a dataset:

Solved Examples

The monthly sales (in thousands of dollars) for a small business over 5 months are: 12, 15, 20, 25, 30. Find the sample standard deviation.

Solution:

Here, 
The sample mean is ${\overline{x}}$ = ${\dfrac{12+15+20+25+30}{5}}$ = 20.4
The squared differences are:
(12 – 20.4)2 = (-8.4)2 = 70.56
(15 – 20.4)2 = (-5.4)2 = 29.16
(20 – 20.4)2 = (-0.4)2 = 0.16
(25 – 20.4)2 = (4.6)2 = 21.16
(30 – 20.4)2 = (9.6)2 = 92.16
The sum of the squared differences is ${\sum \left( x_{i}-\overline{x}\right) ^{2}}$ = 70.56 + 29.16 + 0.16 + 21.16 + 92.16 = 213.2
Now, the sample standard deviation is ${s=\sqrt{\dfrac{\sum \left( x_{i}-\overline{x}\right) ^{2}}{n-1}}}$
= ${\sqrt{\dfrac{213.2}{5-1}}}$ = ${\sqrt{\dfrac{213.2}{4}}}$ = ${\sqrt{53.3}}$ ≈ 7.3
Thus, the sample standard deviation is 7.3 thousand dollars.

The test scores for a group of students are: 85, 90, 88, 92, 87, 89, 91, 86. Calculate the population standard deviation.

Solution:

Here, 
The population mean is μ = ${\dfrac{85+90+88+92+87+89+91+86}{8}}$ = 88.5
The squared differences are:
(85 – 88.5)2 = (-3.5)2 = 12.25
(90 – 88.5)2 = (1.5)2 = 2.25
(88 – 88.5)2 = (-0.5)2 = 0.25
(92 – 88.5)2 = (3.5)2 = 12.25
(87 – 88.5)2 = (-1.5)2 = 2.25
(89 – 88.5)2 = (0.5)2 = 0.25
(91 – 88.5)2 = (2.5)2 = 6.25
(86 – 88.5)2 = (-2.5)2 = 6.25
The sum of the squared differences is ${\sum \left( x_{i}-\mu \right) ^{2}}$ = 12.25 + 2.25 + 0.25 + 12.25 + 2.25 + 0.25 + 6.25 + 6.25 = 42
Now, the sample standard deviation is ${\sigma =\sqrt{\dfrac{\sum \left( x_{i}-\mu \right) ^{2}}{N}}}$
= ${\sqrt{\dfrac{42}{8}}}$ = ${\sqrt{5.25}}$ ≈ 2.29
Thus, the population standard deviation is 2.29.

Last modified on December 20th, 2024