Last modified on April 16th, 2024


Cauchy Schwarz Inequality

The Cauchy-Schwarz inequality, also known as the Cauchy-Bunyakovsky-Schwarz inequality, states that the absolute value of the inner product of two vectors is less than or equal to the product of their magnitudes.

Let ‘V’ be an inner product space over the field ‘F’ of real (ℝ) or complex numbers (ℂ) with the inner product ⟨⋅, ⋅⟩

Then, for every pair of vectors x, y Є V, the inequality can be written as:

|⟨x, y⟩|² ≤ ⟨x, x⟩ ⋅ ⟨y, y⟩

Equivalently, |⟨x, y⟩| ≤ ||x|| ⋅ ||y||

Here, the equality |⟨x, y⟩| = ||x|| ⋅ ||y|| holds if and only if the two vectors, ‘x’ and ‘y’, are linearly dependent, that is, one of them is a scalar multiple of the other (for example, y = ax, for some a Є F).
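The statement and its equality condition can be checked numerically. The following is a minimal sketch, assuming the standard inner product on ℂⁿ (linear in the first argument, conjugate-linear in the second); the helper names `inner`, `norm`, and `cauchy_schwarz_gap` and the sample vectors are illustrative choices, not part of the original text.

```python
import math

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))

def norm(x):
    """Norm induced by the inner product, ||x|| = sqrt(<x, x>)."""
    return math.sqrt(inner(x, x).real)

def cauchy_schwarz_gap(x, y):
    """Return ||x|| * ||y|| - |<x, y>|, which the inequality says is never negative."""
    return norm(x) * norm(y) - abs(inner(x, y))

# Two vectors that are not scalar multiples of each other (illustrative values).
x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, -1 + 4j, 3]

# A vector that is a scalar multiple of x, so equality is expected.
a = 2 - 3j
y_dep = [a * xi for xi in x]

gap_generic = cauchy_schwarz_gap(x, y)        # expected to be strictly positive
gap_dependent = cauchy_schwarz_gap(x, y_dep)  # expected to be zero up to rounding
print(gap_generic, gap_dependent)
```

In floating-point arithmetic the dependent case gives a gap that is only approximately zero, so any comparison should use a small tolerance.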

The Cauchy-Schwarz inequality is used to prove the triangle inequality and the AM-GM inequality.

Proof

Let ‘V’ be an inner product space over the real or complex field ‘F’ and x, y Є V

To prove the inequality, we will first prove that |⟨x, y⟩|² = ⟨x, x⟩ ⋅ ⟨y, y⟩, if y = ax, for some a Є F

Let us consider y = ax, for some a Є F

From the properties of inner product space, we get

${\left| \langle x,y\rangle \right| ^{2}}$

= ${\left| \langle x,ax\rangle \right| ^{2}}$

= ${\left| \overline{a}\langle x,x\rangle \right| ^{2}}$

= ${\left| \overline{a}\right| ^{2}\left| \langle x,x\rangle \right| ^{2}}$

= ${\left| a\right| ^{2}\langle x,x\rangle ^{2}}$

Now, ${\langle x,x\rangle \cdot \langle y,y\rangle}$

= ${\langle x,x\rangle \cdot \langle ax,ax\rangle}$

= ${\langle x,x\rangle a\overline{a}\langle x,x\rangle}$

= ${\left| a\right| ^{2}\langle x,x\rangle ^{2}}$

Thus, if y = ax, for some a Є F, then |⟨x, y⟩|² = ⟨x, x⟩ ⋅ ⟨y, y⟩

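Now, we will prove |⟨x, y⟩|² < ⟨x, x⟩ ⋅ ⟨y, y⟩ when ‘x’ and ‘y’ are linearly independent, that is, when neither vector is a scalar multiple of the other (the only remaining dependent case, x = 0, gives equality trivially, since both sides are 0)

Now, let us consider y ≠ ax and x ≠ ay, for every a Є F

This implies y ≠ 0 and ⟨y, y⟩ ≠ 0

Since x – ay ≠ 0, the positive-definiteness of the inner product gives ⟨x – ay, x – ay⟩ > 0, for all a Є F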

Now, ${\langle x-ay,x-ay\rangle}$

= ${\langle x,x-ay\rangle -a\langle y,x-ay\rangle}$

= ${\langle x,x\rangle -\overline{a}\langle x,y\rangle -a\langle y,x\rangle +\left| a\right| ^{2}\langle y,y\rangle}$

By choosing a = ${\dfrac{\langle x,y\rangle }{\langle y,y\rangle }}$, we get

${\langle x,x-ay\rangle -a\langle y,x-ay\rangle}$

= ${\langle x,x\rangle -\dfrac{\langle y,x\rangle }{\langle y,y\rangle }\langle x,y\rangle -\dfrac{\langle x,y\rangle }{\langle y,y\rangle }\langle y,x\rangle +\dfrac{\langle x,y\rangle \langle y,x\rangle }{\langle y,y\rangle ^{2}}\langle y,y\rangle}$

= ${\langle x,x\rangle -\dfrac{\langle x,y\rangle \langle y,x\rangle }{\langle y,y\rangle }}$

Since ⟨x – ay, x – ay⟩ > 0

⇒ ${\langle x,x\rangle -\dfrac{\langle x,y\rangle \langle y,x\rangle }{\langle y,y\rangle }}$ > 0

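Multiplying both sides by ⟨y, y⟩ > 0 and using ${\langle y,x\rangle =\overline{\langle x,y\rangle }}$,

⇒ ${\langle x,x\rangle \langle y,y\rangle -\langle x,y\rangle \overline{\langle x,y\rangle } >0}$

⇒ ${\langle x,x\rangle \langle y,y\rangle >\left| \langle x,y\rangle \right| ^{2}}$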

We conclude that if ‘x’ and ‘y’ are linearly independent, then |⟨x, y⟩|² < ⟨x, x⟩ ⋅ ⟨y, y⟩. Combining this with the equality case, for all x, y Є V,

${\left| \langle x,y\rangle \right| ^{2}\leq \langle x,x\rangle \cdot \langle y,y\rangle}$

⇒ ${\left| \langle x,y\rangle \right| \leq \sqrt{\langle x,x\rangle }\cdot \sqrt{\langle y,y\rangle }}$

⇒ |⟨x, y⟩| ≤ ||x|| ⋅ ||y||

Thus, the vector form of the Cauchy-Schwarz inequality is proved.
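The key step of the proof, choosing a = ⟨x, y⟩ / ⟨y, y⟩, can be illustrated numerically. The sketch below assumes the standard inner product on ℂⁿ and uses a hypothetical helper `residual_identity` to compare ⟨x – ay, x – ay⟩ with ⟨x, x⟩ – |⟨x, y⟩|² / ⟨y, y⟩ for one pair of sample vectors.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))

def residual_identity(x, y):
    """Return both sides of <x - a y, x - a y> = <x, x> - |<x, y>|^2 / <y, y>,
    where a = <x, y> / <y, y> is the choice made in the proof."""
    yy = inner(y, y).real
    if yy == 0:
        raise ValueError("y must be non-zero")
    a = inner(x, y) / yy
    residual = [xi - a * yi for xi, yi in zip(x, y)]
    lhs = inner(residual, residual).real
    rhs = inner(x, x).real - abs(inner(x, y)) ** 2 / yy
    return lhs, rhs

# Illustrative vectors that are not scalar multiples of each other.
x = [1 + 1j, 2, -1j]
y = [3, 1 - 2j, 2 + 2j]

lhs, rhs = residual_identity(x, y)
print(lhs, rhs)  # the two values should agree up to rounding and be positive
```

Because the left side is a squared norm of a non-zero vector, it is positive, which is exactly what forces ⟨x, x⟩ ⋅ ⟨y, y⟩ to exceed |⟨x, y⟩|² in the independent case.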

Alternative Forms

Over time, the Cauchy-Schwarz inequality has been expressed in various ways:

Integral Form

Statement

If ‘a’ and ‘b’ are real numbers with a < b, and two functions ‘f’ and ‘g’ are integrable on [a, b], then ${\left| \int ^{b}_{a}f\left( x\right) g\left( x\right) dx\right| ^{2}\leq \left[ \int ^{b}_{a}\left( f\left( x\right) \right) ^{2}dx\right] \left[ \int ^{b}_{a}\left( g\left( x\right) \right) ^{2}dx\right]}$, where the equality holds if f(x) is proportional to g(x) almost everywhere on [a, b]

Proof

Let ‘a’ and ‘b’ be real numbers, with a < b, and f, g: [a, b] → ℝ be two functions. 

We assume that ‘f’ and ‘g’ are continuous on [a, b] and that neither of them is identically zero on [a, b] (if either one is identically zero, both sides of the inequality are 0).

Now, let us define the function F: ℝ → ℝ by F(t) = ${\int ^{b}_{a}\left( tf\left( x\right) +g\left( x\right) \right) ^{2}dx}$ for any t Є ℝ. 

For any x Є [a, b] and t Є ℝ, we have (t f(x) + g(x))² ≥ 0

⇒ ${F\left( t\right) =\int ^{b}_{a}\left( tf\left( x\right) +g\left( x\right) \right) ^{2}dx\geq 0}$

Now, considering A = ${\int ^{b}_{a}\left( f\left( x\right) \right) ^{2}dx}$, B = ${2\int ^{b}_{a}f\left( x\right) g\left( x\right) dx}$, and C = ${\int ^{b}_{a}\left( g\left( x\right) \right) ^{2}dx}$

By definition, for any t Є ℝ, we have

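${F\left( t\right) =\int ^{b}_{a}\left( tf\left( x\right) +g\left( x\right) \right) ^{2}dx=\int ^{b}_{a}\left[ t^{2}\left( f\left( x\right) \right) ^{2}+2tf\left( x\right) g\left( x\right) +\left( g\left( x\right) \right) ^{2}\right] dx=t^{2}\int ^{b}_{a}\left( f\left( x\right) \right) ^{2}dx+2t\int ^{b}_{a}f\left( x\right) g\left( x\right) dx+\int ^{b}_{a}\left( g\left( x\right) \right) ^{2}dx}$

= At² + Bt + C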

Since (f(x))² ≥ 0 for any x Є [a, b], and ‘f’ is continuous and not identically zero on [a, b], then

A = ${\int ^{b}_{a}\left( f\left( x\right) \right) ^{2}dx >0}$

Thus, ‘F’ is a quadratic polynomial function with real coefficients, and the discriminant of F is ∆ = B² – 4AC. 

Since A > 0 and F(t) ≥ 0 for any t Є ℝ, F cannot have two distinct real roots, so

∆ ≤ 0

⇒ ${B^{2}-4AC\leq 0}$

⇒ ${B^{2}\leq 4AC}$

⇒ ${\dfrac{B^{2}}{4}\leq AC}$

Hence, ${\left| \int ^{b}_{a}f\left( x\right) g\left( x\right) dx\right| ^{2}=\dfrac{B^{2}}{4}\leq AC=\left[ \int ^{b}_{a}\left( f\left( x\right) \right) ^{2}dx\right] \left[ \int ^{b}_{a}\left( g\left( x\right) \right) ^{2}dx\right]}$

⇒ ${\left| \int ^{b}_{a}f\left( x\right) g\left( x\right) dx\right| ^{2}\leq \left[ \int ^{b}_{a}\left( f\left( x\right) \right) ^{2}dx\right] \left[ \int ^{b}_{a}\left( g\left( x\right) \right) ^{2}dx\right]}$. Thus, the Cauchy-Schwarz inequality is proved for integrals.
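The discriminant argument can be illustrated with a numerical approximation. This is a rough sketch, assuming a simple composite midpoint rule for the integrals and the sample functions f(x) = sin x and g(x) = eˣ on [0, 1]; the helpers `integrate` and `quadratic_coefficients` are hypothetical names introduced only for this example.

```python
import math

def integrate(h, a, b, n=10_000):
    """Approximate the integral of h over [a, b] with the composite midpoint rule."""
    width = (b - a) / n
    return width * sum(h(a + (i + 0.5) * width) for i in range(n))

def quadratic_coefficients(f, g, a, b):
    """Return A, B, C such that F(t) = A t^2 + B t + C, as defined in the proof."""
    A = integrate(lambda x: f(x) ** 2, a, b)
    B = 2 * integrate(lambda x: f(x) * g(x), a, b)
    C = integrate(lambda x: g(x) ** 2, a, b)
    return A, B, C

# Sample choice of f and g (an assumption for illustration only).
A, B, C = quadratic_coefficients(math.sin, math.exp, 0.0, 1.0)

# F(t) >= 0 for all t, so the discriminant should not be positive.
discriminant = B ** 2 - 4 * A * C
print(A, B, C, discriminant)
```

The values are approximations of the integrals, so this is a sanity check of the argument rather than a proof.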

Summation Form

Statement

If x₁, x₂, …, xₙ and y₁, y₂, …, yₙ are real numbers, then ${\left( \sum ^{n}_{i=1}x_{i}y_{i}\right) ^{2}\leq \sum ^{n}_{i=1}x_{i}^{2}\sum ^{n}_{i=1}y_{i}^{2}}$ where the equality holds if a constant λ exists such that xₖ = λyₖ for each k Є {1, 2, …, n}.

Proof

Here, we will use the method of mathematical induction.

For n = 1: The case is trivial, since (x₁y₁)² = (x₁)²(y₁)².

For n = 2: If x₁, x₂ and y₁, y₂ are real numbers, then

(x₁y₂ – x₂y₁)² ≥ 0

⇒ (x₁y₂)² + (x₂y₁)² – 2(x₁y₂x₂y₁) ≥ 0

⇒ (x₁y₂)² + (x₂y₁)² ≥ 2(x₁y₁x₂y₂)

Again, by adding (x₁y₁)² + (x₂y₂)² to both sides,

(x₁y₁)² + (x₁y₂)² + (x₂y₁)² + (x₂y₂)² ≥ (x₁y₁)² + 2(x₁y₁x₂y₂) + (x₂y₂)² 

⇒ [(x₁)² + (x₂)²] [(y₁)² + (y₂)²] ≥ (x₁y₁ + x₂y₂)²

By taking the square roots of both sides,

${\sqrt{x_{1}^{2}+x_{2}^{2}}\sqrt{y_{1}^{2}+y_{2}^{2}}\geq \left| x_{1}y_{1}+x_{2}y_{2}\right|}$

⇒ ${\left| x_{1}y_{1}+x_{2}y_{2}\right| \leq \sqrt{x_{1}^{2}+x_{2}^{2}}\sqrt{y_{1}^{2}+y_{2}^{2}}}$, which proves the inequality for n = 2 …..(i)

Now, let us assume that the inequality holds for n = k, where k ≥ 2 is an arbitrary integer.

${\left( \sum ^{k}_{i=1}x_{i}y_{i}\right) ^{2}\leq \sum ^{k}_{i=1}x_{i}^{2}\sum ^{k}_{i=1}y_{i}^{2}}$ …..(ii)

For n = k + 1: We have

${\sqrt{\sum ^{k+1}_{i=1}x_{i}^{2}}\cdot \sqrt{\sum ^{k+1}_{i=1}y_{i}^{2}}=\sqrt{\sum ^{k}_{i=1}x_{i}^{2}+x_{k+1}^{2}}\cdot \sqrt{\sum ^{k}_{i=1}y_{i}^{2}+y_{k+1}^{2}}}$ …..(iii)

Applying the n = 2 case (i) to the pairs ${\sqrt{\sum ^{k}_{i=1}x_{i}^{2}}}$, ${\left| x_{k+1}\right|}$ and ${\sqrt{\sum ^{k}_{i=1}y_{i}^{2}}}$, ${\left| y_{k+1}\right|}$, the R-H-S of the equation (iii) satisfies

${\sqrt{\sum ^{k}_{i=1}x_{i}^{2}+x_{k+1}^{2}}\cdot \sqrt{\sum ^{k}_{i=1}y_{i}^{2}+y_{k+1}^{2}}\geq \sqrt{\sum ^{k}_{i=1}x_{i}^{2}}\sqrt{\sum ^{k}_{i=1}y_{i}^{2}}+\left| x_{k+1}y_{k+1}\right|}$

By the induction hypothesis (ii), after taking square roots, and the triangle inequality for real numbers,

${\sqrt{\sum ^{k}_{i=1}x_{i}^{2}}\sqrt{\sum ^{k}_{i=1}y_{i}^{2}}+\left| x_{k+1}y_{k+1}\right| \geq \left| \sum ^{k}_{i=1}x_{i}y_{i}\right| +\left| x_{k+1}y_{k+1}\right| \geq \left| \sum ^{k+1}_{i=1}x_{i}y_{i}\right|}$

Squaring both sides gives the inequality for n = k + 1. Thus, the Cauchy-Schwarz inequality is proved for real numbers.
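The summation form is easy to spot-check on concrete numbers. The sketch below uses a hypothetical helper `cs_sides` that returns both sides of the inequality, tests it on randomly generated lists (the seed and value range are arbitrary assumptions), and then on a proportional pair where equality is expected.

```python
import random

def cs_sides(xs, ys):
    """Return (left side, right side) of (sum x_i y_i)^2 <= (sum x_i^2)(sum y_i^2)."""
    lhs = sum(x * y for x, y in zip(xs, ys)) ** 2
    rhs = sum(x * x for x in xs) * sum(y * y for y in ys)
    return lhs, rhs

# Random spot checks for n = 1, ..., 7, with a small tolerance for rounding.
random.seed(0)
all_hold = True
for n in range(1, 8):
    xs = [random.uniform(-5, 5) for _ in range(n)]
    ys = [random.uniform(-5, 5) for _ in range(n)]
    lhs, rhs = cs_sides(xs, ys)
    all_hold = all_hold and lhs <= rhs + 1e-9

# Proportional lists (x_k = 2 * y_k), where the two sides should coincide.
equal_lhs, equal_rhs = cs_sides([2, 4, 6], [1, 2, 3])
print(all_hold, equal_lhs, equal_rhs)
```

Integer inputs are used for the equality case so that the comparison is exact and not affected by floating-point rounding.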

Probability Form

Statement

If X and Y are two random variables, then ${\left| E\left( XY\right) \right| \leq \sqrt{E\left( X^{2}\right) E\left( Y^{2}\right) }}$, where the equality holds if X = ɑY, for some constant ɑ Є ℝ

Proof

Let us consider the random variable V = (X – ɑY)², where ‘V’ is a non-negative random variable for all ɑ Є ℝ

Thus,

0 ≤ E(V)

⇒ 0 ≤ E[(X – ɑY)²]

⇒ 0 ≤ E(X² – 2ɑXY + ɑ²Y²)

⇒ 0 ≤ E(X²) – 2ɑ E(XY) + ɑ² E(Y²)

Now, if the function f(ɑ) = E(X²) – 2ɑ E(XY) + ɑ² E(Y²), then f(ɑ) ≥ 0, for all ɑ Є ℝ

Now, if f(ɑ) = 0, for some ɑ, then E(V) = E[(X – ɑY)²] = 0, which means X = ɑY with probability 1

Assuming E(Y²) > 0 (if E(Y²) = 0, then Y = 0 with probability 1 and both sides of the inequality are 0), by choosing ${\alpha =\dfrac{E\left( XY\right) }{E\left( Y^{2}\right) }}$, we get

0 ≤ E(X²) – 2ɑ E(XY) + ɑ² E(Y²)

⇒ 0 ≤ ${E\left( X^{2}\right) -2\dfrac{E\left( XY\right) }{E\left( Y^{2}\right) }E\left( XY\right) +\dfrac{\left( E\left( XY\right) \right) ^{2}}{\left( E\left( Y^{2}\right) \right) ^{2}}E\left( Y^{2}\right)}$

⇒ 0 ≤ ${E\left( X^{2}\right) -\dfrac{\left( E\left( XY\right) \right) ^{2}}{E\left( Y^{2}\right) }}$

⇒ ${\left( E\left( XY\right) \right) ^{2}\leq E\left( X^{2}\right) E\left( Y^{2}\right)}$

⇒ ${\left| E\left( XY\right) \right| \leq \sqrt{E\left( X^{2}\right) E\left( Y^{2}\right) }}$

Thus, the Cauchy-Schwarz theorem is proved for the expected values.
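The probability form can be checked on a small discrete distribution. The following sketch assumes a made-up joint distribution of (X, Y) with four outcomes; the outcome values, the probabilities, and the helper `expectation` are illustrative assumptions, not taken from the text above.

```python
import math

def expectation(values, probs):
    """Expected value of a discrete random variable given its values and probabilities."""
    return sum(v * p for v, p in zip(values, probs))

# Hypothetical joint outcomes (x, y) and their probabilities (summing to 1).
outcomes = [(-1.0, 2.0), (0.5, 1.0), (2.0, -3.0), (4.0, 0.5)]
probs = [0.1, 0.4, 0.3, 0.2]

e_xy = expectation([x * y for x, y in outcomes], probs)
e_x2 = expectation([x * x for x, _ in outcomes], probs)
e_y2 = expectation([y * y for _, y in outcomes], probs)

# The choice of alpha used in the proof, valid because E(Y^2) > 0 here.
alpha = e_xy / e_y2
f_alpha = e_x2 - 2 * alpha * e_xy + alpha ** 2 * e_y2

# |E(XY)| should not exceed sqrt(E(X^2) E(Y^2)), and f(alpha) should be non-negative.
holds = abs(e_xy) <= math.sqrt(e_x2 * e_y2)
print(e_xy, e_x2, e_y2, f_alpha, holds)
```

With this choice of ɑ, f(ɑ) equals E(X²) – (E(XY))² / E(Y²), so its non-negativity is the inequality itself.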