Last modified on November 23rd, 2024
Polynomial regression is a statistical method to analyze and model the relationship between two variables, a dependent variable (y) and an independent variable (x) when the data exhibits a curved pattern. Unlike linear regression, which fits a straight line to data, polynomial regression fits a polynomial equation to capture the underlying trends effectively.
The general equation of a polynomial regression model is:
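y = ${a_{0}+a_{1}x+a_{2}x^{2}+\ldots +a_{n}x^{n}+\epsilon}$
Here, y is the dependent variable, x is the independent variable, ${a_{0},a_{1},\ldots ,a_{n}}$ are the coefficients, n is the degree of the polynomial, and ϵ is the error term.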
Difference with Linear Regression
Although the equation is polynomial in x, the regression remains linear with respect to the coefficients ${a_{0},a_{1},\ldots ,a_{n}}$, which is why it can still be solved by ordinary least squares. By incorporating higher-degree terms, such as quadratic or cubic components, the model can capture non-linear patterns in the data.
Polynomial regression is useful when the data follows a curved pattern, where simple linear regression fails to capture the trend accurately. It predicts a curve that matches the underlying data.
To fit the polynomial regression model, we first determine the coefficients.
This is done by minimizing the sum of squared errors (SSE) over the m data points, which is given by:
SSE = ${\sum ^{m}_{i=1}\left( y_{i}-\left( a_{0}+a_{1}x_{i}+a_{2}x_{i}^{2}+\ldots +a_{n}x_{i}^{n}\right) \right) ^{2}}$
The coefficients are found by setting the partial derivative of SSE with respect to each coefficient to zero, which yields a system of linear equations. This system is conveniently solved using matrix algebra.
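As a sketch of that derivation (the index k is introduced here only for illustration), differentiating SSE with respect to a coefficient ${a_{k}}$ and setting the result to zero gives, for k = 0, 1, …, n:
${\dfrac{\partial SSE}{\partial a_{k}}=-2\sum ^{m}_{i=1}x_{i}^{k}\left( y_{i}-\left( a_{0}+a_{1}x_{i}+a_{2}x_{i}^{2}+\ldots +a_{n}x_{i}^{n}\right) \right) =0}$
Rearranging, each value of k contributes one linear equation in the unknown coefficients:
${a_{0}\sum ^{m}_{i=1}x_{i}^{k}+a_{1}\sum ^{m}_{i=1}x_{i}^{k+1}+\ldots +a_{n}\sum ^{m}_{i=1}x_{i}^{k+n}=\sum ^{m}_{i=1}x_{i}^{k}y_{i}}$
Together these are n + 1 equations in n + 1 unknowns.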
The polynomial regression problem can be written in matrix form:
y = Xa + ϵ
Here,
y = ${\begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{m} \end{bmatrix}}$ is the vector of observed values of the dependent variable
X = ${{\begin{bmatrix}1 & x_{1} & \cdots & x_{1}^{n} \\ 1 & x_{2} & \cdots & x_{2}^{n} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{m} & \cdots & x_{m}^{n} \\ \end{bmatrix}}}$ is the design matrix
a = ${\begin{bmatrix} a_{0} \\ a_{1} \\ \vdots \\ a_{n} \end{bmatrix}}$ is the vector of the coefficients.
The least squares solution is given by: a = ${\left( X^{T}X\right) ^{-1}X^{T}y}$
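The matrix solution can be sketched in a few lines of Python. This is a minimal illustration that assumes NumPy is available; the function name `fit_polynomial` and the sample data are hypothetical and not part of the original text. It solves the system ${X^{T}Xa=X^{T}y}$ directly rather than forming the inverse, which gives the same coefficients when ${X^{T}X}$ is invertible.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Return coefficients [a0, a1, ..., an] from the least squares normal equations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns 1, x, x^2, ..., x^degree
    X = np.vander(x, degree + 1, increasing=True)
    # Solve (X^T X) a = X^T y without computing the inverse explicitly
    return np.linalg.solve(X.T @ X, X.T @ y)

# Illustrative data generated from y = 1 + x + x^2 (hypothetical, not from the article)
coeffs = fit_polynomial([0, 1, 2, 3], [1, 3, 7, 13], degree=2)
print(coeffs)
```

Because the sample points lie exactly on a quadratic, the recovered coefficients should be close to 1, 1, and 1, up to floating-point rounding.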
Problem: Fitting a 2nd-Degree Polynomial
Let us fit a 2nd-degree polynomial regression model y = ${a_{0}+a_{1}x+a_{2}x^{2}}$ to the following dataset:
x | y |
1 | 2 |
2 | 4 |
3 | 9 |
4 | 16 |
Step 1. Setting Up the Design Matrix (X)
X = ${\begin{bmatrix} 1 & 1 & 1^{2} \\ 1 & 2 & 2^{2} \\ 1 & 3 & 3^{2} \\ 1 & 4 & 4^{2} \end{bmatrix}}$ = ${\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix}}$
Step 2. Setting Up the Dependent Variable Vector (y)
y = ${\begin{bmatrix} 2 \\ 4 \\ 9 \\ 16 \end{bmatrix}}$
Step 3. Solving for Coefficients (a)
a = ${\left( X^{T}X\right) ^{-1}X^{T}y}$
Here,
${X^{T}}$ = ${\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 4 & 9 & 16 \end{bmatrix}}$
On multiplying ${X^{T}}$ and X, we get
${X^{T}X}$ = ${\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 4 & 9 & 16 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix}}$
= ${\begin{bmatrix} 4 & 10 & 30 \\ 10 & 30 & 100 \\ 30 & 100 & 354 \end{bmatrix}}$ …..(i)
Now, calculating ${X^{T}y}$, we get
${X^{T}y}$ = ${\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 \\ 1 & 4 & 9 & 16 \end{bmatrix}\begin{bmatrix} 2 \\ 4 \\ 9 \\ 16 \end{bmatrix}}$ = ${\begin{bmatrix} 31 \\ 101 \\ 355 \end{bmatrix}}$ …..(ii)
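The two products in (i) and (ii) can be checked numerically. The following is a small sketch that assumes NumPy; the variable names `XtX` and `Xty` are chosen here for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 9.0, 16.0])

# Design matrix with columns 1, x, x^2 for the 2nd-degree model
X = np.vander(x, 3, increasing=True)

XtX = X.T @ X  # 3 x 3 matrix, compare with equation (i)
Xty = X.T @ y  # length-3 vector, compare with equation (ii)

print(XtX)
print(Xty)
```

The entries of `XtX` are the sums of powers of x (for example, the bottom-right entry is 1 + 16 + 81 + 256 = 354), which is a quick way to sanity-check a hand calculation.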
As we know, the inverse of a matrix A, denoted by ${A^{-1}}$, is defined as
${A^{-1}=\dfrac{1}{\det \left( A\right) }\cdot Adj\left( A\right)}$ …..(iii)
Here, for a 3 × 3 matrix A = ${\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}}$ …..(iv)
det(A) = a(ei – fh) – b(di – fg) + c(dh – eg) …..(v)
Now, comparing matrices (i) and (iv), we have
a = 4, b = 10, c = 30, d = 10, e = 30, f = 100, g = 30, h = 100, and i = 354
Substituting these values in (v),
det(A) = 4[(30)(354) – (100)(100)] – 10[(10)(354) – (100)(30)] + 30[(10)(100) – (30)(30)]
⇒ det(A) = 4[10620 – 10000] – 10[3540 – 3000] + 30[1000 – 900]
⇒ det(A) = 4(620) – 10(540) + 30(100)
⇒ det(A) = 2480 – 5400 + 3000
⇒ det(A) = 80 …..(vi)
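The expansion in (v) translates directly into code. This is a minimal sketch in plain Python; the helper name `det3` is hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[4, 10, 30],
     [10, 30, 100],
     [30, 100, 354]]

print(det3(A))  # 80
```

Using integer entries keeps the arithmetic exact, so the result matches the hand calculation without rounding.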
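To find Adj(A), we first build the cofactor matrix. The cofactor of each element is the determinant of the 2 × 2 matrix left after removing that element's row and column, multiplied by the sign ${(-1)^{i+j}}$ for the element in row i and column j.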
Calculating the first row of cofactors, we get
1. Cofactor of ${a_{11}}$ (4):
Minor = ${\begin{bmatrix} 30 & 100 \\ 100 & 354 \end{bmatrix}}$
Determinant = (30)(354) – (100)(100) = 10620 – 10000 = 620
Thus, Cofactor = +620
2. Cofactor of ${a_{12}}$ (10):
Minor = ${\begin{bmatrix} 10 & 100 \\ 30 & 354 \end{bmatrix}}$
Determinant = (10)(354) – (100)(30) = 3540 – 3000 = 540
Thus, Cofactor = -540
Note: The negative sign comes from the alternating sign rule ${(-1)^{i+j}}$. For the element in row 1 and column 2, ${(-1)^{1+2}=-1}$, so the cofactor is the negative of the minor's determinant.
3. Cofactor of ${a_{13}}$ (30):
Minor = ${\begin{bmatrix} 10 & 30 \\ 30 & 100 \end{bmatrix}}$
Determinant = (10)(100) – (30)(30) = 1000 – 900 = 100
Thus, Cofactor = +100
Calculating the second row cofactors, we get
1. Cofactor of ${a_{21}}$ (10):
Minor = ${\begin{bmatrix} 10 & 30 \\ 100 & 354 \end{bmatrix}}$
Determinant = (10)(354) – (30)(100) = 3540 – 3000 = 540
Thus, the cofactor = -540
2. Cofactor of ${a_{22}}$ (30):
Minor = ${\begin{bmatrix} 4 & 30 \\ 30 & 354 \end{bmatrix}}$
Determinant = (4)(354) – (30)(30) = 1416 – 900 = 516
Thus, the cofactor = +516
3. Cofactor of ${a_{23}}$ (100):
Minor = ${\begin{bmatrix} 4 & 10 \\ 30 & 100 \end{bmatrix}}$
Determinant = (4)(100) – (10)(30) = 400 – 300 = 100
Thus, the cofactor = -100
Calculating the third row cofactors, we get
1. Cofactor of ${a_{31}}$ (30):
Minor = ${\begin{bmatrix} 10 & 30 \\ 30 & 100 \end{bmatrix}}$
Determinant = (10)(100) – (30)(30) = 1000 – 900 = 100
Thus, the cofactor = +100
2. Cofactor of ${a_{32}}$ (100):
Minor = ${\begin{bmatrix} 4 & 30 \\ 10 & 100 \end{bmatrix}}$
Determinant = (4)(100) – (30)(10) = 400 – 300 = 100
Thus, the cofactor = -100
3. Cofactor of ${a_{33}}$ (354):
Minor = ${\begin{bmatrix} 4 & 10 \\ 10 & 30 \end{bmatrix}}$
Determinant = (4)(30) – (10)(10) = 120 – 100 = 20
Thus, the cofactor = +20
Hence, the cofactor matrix = ${\begin{bmatrix} 620 & -540 & 100 \\ -540 & 516 & -100 \\ 100 & -100 & 20 \end{bmatrix}}$
Adj(A) = transpose of the cofactor matrix = ${\begin{bmatrix} 620 & -540 & 100 \\ -540 & 516 & -100 \\ 100 & -100 & 20 \end{bmatrix}^{T}}$
Since the cofactor matrix is symmetric, transposing it leaves it unchanged:
⇒ Adj(A) = ${\begin{bmatrix} 620 & -540 & 100 \\ -540 & 516 & -100 \\ 100 & -100 & 20 \end{bmatrix}}$ …..(vii)
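The nine cofactor calculations follow one pattern, which can be sketched as a short function. This is an illustrative plain-Python version for 3 × 3 matrices only; the names `adjugate3` and `minor_det` are hypothetical.

```python
def adjugate3(m):
    """Adjugate of a 3 x 3 matrix: the transpose of its cofactor matrix."""
    def minor_det(i, j):
        # Delete row i and column j, then take the remaining 2 x 2 determinant
        rows = [k for k in range(3) if k != i]
        cols = [k for k in range(3) if k != j]
        top_left, top_right = m[rows[0]][cols[0]], m[rows[0]][cols[1]]
        bottom_left, bottom_right = m[rows[1]][cols[0]], m[rows[1]][cols[1]]
        return top_left * bottom_right - top_right * bottom_left

    # Apply the alternating sign rule (-1)^(i + j) to every minor
    cofactors = [[(-1) ** (i + j) * minor_det(i, j) for j in range(3)]
                 for i in range(3)]
    # Transpose the cofactor matrix to obtain the adjugate
    return [list(column) for column in zip(*cofactors)]

A = [[4, 10, 30],
     [10, 30, 100],
     [30, 100, 354]]

adj = adjugate3(A)
print(adj)
```

For larger matrices this cofactor approach becomes expensive, so numerical libraries solve the linear system by other means instead of building the adjugate.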
Now, by substituting (vi) and (vii) in (iii), we get
${A^{-1}}$ = ${\left( X^{T}X\right) ^{-1}}$ = ${\dfrac{1}{80}\begin{bmatrix} 620 & -540 & 100 \\ -540 & 516 & -100 \\ 100 & -100 & 20 \end{bmatrix}}$
⇒ ${\left( X^{T}X\right) ^{-1}}$ = ${\begin{bmatrix} 7.75 & -6.75 & 1.25 \\ -6.75 & 6.45 & -1.25 \\ 1.25 & -1.25 & 0.25 \end{bmatrix}}$ …..(viii)
Now, using (ii) and (viii), we get
a = ${\begin{bmatrix} 7.75 & -6.75 & 1.25 \\ -6.75 & 6.45 & -1.25 \\ 1.25 & -1.25 & 0.25 \end{bmatrix}\begin{bmatrix} 31 \\ 101 \\ 355 \end{bmatrix}}$
⇒ a = ${\begin{bmatrix} 2.25 \\ -1.55 \\ 1.25 \end{bmatrix}}$
The coefficients are ${a_{0}}$ = 2.25, ${a_{1}}$ = -1.55, and ${a_{2}}$ = 1.25
Thus, the polynomial regression equation is:
y = ${2.25-1.55x+1.25x^{2}}$
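To see how well this curve fits the four data points, the fitted values and the sum of squared errors can be evaluated as follows. This is a small sketch assuming NumPy; the variable names are illustrative.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 9.0, 16.0])
a = np.array([2.25, -1.55, 1.25])  # a0, a1, a2 from the worked example

# polyval expects coefficients in increasing order of degree
y_hat = np.polynomial.polynomial.polyval(x, a)
residuals = y - y_hat
sse = float(np.sum(residuals ** 2))

print(y_hat)  # approximately [1.95, 4.15, 8.85, 16.05]
print(sse)    # approximately 0.05
```

The residuals are 0.05, -0.15, 0.15, and -0.05, so the fitted curve stays close to every observation without passing through each one exactly.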
Alternatively, polynomial regression can be solved computationally using Python or other tools instead of lengthy manual derivations.
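For instance, the same dataset can be fitted with NumPy's polynomial module. This is one possible approach, shown as a sketch; other tools would work equally well.

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = [1, 2, 3, 4]
y = [2, 4, 9, 16]

# P.polyfit returns coefficients in increasing order of degree: a0, a1, a2
coeffs = P.polyfit(x, y, 2)
print(coeffs)  # approximately [2.25, -1.55, 1.25]
```

Note that the older `np.polyfit` function returns the coefficients in the opposite order, highest degree first.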
Solve the polynomial regression for:
x = [1, 3, 5, 7]
y = [2, 10, 22, 38]
The polynomial regression model is y = ${a_{0}+a_{1}x+a_{2}x^{2}}$
The design matrix for this second-degree polynomial regression is constructed as follows:
X = ${\begin{bmatrix} 1 & x_{1} & x_{1}^{2} \\ 1 & x_{2} & x_{2}^{2} \\ 1 & x_{3} & x_{3}^{2} \\ 1 & x_{4} & x_{4}^{2} \end{bmatrix}}$
Substituting the values of x = [1, 3, 5, 7],
⇒ X = ${\begin{bmatrix} 1 & 1 & 1^{2} \\ 1 & 3 & 3^{2} \\ 1 & 5 & 5^{2} \\ 1 & 7 & 7^{2} \end{bmatrix}}$
⇒ X = ${\begin{bmatrix} 1 & 1 & 1 \\ 1 & 3 & 9 \\ 1 & 5 & 25 \\ 1 & 7 & 49 \end{bmatrix}}$
The transpose of X is ${X^{T}}$ = ${\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 3 & 5 & 7 \\ 1 & 9 & 25 & 49 \end{bmatrix}}$
Now, ${X^{T}X}$ = ${\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 3 & 5 & 7 \\ 1 & 9 & 25 & 49 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 3 & 9 \\ 1 & 5 & 25 \\ 1 & 7 & 49 \end{bmatrix}}$
⇒ ${X^{T}X}$ = ${\begin{bmatrix} 4 & 16 & 84 \\ 16 & 84 & 496 \\ 84 & 496 & 3108 \end{bmatrix}}$
Now, ${X^{T}y}$ = ${\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 3 & 5 & 7 \\ 1 & 9 & 25 & 49 \end{bmatrix}\begin{bmatrix} 2 \\ 10 \\ 22 \\ 38 \end{bmatrix}}$
⇒ ${X^{T}y}$ = ${\begin{bmatrix} 72 \\ 408 \\ 2504 \end{bmatrix}}$ …..(i)
det(${X^{T}X}$) = 4[(84)(3108) – (496)(496)] – 16[(16)(3108) – (496)(84)] + 84[(16)(496) – (84)(84)]
⇒ det(${X^{T}X}$) = 4(15056) – 16(8064) + 84(880)
⇒ det(${X^{T}X}$) = 60224 – 129024 + 73920
⇒ det(${X^{T}X}$) = 5120
Now, for the cofactor matrix, we have
Cofactor of ${a_{11}}$ (4):
Minor = ${\begin{bmatrix} 84 & 496 \\ 496 & 3108 \end{bmatrix}}$
Determinant = (84)(3108) – (496)(496) = 261072 – 246016 = 15056
Thus, the cofactor = +15056
Cofactor of ${a_{12}}$ (16):
Minor = ${\begin{bmatrix} 16 & 496 \\ 84 & 3108 \end{bmatrix}}$
Determinant = (16)(3108) – (496)(84) = 49728 – 41664 = 8064
Thus, the cofactor = -8064
Cofactor of ${a_{13}}$ (84):
Minor = ${\begin{bmatrix} 16 & 84 \\ 84 & 496 \end{bmatrix}}$
Determinant = (16)(496) – (84)(84) = 7936 – 7056 = 880
Thus, the cofactor = +880
Cofactor of ${a_{21}}$ (16):
Minor = ${\begin{bmatrix} 16 & 84 \\ 496 & 3108 \end{bmatrix}}$
Determinant = (16)(3108) – (84)(496) = 49728 – 41664 = 8064
Thus, the cofactor = -8064
Cofactor of ${a_{22}}$ (84):
Minor = ${\begin{bmatrix} 4 & 84 \\ 84 & 3108 \end{bmatrix}}$
Determinant = (4)(3108) – (84)(84) = 12432 – 7056 = 5376
Thus, the cofactor = +5376
Cofactor of ${a_{23}}$ (496):
Minor = ${\begin{bmatrix} 4 & 16 \\ 84 & 496 \end{bmatrix}}$
Determinant = (4)(496) – (16)(84) = 1984 – 1344 = 640
Thus, the cofactor = -640
Cofactor of ${a_{31}}$ (84):
Minor = ${\begin{bmatrix} 16 & 84 \\ 84 & 496 \end{bmatrix}}$
Determinant = (16)(496) – (84)(84) = 7936 – 7056 = 880
Thus, the cofactor = +880
Cofactor of ${a_{32}}$ (496):
Minor = ${\begin{bmatrix} 4 & 84 \\ 16 & 496 \end{bmatrix}}$
Determinant = (4)(496) – (84)(16) = 1984 – 1344 = 640
Thus, the cofactor = -640
Cofactor of ${a_{33}}$ (3108):
Minor = ${\begin{bmatrix} 4 & 16 \\ 16 & 84 \end{bmatrix}}$
Determinant = (4)(84) – (16)(16) = 336 – 256 = 80
Thus, the cofactor = +80
The cofactor matrix is ${\begin{bmatrix} 15056 & -8064 & 880 \\ -8064 & 5376 & -640 \\ 880 & -640 & 80 \end{bmatrix}}$
Since this matrix is symmetric, Adj(${X^{T}X}$) = ${\begin{bmatrix} 15056 & -8064 & 880 \\ -8064 & 5376 & -640 \\ 880 & -640 & 80 \end{bmatrix}}$
${\left( X^{T}X\right) ^{-1}}$ = ${\dfrac{1}{5120}\begin{bmatrix} 15056 & -8064 & 880 \\ -8064 & 5376 & -640 \\ 880 & -640 & 80 \end{bmatrix}}$
⇒ ${\left( X^{T}X\right) ^{-1}}$ = ${\begin{bmatrix} 2.940625 & -1.575 & 0.171875 \\ -1.575 & 1.05 & -0.125 \\ 0.171875 & -0.125 & 0.015625 \end{bmatrix}}$ …..(ii)
Now, using (i) and (ii), we get
a = ${\left( X^{T}X\right) ^{-1}X^{T}y}$
⇒ a = ${\dfrac{1}{5120}\begin{bmatrix} 15056 & -8064 & 880 \\ -8064 & 5376 & -640 \\ 880 & -640 & 80 \end{bmatrix}\begin{bmatrix} 72 \\ 408 \\ 2504 \end{bmatrix}}$
⇒ a = ${\dfrac{1}{5120}\begin{bmatrix} -2560 \\ 10240 \\ 2560 \end{bmatrix}}$
⇒ a = ${\begin{bmatrix} -0.5 \\ 2 \\ 0.5 \end{bmatrix}}$
Thus, the quadratic regression equation is y = ${-0.5+2x+0.5x^{2}}$
This curve passes through all four data points exactly. For example, at x = 3 it gives -0.5 + 6 + 4.5 = 10.
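The final coefficients for this dataset can be cross-checked numerically. The sketch below assumes NumPy and uses `np.linalg.lstsq`, a least squares solver that avoids forming the inverse by hand; it is offered as a check, not as the method used in the derivation.

```python
import numpy as np

x = np.array([1.0, 3.0, 5.0, 7.0])
y = np.array([2.0, 10.0, 22.0, 38.0])

# Design matrix with columns 1, x, x^2
X = np.vander(x, 3, increasing=True)

# lstsq returns the solution first, followed by diagnostic values
a, *_ = np.linalg.lstsq(X, y, rcond=None)

print(a)          # approximately [-0.5, 2.0, 0.5]
print(X @ a - y)  # residuals, approximately zero
```

The residuals vanish because the four points lie exactly on the fitted quadratic.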