Linear Regression: Linear Regression Equation & OLS

Linear regression is one of the oldest algorithms in machine learning. It is an approach for modelling the relationship between a variable y (also known as the response or dependent variable) and one or more independent variables X. The independent variables are often called explanatory variables or features.

When only one explanatory variable is available, the model is called simple linear regression. When multiple explanatory variables are given, it is called multiple linear regression.

The goal of linear regression is to approximate the y - X relationship with the line that best fits the data. The y - X relationship is assumed to be linear:

yi = β0 + Xi' * B + εi

Where Xi is a (k,1) feature vector, Xi' is its transpose, B is a (k,1) slope vector and εi is the error term. Our goal is to find the values of the slopes (β1, β2, ..., βk) and the intercept (β0) that best fit the data. Best fitting the data usually means achieving the smallest error, where the error is the difference between the true, observed y and the y predicted by our model.
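As a minimal sketch of the model above, the snippet below computes predictions and errors for a small made-up dataset with NumPy. The feature matrix, the coefficient values and the variable names are assumptions chosen for illustration; they are not taken from the article's notebook.

```python
import numpy as np

# Hypothetical data: 4 observations, k = 2 features each
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0]])
y = np.array([9.0, 7.0, 12.0, 20.0])

# Hypothetical parameters: intercept β0 and slope vector B of length k
beta_0 = 1.0
B = np.array([3.0, 2.0])

# Prediction for every observation: y_hat_i = β0 + X_i · B
y_hat = beta_0 + X @ B

# Error: difference between the observed and the predicted y
errors = y - y_hat

print(y_hat)
print(errors)
```

Stacking the observations as rows of a matrix lets a single matrix-vector product produce all predictions at once.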

Assumptions of the linear regression model

The errors (residuals) are normally distributed
The relationship between the dependent and independent variables is linear
No or little multicollinearity. Multicollinearity is a situation in which an explanatory variable can be linearly predicted from the other explanatory variables
No auto-correlation: the errors are not correlated with a delayed copy of themselves.
Homoscedasticity: the variance of the error (noise) is the same across all values of the independent variables.
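Two of the assumptions listed above can be probed with a few lines of NumPy. The sketch below is only a rough illustration on made-up data: the pairwise correlation matrix catches collinearity between pairs of features (not every form of multicollinearity), and the helper `lag1_autocorrelation` is a hypothetical function written for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features: x2 is almost a linear function of x1
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + rng.normal(scale=0.01, size=200)
x3 = rng.normal(size=200)

# Pairwise correlation between features (one column per feature).
# Off-diagonal values close to +1 or -1 are a warning sign.
corr = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)

def lag1_autocorrelation(residuals):
    """Correlation between the residuals and a copy delayed by one step."""
    r = np.asarray(residuals, dtype=float)
    return np.corrcoef(r[:-1], r[1:])[0, 1]

# Hypothetical residuals: a slowly varying sine wave is strongly auto-correlated
smooth_residuals = np.sin(np.linspace(0, 4 * np.pi, 200))

print(np.round(corr, 2))
print(lag1_autocorrelation(smooth_residuals))
```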

PART1 - LINEAR DATA

The data we will work with in the first half of this article is shown below and it has only one independent variable, X:

Figure 1. Observed Data - Linear

The scatter plot depicts a clear linear relationship between X and y, with some noise. Therefore, the data will be modeled by: y = β0 + X * β1.

How do we calculate the slope, β1, and the intercept, β0? There are several ways to calculate them. In this article I will introduce two of these methods. You can find the Jupyter notebook of this article HERE.

1 - Linear Regression Equation

Because our model is simple, we can calculate β1 and β0 directly with the closed-form formula:

β1 = Σ (xi - x̄) * (yi - ȳ) / Σ (xi - x̄)²
β0 = ȳ - β1 * x̄

where x̄ and ȳ are the means of X and y.

The result is:

Figure 2. B0 and B1 estimate from linear regression equation
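The closed-form calculation can be sketched in a few lines of NumPy. The data here is a synthetic stand-in, since the article's dataset is not reproduced in the text: the true values β0 = 2 and β1 = 3, the noise level and the range of X are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the article's data (assumed true values: β0 = 2, β1 = 3)
true_beta_0, true_beta_1 = 2.0, 3.0
X = np.linspace(-10, 10, 200)
y = true_beta_0 + true_beta_1 * X + rng.normal(scale=1.0, size=X.size)

# Closed-form estimates for simple linear regression:
#   β1 = Σ (xi - x̄)(yi - ȳ) / Σ (xi - x̄)²
#   β0 = ȳ - β1 * x̄
x_mean, y_mean = X.mean(), y.mean()
beta_1 = np.sum((X - x_mean) * (y - y_mean)) / np.sum((X - x_mean) ** 2)
beta_0 = y_mean - beta_1 * x_mean

print(f"beta_0 = {beta_0:.3f}, beta_1 = {beta_1:.3f}")
```

With this amount of data and noise, the estimates should land close to the assumed true values, which mirrors the behaviour reported in Figure 2.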

The estimated values of β0 and β1 are very close to the true values. Next, let's overlay the fitted line on the original data:

Figure 3. Fitted line from closed form solution

The fitted line nicely follows the data points and the MSE is relatively low, indicating an overall good fit between the line and observed data.
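For reference, the MSE mentioned above is the average of the squared differences between observed and predicted values. The sketch below uses made-up numbers, and `mean_squared_error` is a small helper defined here for illustration (scikit-learn ships a function with the same name in `sklearn.metrics`).

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted y."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Made-up observed values and predictions from a hypothetical fitted line
y_observed = [1.0, 3.0, 5.0, 7.0]
y_predicted = [1.5, 2.5, 5.0, 8.0]

mse = mean_squared_error(y_observed, y_predicted)
print(mse)
```

Because the MSE is expressed in squared units of y, whether a value counts as "low" depends on the scale and spread of y itself.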

2 - Ordinary least squares - OLS

OLS is a method that estimates β1 and β0 by minimizing the sum of the squares of the differences between the observed and predicted values of y. Let's label the observed and predicted values of y as:

yi: observed value
ŷi = β0 + xi * β1: predicted value

OLS linear regression will find the β1 and β0 that minimize the following loss:

Loss(β0, β1) = Σ (yi - ŷi)²
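To make the loss concrete, the sketch below writes the sum of squared differences as a function and evaluates it at the least-squares estimates and at a deliberately perturbed pair of parameters. The data points are made up, and `ols_loss` is a hypothetical helper name. `np.polyfit` with `deg=1` is used only as a convenient least-squares solver.

```python
import numpy as np

def ols_loss(beta_0, beta_1, X, y):
    """Sum of squared differences between observed y and predicted y."""
    y_hat = beta_0 + beta_1 * X
    return np.sum((y - y_hat) ** 2)

# Made-up data lying close to y = 1 + 2x
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Least-squares estimates (np.polyfit returns the slope first, then the intercept)
beta_1, beta_0 = np.polyfit(X, y, deg=1)

loss_at_minimum = ols_loss(beta_0, beta_1, X, y)
loss_perturbed = ols_loss(beta_0 + 0.5, beta_1 - 0.2, X, y)

print(loss_at_minimum, loss_perturbed)
```

Any move away from the least-squares estimates increases the loss, which is what "minimizing" means here.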

Ordinary least squares linear regression is implemented in the LinearRegression class of Sklearn:

Figure 4. B0 and B1 estimate with Sklearn OLS
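A minimal sketch of the Sklearn fit is shown below. The data is again a synthetic stand-in with assumed true values β0 = 2 and β1 = 3, not the article's dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

# Synthetic stand-in data (assumed true values: β0 = 2, β1 = 3)
X = np.linspace(-10, 10, 200)
y = 2.0 + 3.0 * X + rng.normal(scale=1.0, size=X.size)

# LinearRegression expects a 2D feature matrix of shape (n_samples, n_features)
model = LinearRegression()
model.fit(X.reshape(-1, 1), y)

beta_0 = model.intercept_   # estimated intercept
beta_1 = model.coef_[0]     # estimated slope

print(f"beta_0 = {beta_0:.3f}, beta_1 = {beta_1:.3f}")
```

The `reshape(-1, 1)` call is needed because a single feature still has to be passed as a column, not as a flat array.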

OLS linear regression also did a good job of estimating β0 and β1. The fitted line accurately follows the data points as well:

Figure 5. Fitted line from Sklearn OLS

PART2 - POLYNOMIAL DATA

In the previous section the linear regression model was represented by a 1st order polynomial equation, i.e. the model was linear in X. In other words, the highest power of the independent variable X was 1 (no square, cube, etc.). Next, we will learn how to fit a linear regression model on polynomial equations with order higher than 1. Figure 6 depicts a situation like that.

Figure 6. Higher order polynomial equation

Clearly, y is not a linear function of X, but y could be modeled as a linear combination of the powers of X. What is shown in Figure 6 is probably a cubic equation, and we can tell that by the presence of 2 turning points (local extrema), at around X = -5 and around X = 0. Therefore, we can model y as:

y = β0 + X * β1 + X² * β2 + X³ * β3

We will again use the LinearRegression class of Sklearn to estimate β0, β1, β2 and β3, but before fitting the model, the square and cube of X must be calculated and concatenated with X:

Figure 7. Addition of square and cube of X to the original X
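One way to build the extended feature matrix is `np.column_stack`, as sketched below on a small made-up input. The variable name `X_poly` is an assumption for this example.

```python
import numpy as np

# Hypothetical one-dimensional input
X = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# One column per power of X: [X, X², X³]
X_poly = np.column_stack([X, X ** 2, X ** 3])

print(X_poly)
```

Sklearn also provides `PolynomialFeatures` in `sklearn.preprocessing`, which generates such columns automatically; the manual version is shown here because it matches the steps described in the text.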

Now, we can fit the LinearRegression, and estimate β0, β1, β2 and β3:

Figure 8. βs estimate with Sklearn OLS fitted on polynomial equation
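The fit on the extended features can be sketched as follows. The cubic used to generate the data, its coefficients and the noise level are assumptions made for this example and do not correspond to the curve in Figure 6.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic cubic data; the true coefficients are assumptions for this sketch
true_b0, true_b1, true_b2, true_b3 = 1.0, -2.0, 0.5, 0.3
X = np.linspace(-8, 4, 300)
y = (true_b0 + true_b1 * X + true_b2 * X ** 2 + true_b3 * X ** 3
     + rng.normal(scale=1.0, size=X.size))

# Extended feature matrix with the columns [X, X², X³]
X_poly = np.column_stack([X, X ** 2, X ** 3])

# The model is still linear in the βs, so plain OLS applies
model = LinearRegression().fit(X_poly, y)

beta_0 = model.intercept_
beta_1, beta_2, beta_3 = model.coef_

print(beta_0, beta_1, beta_2, beta_3)
```

The regression is "linear" because y is a linear combination of the coefficients, even though the fitted curve is not a straight line in X.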

and the fitted line is:

Figure 9. OLS Fitted line on polynomial data.

We were able to precisely fit a curve to the data using OLS, even though the data were not generated by a first order polynomial. The guess that the equation was cubic turned out to be correct, at least for the given interval of X.

Concluding remarks

This was a short introduction to linear regression. We learned how to calculate slope and intercept using the linear regression equation and the LinearRegression class of Sklearn. Follow me on Twitter and Facebook to stay updated.

#Python #MachineLearning #OLS #LinearRegression