The term “linearity” in algebra refers to a linear relationship between two or more variables. Linear Regression is a machine learning algorithm that models a dependent (predicted) value as a linear function of one or more independent variables. It is based on supervised learning.
The above graph describes the percentage that can be scored by the students according to their hours of study. Linear Regression is the task of predicting the value of the dependent variable (Percentage Score) from the given value of the independent variable (Hours Studied).
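There are two types of Linear Regression: Simple and Multiple Linear Regression. Robust Regression is a related variant used when the data contains outliers.
Simple linear regression predicts the dependent variable value based on a single independent variable. As the name suggests, multiple linear regression predicts the dependent variable value based on multiple independent variable values.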
The Hypothesis function for Linear Regression is:
y = b0 + b1 * x
where
b0 = y-intercept (the point where the line intersects the y-axis)
b1 = coefficient (slope) of x
x = independent variable
y = estimate of the dependent variable
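For multiple regression, the hypothesis extends to y = b0 + b1 * x1 + b2 * x2 + b3 * x3 + ..., where b1, b2, b3, ... are the coefficient values for the independent variables x1, x2, x3, ...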
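The hypothesis function can be sketched in a few lines of Python. This is only an illustration: the function names `predict_simple` and `predict_multiple` and the parameter values are made up for the example and do not come from a fitted model.

```python
def predict_simple(x, b0, b1):
    """Hypothesis for simple linear regression: y = b0 + b1 * x."""
    return b0 + b1 * x


def predict_multiple(xs, b0, coefficients):
    """Hypothesis for multiple linear regression: y = b0 + b1*x1 + b2*x2 + ..."""
    if len(xs) != len(coefficients):
        raise ValueError("need exactly one coefficient per independent variable")
    return b0 + sum(b * x for b, x in zip(coefficients, xs))


# Illustrative (made-up) parameters: intercept of 20, plus 8 points per hour studied.
print(predict_simple(5, b0=20.0, b1=8.0))  # 60.0
# A second, hypothetical independent variable with coefficient 1.5.
print(predict_multiple([5, 2], b0=20.0, coefficients=[8.0, 1.5]))  # 63.0
```

The simple case is just the multiple case with a single coefficient, which is why both share the same intercept term b0.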
We update the b0 and b1 values to get the best fit for the regression line. When we finally get the best regression line and use our model for prediction, it will predict the value of y for the input value of x.
The accuracy of the regression line is calculated by finding the Root Mean Square Error (RMSE) between the predicted y values and the true y values. It is calculated as follows:
RMSE = sqrt((1/n) * sum of (predicted y - true y)^2 over all n data points)
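As a minimal sketch, the RMSE between true and predicted values can be computed with the standard library alone. The function name `rmse` and the sample numbers are assumptions made for this example.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up scores: the errors are +2, -2 and +1, so RMSE = sqrt((4 + 4 + 1) / 3).
actual = [20.0, 40.0, 60.0]
predicted = [22.0, 38.0, 61.0]
print(round(rmse(actual, predicted), 3))  # 1.732
```

An RMSE of 0 means every prediction matches the true value exactly; larger values mean the line fits the data less well.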
The RMSE value should be minimized to get the best regression line, so it is very important to update the values of b0 and b1. The model uses Gradient Descent to get the best b0 and b1 values. It is an iterative method that starts with random b0 and b1 values and repeatedly adjusts them in the direction that reduces the error until it reaches the best values.
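The update loop can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: it minimizes the mean squared error (which has the same minimum as RMSE, since the square root is monotonic), and the function name `gradient_descent`, the learning rate, the iteration count, the seed, and the data are all chosen for the example.

```python
import random


def gradient_descent(xs, ys, learning_rate=0.05, iterations=5000, seed=0):
    """Fit y = b0 + b1 * x by gradient descent on the mean squared error."""
    rng = random.Random(seed)
    # Start from random values of b0 and b1 (seeded so the run is repeatable).
    b0, b1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(xs)
    for _ in range(iterations):
        # Prediction error for every data point under the current b0, b1.
        errors = [(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. b0 and b1.
        grad_b0 = (2 / n) * sum(errors)
        grad_b1 = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        # Step against the gradient to reduce the error.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


hours = [1, 2, 3, 4, 5]
scores = [18, 26, 34, 42, 50]  # made-up data lying exactly on y = 10 + 8x
b0, b1 = gradient_descent(hours, scores)
print(round(b0, 3), round(b1, 3))  # 10.0 8.0
```

The learning rate matters: if it is too large for the scale of x the updates diverge, and if it is too small the loop needs many more iterations to converge.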
If the data contains outliers, then linear regression may not give accurate predictions, because squaring the errors gives points far from the line a disproportionate influence on the fit.
Robust Regression is an alternative for such problems.
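To illustrate the effect of an outlier, the sketch below compares an ordinary least-squares fit with the Theil-Sen estimator, which takes the median of the slopes between all pairs of points. The text above does not say which robust method is meant; Theil-Sen is chosen here only because it is easy to write in plain Python. The function names and the data are made up for the example.

```python
from itertools import combinations
from statistics import median


def ols_fit(xs, ys):
    """Ordinary least-squares fit of y = b0 + b1 * x (closed form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def theil_sen_fit(xs, ys):
    """Theil-Sen fit: median of pairwise slopes, then median intercept."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # skip vertical pairs, whose slope is undefined
    ]
    b1 = median(slopes)
    b0 = median(y - b1 * x for x, y in zip(xs, ys))
    return b0, b1


hours = [1, 2, 3, 4, 5, 6]
# The first five points lie on y = 10 + 8x; the last one is a deliberate outlier.
scores = [18, 26, 34, 42, 50, 10]

ols_b0, ols_b1 = ols_fit(hours, scores)
print(round(ols_b0, 2), round(ols_b1, 2))  # 26.0 1.14
print(theil_sen_fit(hours, scores))  # (10.0, 8.0)
```

A single outlier pulls the least-squares slope from 8 down to about 1.14, while the median-based estimate still recovers the line followed by the other five points.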