
Linear Regression Formula


Introduction

Have you ever noticed in a shop how the size of an object affects its price? Two quantities are related when a change in one goes along with a change in the other: both may increase or decrease together, or one may increase while the other decreases. When such quantities are plotted against each other on a graph, the points often fall close to a straight line, which indicates a linear relation. The linear regression formula describes this linear relation and how one quantity depends on the other.

 

Linear regression is one of the most basic and commonly used methods of predictive analysis. One variable is treated as the explanatory (independent) variable and the other as the dependent variable. For example, a modeller might want to relate the weights of individuals to their heights using linear regression.

The main types of regression analysis, and the kinds of variables each one involves, are listed below.

 

Simple Linear Regression

  • One is the dependent variable (that is interval or ratio).

  • One is the independent variable (that is interval or ratio or dichotomous).

 

Multiple Linear Regression

  • One is the dependent variable (that is interval or ratio).

  • Two or more independent variables (that is interval or ratio or dichotomous).

 

Logistic Regression

  • One is the dependent variable (that is binary).

  • One or more independent variable(s) (that is interval or ratio or dichotomous).

 

Ordinal Regression

  • One is the dependent variable (that is ordinal).

  • One or more independent variable(s) (that is nominal or dichotomous).

 

Multinomial Regression

  • One is the dependent variable (that is nominal).

  • One or more independent variable(s) (that is interval or ratio or dichotomous).

 

Discriminant Analysis

  • One is the dependent variable (that is nominal).

  • One or more independent variable(s) (that is interval or ratio).

 

What is Linear Regression?

Linear regression is a linear method for modelling the relationship between one or more independent variables and a dependent variable. It is widely used to analyse how one variable depends on another: one variable is treated as the explanatory variable and the other as the dependent variable. Because the learned relationship is linear, it is easy to interpret. Linear regression models have long been used by statisticians, computer scientists, and others who tackle quantitative problems. For example, a statistician might want to relate the weights of individuals to their heights using a linear regression model.

 

The Formula of Linear Regression

The linear regression equation is given by:

y = a + bx

 

a and b can be computed by the following formulas:

b= \[\frac {n\sum xy - (\sum x)(\sum y)} {n\sum x^2 - (\sum x)^2}\]


a= \[\frac {\sum y - b(\sum x)} {n}\]

Where

x and y are the variables for which we will make the regression line.

  • x = Values of the first data set (the independent variable).

  • y = Values of the second data set (the dependent variable).

  • n = Number of (x, y) data pairs.

  • b = Slope of the line.

  • a = Y-intercept of the line.
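The two formulas above translate directly into code. The following is a minimal Python sketch, not a definitive implementation; the function name `fit_line` and the sample points are illustrative assumptions, chosen so that the points lie exactly on the line y = 1 + 2x.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line y = a + b*x using the sum formulas above."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # All x values are identical, so the slope is undefined.
        raise ValueError("x values must not all be equal")

    b = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    a = (sum_y - b * sum_x) / n                     # y-intercept
    return a, b


# Hypothetical sample points lying exactly on y = 1 + 2x.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # prints 1.0 2.0
```

The function computes the slope first because the intercept formula depends on it.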

 

Note: The first step in finding a linear regression equation is to determine if there is a relationship between the two variables. This is often a judgment call for the researcher. You’ll also need a list of your data in an x–y format (i.e. two columns of data – independent and dependent variables).

 

Simple Linear Regression Formula Plotting

Table 1. Example data.

X       Y
1.00    1.00
2.00    2.00
3.00    1.30
4.00    3.75
5.00    2.25

 

(Image will be uploaded soon)

 

Linear regression consists of finding the best-fitting straight line through the given points. This best-fitting line is called the regression line. In the figure below (Figure 2), the black diagonal line is the regression line; it gives the predicted score on Y for each possible value of X. The vertical lines from the points to the regression line represent the errors of prediction. The red point lies very near the regression line, so its error of prediction is small. By contrast, the yellow point lies well above the regression line, so its error of prediction is large.

 

(Image will be uploaded soon)

 

In the figure, the black line consists of the predictions, the points are the actual data, and the vertical lines between the points and the black line represent the errors of prediction.
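The errors of prediction can also be computed numerically. The sketch below applies the formulas for a and b to the data in Table 1 and lists, for each point, the predicted value and the error (observed Y minus predicted Y). The variable names are illustrative assumptions.

```python
# Data from Table 1.
xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]
n = len(xs)

sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Slope and intercept from the linear regression formulas.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Predicted Y on the regression line and the error of prediction per point.
predicted = [a + b * x for x in xs]
errors = [y - p for y, p in zip(ys, predicted)]

print(f"y = {a:.3f} + {b:.3f}x")
for x, y, p, e in zip(xs, ys, predicted, errors):
    print(f"x={x:.2f}  observed={y:.2f}  predicted={p:.3f}  error={e:+.3f}")
```

For this data the fitted line is approximately y = 0.785 + 0.425x. The point (4.00, 3.75) has the largest positive error, and the errors sum to zero up to floating-point rounding.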

 

Properties of Linear Regression

For the regression line y = b0 + b1x, where b0 and b1 are the regression parameters (b0 corresponds to a and b1 to b in the formula y = a + bx), the properties are as follows:

  • The line minimizes the sum of squared differences between observed values and predicted values.

  • The regression line passes through the point formed by the means of the X and Y values.

  • The regression constant (b0) is equal to the y-intercept of the regression line.

  • The regression coefficient (b1) is the slope of the regression line, which is equal to the average change in the dependent variable (Y) for a unit change in the independent variable (X).
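The first two properties can be checked numerically. The sketch below fits the line to the Table 1 example data, confirms that it passes through the point of means, and shows that nudging either parameter increases the sum of squared errors. The helper name `sse` and the nudge sizes are illustrative assumptions.

```python
# Example data (Table 1).
xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
b0 = (sum_y - b1 * sum_x) / n                                  # y-intercept


def sse(intercept, slope):
    """Sum of squared differences between observed and predicted Y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Property: the line passes through (mean of X, mean of Y).
mean_x, mean_y = sum_x / n, sum_y / n
passes_through_means = abs((b0 + b1 * mean_x) - mean_y) < 1e-9

# Property: any small change to b0 or b1 gives a larger sum of squares.
best = sse(b0, b1)
nudged = [sse(b0 + 0.1, b1), sse(b0 - 0.1, b1),
          sse(b0, b1 + 0.05), sse(b0, b1 - 0.05)]

print(passes_through_means)            # prints True
print(all(s > best for s in nudged))   # prints True
```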

 

What is Linear Regression Used for?

Linear regression is used for:

  • Studying engine performance from test data in automobiles.

  • Market research studies and the analysis of customer survey results.

  • Observational astronomy. A number of statistical tools and methods are used in astronomical data analysis, and there are entire libraries in languages like Python for data analysis in astrophysics.

  • Analyzing the effect of marketing, pricing, and promotions on the sales of a product.

 

Questions to be Solved

Question 1) Find the linear regression equation for the given set of data.

X    2    3    5    8
Y    3    6    5    12

 

Solution:

X           Y           X²           XY
2           3           4            6
3           6           9            18
5           5           25           25
8           12          64           96
Sum = 18    Sum = 26    Sum = 102    Sum = 145

 

Using the simple linear regression formula,

 

b= \[\frac {n\sum xy - (\sum x)(\sum y)} {n\sum x^2 - (\sum x)^2}\] ,

 

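b= \[\frac {4\times 145-18\times 26} {4\times 102 -324}\] = \[\frac {112} {84}\] = \[\frac {4} {3}\] , so the value of b is approximately 1.33.

 

Now using the simple linear regression formula to calculate the value of a=\[\frac {\sum y-b(\sum x)} {n}\]= \[\frac {26 - \frac{4}{3}\times 18} {4}\] = \[\frac {26 - 24} {4}\] = 0.5. (Substituting the rounded value b = 1.33 instead gives 0.515; the difference is due only to rounding.)

 

Putting the values of a and b in the equation, y = a + bx

Answer: y = 0.5 + 1.33x.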
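The arithmetic in this solution can be checked with a short script. The sketch below uses Python's `fractions.Fraction` for exact arithmetic, which is a choice made here for illustration. It also shows the effect of rounding b to 1.33 before computing a.

```python
from fractions import Fraction

# Data from Question 1.
xs = [2, 3, 5, 8]
ys = [3, 6, 5, 12]
n = len(xs)

sum_x = sum(xs)                               # 18
sum_y = sum(ys)                               # 26
sum_xy = sum(x * y for x, y in zip(xs, ys))   # 145
sum_x2 = sum(x * x for x in xs)               # 102

# Exact slope and intercept as fractions.
b = Fraction(n * sum_xy - sum_x * sum_y, n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Intercept obtained when b is rounded to 1.33 before substitution.
a_from_rounded_b = (sum_y - 1.33 * sum_x) / n

print(b, a)                         # prints 4/3 1/2
print(round(a_from_rounded_b, 3))   # prints 0.515
```

With exact arithmetic the intercept is 0.5. The value 0.515 appears only when the slope is rounded to two decimal places before it is substituted.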

 

Standard Error in Linear Regression Formula

The standard error about the regression line is a measure of the average amount by which the regression equation over- or under-predicts. It is denoted by SE. The higher the coefficient of determination, the lower the standard error, and hence the more accurate the predictions.
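The text does not give a formula for SE. One commonly used definition for simple linear regression, assumed here, is SE = sqrt(SSE / (n - 2)), where SSE is the sum of squared errors of prediction and n - 2 accounts for the two estimated parameters. A minimal sketch using the data from Question 1:

```python
import math

# Data from Question 1.
xs = [2, 3, 5, 8]
ys = [3, 6, 5, 12]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
a = (sum_y - b * sum_x) / n                                   # y-intercept

# Sum of squared errors of prediction.
sum_sq_err = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Assumed definition: standard error of the estimate with n - 2
# degrees of freedom (two parameters, a and b, were estimated).
standard_error = math.sqrt(sum_sq_err / (n - 2))

print(round(standard_error, 3))  # prints 1.958
```

Other texts divide by n instead of n - 2, so the exact value depends on which convention is in use.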

FAQs on Linear Regression Formula

1. What is a linear regression with an example?

Linear regression quantifies the relationship between one or more predictor variable(s) and one outcome variable. For example, it can be used to quantify the relative impacts of age, gender, and diet (the predictor variables) on height (the outcome variable).

2. How do you calculate linear regression?

The linear regression equation has the form Y = a + bX, where Y is the dependent variable (plotted on the Y-axis), X is the independent variable (plotted on the X-axis), b is the slope of the line, and a is the y-intercept. The values of a and b are calculated from the data using the formulas given earlier in this article.

3. How do you Calculate the Y-Intercept?

Using the "slope-intercept" form of the line's equation (y = mx + b), you solve for b, which is the y-intercept you're looking for. Substitute the known slope for m and the coordinates of a known point on the line for x and y, then solve for b, which gives b = y - mx. Note that in this form b denotes the y-intercept and m the slope, whereas in the regression equation y = a + bx the y-intercept is a and the slope is b.
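As a small illustration, the substitution can be written as a one-line function. The function name and the sample slope and point are hypothetical.

```python
def y_intercept(m, x, y):
    """Solve y = m*x + b for b, given the slope m and a point (x, y) on the line."""
    return y - m * x


# Hypothetical example: slope 2 and a known point (3, 10) on the line.
intercept = y_intercept(2, 3, 10)
print(intercept)  # prints 4
```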

4. What is a Regression Model Example?

A simple linear regression plot for the amount of rainfall is one example of a regression model. Regression analysis can also be used in statistics to find trends in data (insights). For example, you might guess that there's a connection between how much you eat and how much you weigh; regression analysis can help you quantify that.

5. What are the prerequisites needed for regression analysis using the Linear Regression Formula?

The regression analysis using the linear regression formula is valid only when the following conditions have been satisfied:

1. The dependent variable Y should have a linear relationship with the independent variable X. To check this, make sure that the XY scatter plot is linear and that the residual plot shows a random pattern.

2. For each value of X, the probability distribution of Y has the same standard deviation. When this condition is satisfied, the variability of the residuals is relatively constant over all the values of X considered, which can be checked easily with a residual plot.

6. What is the coefficient of determination for a linear regression model?

The coefficient of determination is one of the main results of regression analysis. Its properties are as follows:

1. The coefficient of determination ranges from 0 to 1.

2. A coefficient of determination of 0 means that the dependent variable cannot be predicted from the independent variable.

3. A coefficient of determination of 1 means that the dependent variable can be predicted from the independent variable without any error.

4. A value between 0 and 1 indicates the extent to which the dependent variable is predictable.
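The text does not state how the coefficient of determination is computed. The sketch below assumes the standard definition R² = 1 - SSE/SST, where SSE is the sum of squared errors of prediction and SST is the sum of squared differences between each Y value and the mean of Y. It uses the data from Question 1.

```python
# Data from Question 1.
xs = [2, 3, 5, 8]
ys = [3, 6, 5, 12]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
a = (sum_y - b * sum_x) / n                                   # y-intercept

mean_y = sum_y / n
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
sst = sum((y - mean_y) ** 2 for y in ys)                   # total variation

# Assumed standard definition of the coefficient of determination.
r_squared = 1 - sse / sst

print(round(r_squared, 4))  # prints 0.8296
```

For this data about 83% of the variation in Y is accounted for by the fitted line.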