Regression Analysis
Regression analysis is a set of statistical processes for estimating the relationships between a dependent variable and one or more independent variables.
The term regression was originally applied to describe biological processes which ‘regress’ toward normative values. Today, we can think of data points regressing toward a mathematically derived normative line or shape.
For example, in the diagram below, the linear regression line estimates the dependent variable y from the independent variable x.
A Generalized Equation
A very generalized regression analysis equation is:
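In conventional notation (the symbols are an assumed, standard choice), the relationship between the five components described in the sections that follow can be written as:

```latex
Y_i = f(X_i, \beta) + e_i
```

Here Y_i is the dependent variable for observation i, f is the function, X_i holds the independent variable(s), \beta holds the unknown parameters, and e_i is the error.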
Dependent Variable
The dependent variable is the quantity that the regression analysis estimates. Its estimates can take the form of data points, lines, or curves. In the diagram above, they are represented by the blue line.
Function
The function is used to generate the result. In the diagram above, it is a linear function generating a line. Examples of regression analysis functions are shown below. See Linear Regression for a more detailed function example.
Independent Variable(s)
The independent variables are the input data from which the dependent variable is estimated. In the diagram above, these are the x values of the red dots.
Unknown Parameters
The unknown parameters of the function are estimated during model training so that the function's output fits the observed data as closely as possible.
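As a minimal sketch of how parameters can be set, assume the function is a straight line, y = intercept + slope * x, and that "optimal" means ordinary least squares. The function name `fit_line` and the data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = intercept + slope * x
    by ordinary least squares (closed form for one independent variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

Other regression functions generally need an iterative or matrix-based fitting procedure; the closed form above applies only to this simple one-variable case.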
Error
Because the function's output is only an estimate of the dependent variable, each estimate carries an error. In the diagram above, the error is the vertical (y axis) distance between each red dot and the blue line.
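The error can be sketched in a few lines, assuming a line whose parameters are already known. The slope, intercept, and data points are hypothetical values, and mean squared error is just one common way to summarize the errors.

```python
def predict(x, slope, intercept):
    """Value of the line (the estimate) at x."""
    return intercept + slope * x


def residuals(xs, ys, slope, intercept):
    """Vertical distance (observed y minus estimated y) for each data point."""
    return [y - predict(x, slope, intercept) for x, y in zip(xs, ys)]


def mean_squared_error(errors):
    """Average of the squared errors; smaller means a closer fit."""
    return sum(e * e for e in errors) / len(errors)


# Hypothetical observations scattered around the assumed line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.5, 2.5, 5.5, 6.5]

errors = residuals(xs, ys, slope=2.0, intercept=1.0)
mse = mean_squared_error(errors)
print(errors)  # [0.5, -0.5, 0.5, -0.5]
print(mse)     # 0.25
```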
Examples of Regression Analysis Functions
Examples of types of regression analysis include:
Binomial Regression - regression analysis in which the dependent variable has a binomial distribution, a discrete distribution over the number of successes in a fixed number of independent yes/no trials
Lasso Regression - (least absolute shrinkage and selection operator) regression method that performs variable selection and regularization to improve accuracy
Linear Regression - a linear approach to modeling the relationship between a dependent variable and one or more independent variables
Nonlinear Regression - a form of regression analysis in which the dependent variable is modeled by a non-linear function
Polynomial Regression - a form of regression analysis in which the dependent variable is modeled as an nth degree polynomial
Ridge Regression - regularized regression method that improves the efficiency of parameter estimation in exchange for a tolerable amount of bias (see bias–variance tradeoff)
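To make the regularized entries in the list above more concrete, here is a minimal sketch comparing ordinary least squares, ridge, and lasso for the simplest possible model: one independent variable and no intercept, y = b * x. It assumes the penalized objectives sum((y - b*x)^2) + lam * b^2 for ridge and sum((y - b*x)^2) + lam * |b| for lasso; libraries often scale these penalties differently, and the function names and data are hypothetical.

```python
import math


def ols_coefficient(xs, ys):
    """Least-squares coefficient for y = b * x (no penalty)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


def ridge_coefficient(xs, ys, lam):
    """Minimizer of sum((y - b*x)^2) + lam * b^2: shrinks b toward zero."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def lasso_coefficient(xs, ys, lam):
    """Minimizer of sum((y - b*x)^2) + lam * |b| (soft thresholding):
    shrinks b and sets it exactly to zero once lam is large enough."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return math.copysign(shrunk, sxy) / sxx


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(ols_coefficient(xs, ys))          # 2.0
print(ridge_coefficient(xs, ys, 14.0))  # 1.0
print(lasso_coefficient(xs, ys, 28.0))  # 1.0
print(lasso_coefficient(xs, ys, 56.0))  # 0.0
```

Ridge only shrinks the coefficient, while lasso can drive it all the way to zero, which is why lasso is described as performing variable selection.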
Regression vs. Classification
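Regression produces continuous numeric predictions. Classification produces discrete category predictions. For example, predicting a house's sale price is a regression problem, while predicting whether an email is spam is a classification problem.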
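The difference in output type can be sketched as follows. Both functions and their parameter values are hypothetical, and thresholding a numeric score is only one simple way to obtain a category, used here purely to contrast the two kinds of output.

```python
def predict_value(x, slope=2.0, intercept=1.0):
    """Regression-style output: a continuous number."""
    return intercept + slope * x


def predict_category(x, threshold=4.0):
    """Classification-style output: one label from a fixed set."""
    return "high" if predict_value(x) >= threshold else "low"


print(predict_value(1.25))     # 3.5
print(predict_category(1.25))  # low
print(predict_value(2.0))      # 5.0
print(predict_category(2.0))   # high
```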