What is regression in Machine Learning?

In ML, regression is the task of predicting a continuous numeric value "Y" from an input "X".

The predicted output value "Y" is called the dependent variable, because its value depends on the input "X", which is the independent variable.

The most common forms of regression are:

  • Simple Linear Regression (one input predicts one output)
y = b0 + b1*x1

When we TRAIN the model:
y = dependent variable (known, taken from the training set)
x = independent variable (known, taken from the training set)
b = coefficients b0 and b1 (unknown, adjusted during the fitting phase to fit the x and y values)

When we TEST the model:
y = dependent variable (unknown, this is the value the model predicts)
x = independent variable (known, taken from the test set)
b = coefficients b0 and b1 (known from training, applied to x to compute y)

[Figure: How the training and test sets are used in Simple Linear Regression]

[Figure: Simple Linear Regression plot, what the model predicts vs. the real values]
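
The train/test split described above can be sketched in a few lines of plain Python. This is only an illustration: the data points and the helper names fit_simple_linear and predict are made up for this example, and the coefficients are found with the standard least-squares formulas rather than with any particular library.

```python
def fit_simple_linear(xs, ys):
    """Return (b0, b1) that minimise the squared error of y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict(b0, b1, x):
    """Apply the fitted coefficients to a new input x."""
    return b0 + b1 * x


# TRAIN: x and y are both known, b0 and b1 are what we are looking for
# (hypothetical toy data)
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [3.1, 4.9, 7.2, 8.8, 11.0]
b0, b1 = fit_simple_linear(x_train, y_train)

# TEST: x and the coefficients are known, y is predicted
x_test = [6.0, 7.0]
y_pred = [predict(b0, b1, x) for x in x_test]
print("b0 =", b0, "b1 =", b1)
print("predictions:", y_pred)
```

In a real project the predictions for the test set would then be compared with the real test values to judge how well the model generalises.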

  • Multiple Linear Regression (more inputs predict one output)
y = b0 + b1*x1 + b2*x2 + ... + bn*xn

As you can see, here we have "n" independent variables "x". This is why it is called multiple: several independent variables together determine the dependent variable "y".

[Figure: How the model uses multiple independent variables to predict the dependent variable "y"]

To show a multiple linear regression result on a 2D plot, you need to reduce the number of dimensions: with x1, x2 and y you have three dimensions, so you can plot only two at a time, such as x1 against y or x2 against y.
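
As a rough sketch of how the coefficients of a multiple linear regression can be found, the example below uses NumPy's least-squares solver. The data is hypothetical and generated without noise from y = 1 + 2*x1 + 3*x2, so the fit should recover those three coefficients; real data would only give approximate values.

```python
import numpy as np

# Hypothetical training set: two independent variables (x1, x2) per row
X_train = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
# Noise-free targets built from y = 1 + 2*x1 + 3*x2 (assumed for the demo)
y_train = 1.0 + 2.0 * X_train[:, 0] + 3.0 * X_train[:, 1]

# Prepend a column of ones so that b0 is estimated together with b1 and b2
A_train = np.column_stack([np.ones(len(X_train)), X_train])
b, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Predict y for an unseen input using the fitted coefficients
X_test = np.array([[6.0, 1.0]])
A_test = np.column_stack([np.ones(len(X_test)), X_test])
y_pred = A_test @ b
print("coefficients:", b)
print("prediction:", y_pred)
```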

  • Polynomial Linear Regression (parabolic functions/predictions)
y = b0 + b1*x1 + b2*x1^2 + ... + bn*x1^n

Here we raise "x" to increasing powers, up to "n". This lets the regression follow a curve (a parabola when n = 2) instead of a straight line.
It is still called linear regression because the equation is linear in the coefficients "b", even though it is not linear in "x".

[Figure: Polynomial Linear Regression]
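
One way to see why polynomial regression still counts as linear: the powers of "x" can be computed first and treated as separate input columns, after which the coefficients are found with the same least-squares step as in ordinary linear regression. The sketch below assumes degree 2 and hypothetical noise-free data generated from y = 1 - 2*x + 0.5*x^2.

```python
import numpy as np

degree = 2  # assumed for the example; gives a parabola

# Hypothetical training data from y = 1 - 2*x + 0.5*x^2
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y_train = 1.0 - 2.0 * x_train + 0.5 * x_train**2

# Columns are x^0, x^1, x^2: the model is linear in b once these are fixed
A_train = np.vander(x_train, degree + 1, increasing=True)
b, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Predict for a new x by building the same powers and applying b
x_test = np.array([4.0])
y_pred = np.vander(x_test, degree + 1, increasing=True) @ b
print("coefficients b0, b1, b2:", b)
print("prediction:", y_pred)
```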

Because the "b"s are unknown, our goal is to find their values so that we can plug in "x" and predict "y".

That's why linear/non-linear refers to the coefficients, not to "x".
Example of a non-linear model:

y = b0 + b1*x1 / (b2 + x2)

[Figure: Non-Linear Function y = b0 + b1*x1 / (b2 + x2)]

No substitution of the coefficients turns this equation into one that is linear in the coefficients "b" (what matters is linearity in the "b"s, not in the "x" values).
In these cases we can use solutions such as Support Vector Regression (SVR).
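
A quick numeric check can make the "linear in the coefficients" idea more concrete. If a model is linear in "b", then evaluating it with the sum of two coefficient sets gives the same result as evaluating it with each set and adding the outputs. The sketch below tries this on a degree-2 polynomial and on the non-linear example, assuming the denominator of that example is (b2 + x2). The function names and the numbers are made up for the illustration.

```python
def polynomial_model(b, x):
    # y = b0 + b1*x + b2*x^2, linear in b0, b1, b2
    return b[0] + b[1] * x + b[2] * x**2


def rational_model(b, x1, x2):
    # y = b0 + b1*x1 / (b2 + x2); b2 sits in the denominator (assumed grouping)
    return b[0] + b[1] * x1 / (b[2] + x2)


def add_coefficients(b, c):
    """Element-wise sum of two coefficient lists."""
    return [bi + ci for bi, ci in zip(b, c)]


# Two arbitrary coefficient sets and arbitrary inputs (hypothetical values)
b = [1.0, 2.0, 3.0]
c = [0.5, 1.0, 2.0]
x1, x2 = 2.0, 1.0

# Polynomial: model(b + c) equals model(b) + model(c)
poly_sum = polynomial_model(add_coefficients(b, c), x1)
poly_parts = polynomial_model(b, x1) + polynomial_model(c, x1)

# Rational: the same identity fails, so the model is not linear in b
rat_sum = rational_model(add_coefficients(b, c), x1, x2)
rat_parts = rational_model(b, x1, x2) + rational_model(c, x1, x2)

print("polynomial:", poly_sum, "vs", poly_parts)
print("rational:  ", rat_sum, "vs", rat_parts)
```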

Thanks to SuperDataScience for their ML courses on Udemy!

A simple Machine Learning exercise in Python using Seaborn

Here you can find a TensorFlow tutorial with explanations!

python template simple linear regression