What is a training set? What is a test set?
Machine learning models work on datasets. A model observes the data and "learns" the patterns in it. This is how the model learns to make numerical predictions (regression article link) or to assign classes (classification article link).
"A data set is a collection of data." (Wikipedia)
For example, take an Age-Salary dataset, where you can see that the higher the age, the higher the salary tends to be.
To make our model learn this pattern, and to check later how well it learned it, we split the dataset into a training set and a test set.
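A split like this can be sketched in a few lines of plain Python. The Age-Salary numbers, the `split_dataset` helper, and the 80/20 ratio are assumptions made up for illustration; libraries such as scikit-learn offer a ready-made `train_test_split` function for the same job.

```python
import random

# Hypothetical Age-Salary records, invented purely for illustration.
dataset = [
    (22, 25000), (25, 30000), (28, 34000), (31, 39000), (35, 45000),
    (38, 48000), (42, 54000), (46, 60000), (50, 65000), (55, 72000),
]

def split_dataset(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then cut it into (training set, test set)."""
    shuffled = list(rows)                      # leave the original list untouched
    random.Random(seed).shuffle(shuffled)      # fixed seed makes the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train_set, test_set = split_dataset(dataset)
print(len(train_set), "training rows,", len(test_set), "test rows")
```

Shuffling before cutting matters here: the rows are sorted by age, so without it the test set would contain only the youngest people.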
We fit our model to the training set; this is how the model trains and learns the patterns. Now it knows that the older you get, the more you earn.
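As a rough sketch of what "fitting" can mean, the snippet below fits a straight line to a training set using ordinary least squares. The choice of a straight-line model, the `fit_line` helper, and the training values are all assumptions for illustration; the article does not say which model is used.

```python
# Hypothetical training set of (age, salary) pairs, invented for illustration.
train_set = [(22, 25000), (28, 34000), (35, 45000), (42, 54000), (50, 65000)]

def fit_line(rows):
    """Least-squares fit of: salary = slope * age + intercept."""
    n = len(rows)
    mean_age = sum(age for age, _ in rows) / n
    mean_salary = sum(salary for _, salary in rows) / n
    # Sum of products of deviations (an unnormalised covariance).
    cross_dev = sum((age - mean_age) * (salary - mean_salary) for age, salary in rows)
    # Sum of squared age deviations (an unnormalised variance).
    square_dev = sum((age - mean_age) ** 2 for age, _ in rows)
    slope = cross_dev / square_dev
    intercept = mean_salary - slope * mean_age
    return slope, intercept

slope, intercept = fit_line(train_set)
print(f"salary ~ {slope:.1f} * age + {intercept:.1f}")
```

A positive slope is the learned pattern: each extra year of age adds roughly `slope` to the predicted salary.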
The test set is used later: for each Age in the test set, the model predicts a Salary without seeing the real Salary value.
These predictions are based only on what the model learned during training.
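The real Salary values of the test set are then used to evaluate our model: we compare them with the predictions to see how accurate the model is. Since Salary is a number, this means measuring how close each prediction is to the real value, rather than counting exact matches.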
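The predict-then-evaluate step can be sketched as follows. The model parameters, the test values, and the choice of mean absolute error as the metric are assumptions for illustration; other error measures would work just as well.

```python
# Hypothetical line parameters, standing in for what training produced.
slope, intercept = 1400.0, -5000.0

# Hypothetical test set of (age, real salary) pairs, invented for illustration.
test_set = [(25, 30000), (40, 52000), (55, 72000)]

def predict(age):
    """Predict a salary from an age using the learned line."""
    return slope * age + intercept

# The model only receives the Age values; the real salaries stay hidden.
predictions = [predict(age) for age, _ in test_set]

# Only now are the real salaries used, to measure how far off each prediction is.
errors = [abs(predicted - real) for predicted, (_, real) in zip(predictions, test_set)]
mean_absolute_error = sum(errors) / len(errors)

print(f"mean absolute error: {mean_absolute_error:.2f}")
```

The lower the mean absolute error, the closer the predicted salaries are, on average, to the real ones.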