Training Set is a subset of the dataset used to build predictive models.
Validation Set is a subset of the dataset used to assess the performance of model built in the training phase
– It provides a test platform for fine-tuning model’s parameters and selecting the best performing model
– Not all modeling algorithms need a validation set
Test set or unseen examples is a subset of the dataset to assess the likely future performance of a model.
– If a model fits the training set much better than it fits the test set. Overfitting is probably the cause
Binary Classification(two class classification)
true|false, 1|0, -1|+1, male|female
Multi-class classification problems can be seen as binary classification problems.