Train-Validation-Test split
1. Train several candidate models on the training dataset (dataset 1, in K versions).
2. Evaluate each candidate model on the validation dataset (dataset 2, in K versions).
3. Choose one of the candidates.
4. Retrain the chosen model on a new training dataset (dataset 3: all the data used in steps 1 and 2).
5. Evaluate the retrained model on the test dataset (dataset 4: data not seen in any earlier step).
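Steps 1 and 2 together are called cross-validation: each candidate model is trained and evaluated K times, each time on a different training/validation pair, and the average score is used for the decision in step 3. Ideally the K pairs are independent datasets; when data is limited, k-fold cross-validation produces them by partitioning one dataset into K folds and using each fold once as the validation set.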
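
The whole procedure can be sketched in plain Python. This is only an illustration under stated assumptions: the data is synthetic, the two candidate models (a constant predictor and a one-variable least-squares line) are stand-ins, and every name here (`k_fold_indices`, `cv_score`, `fit_mean`, `fit_line`) is hypothetical rather than taken from any library.

```python
import random
from statistics import mean

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs that split range(n) into k folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

def fit_mean(xs, ys):
    """Candidate A: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate B: ordinary least-squares line through the training points."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return mean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])

def cv_score(fit, xs, ys, k):
    """Steps 1-2: train and validate k times, then average the scores."""
    scores = []
    for train, val in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mse(model, [xs[i] for i in val], [ys[i] for i in val]))
    return mean(scores)

# Synthetic data (an assumption for this sketch): y = 2x + 1 plus noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(120)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

# Hold out the test set (dataset 4) before any model selection.
dev_x, dev_y = xs[:100], ys[:100]
test_x, test_y = xs[100:], ys[100:]

# Steps 1-3: score each candidate with 5-fold CV and pick the lowest error.
candidates = {"mean": fit_mean, "line": fit_line}
cv_scores = {name: cv_score(fit, dev_x, dev_y, k=5)
             for name, fit in candidates.items()}
best_name = min(cv_scores, key=cv_scores.get)

# Step 4: retrain the chosen model on all data used in steps 1 and 2.
final_model = candidates[best_name](dev_x, dev_y)

# Step 5: evaluate once on the untouched test set.
test_mse = mse(final_model, test_x, test_y)
print(best_name, cv_scores, test_mse)
```

The folds here are contiguous slices, which is fine because the synthetic points are already in random order; real data would normally be shuffled before splitting. The test set is used only once, after the choice in step 3, so its score is not influenced by model selection.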