Computational Biology log: Cross Validation and training-validation-test split

Train-Validation-Test split link

1. The training dataset (dataset 1 × K) is used to train a few candidate models
2. The validation dataset (dataset 2 × K) is used to evaluate the candidate models
3. One of the candidates is chosen
4. The chosen model is trained with a new training dataset (dataset 3 = all the data used in steps 1 & 2)
5. The trained model is evaluated with the test dataset (dataset 4: an unseen dataset)

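Steps 1 and 2 together are called cross validation: each candidate model is trained and evaluated K times, each time on a different training/validation dataset, and the average of the K scores is used for the decision at step 3. Ideally these K datasets are all distinct; when there is not enough data for that, we can use k-fold cross validation, where the data is divided into K folds and each fold serves once as the validation set while the remaining K-1 folds are used for training.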
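
The whole procedure can be sketched in a few lines of plain Python. This is only an illustration under assumptions of my own: the toy data, the two candidate models (`fit_mean`, `fit_line`), the helper names, the mean squared error score and K = 5 are all made up for the example and are not from a real analysis.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and cut it into k disjoint, nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_mean(xs, ys):
    """Hypothetical candidate A: ignore x, always predict the mean of y."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Hypothetical candidate B: least-squares line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return lambda x: a * x + b

def mse(model, xs, ys):
    """Mean squared error of a fitted model (lower is better)."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def cross_val_score(fit, xs, ys, k=5):
    """Steps 1 and 2: train on k-1 folds, score on the held-out fold, average."""
    folds = k_fold_indices(len(xs), k)
    scores = []
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([xs[j] for j in train_idx], [ys[j] for j in train_idx])
        scores.append(mse(model, [xs[j] for j in val_idx], [ys[j] for j in val_idx]))
    return mean(scores)

# Toy data (assumption): a noisy straight line.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

# Set the test set (dataset 4) aside before anything else touches it.
order = list(range(len(xs)))
rng.shuffle(order)
test_idx, dev_idx = order[:20], order[20:]
dev_x, dev_y = [xs[j] for j in dev_idx], [ys[j] for j in dev_idx]
test_x, test_y = [xs[j] for j in test_idx], [ys[j] for j in test_idx]

# Steps 1-3: cross-validate every candidate, keep the lowest average error.
candidates = {"mean": fit_mean, "line": fit_line}
cv = {name: cross_val_score(fit, dev_x, dev_y, k=5) for name, fit in candidates.items()}
best = min(cv, key=cv.get)

# Step 4: retrain the chosen candidate on all non-test data (dataset 3).
final_model = candidates[best](dev_x, dev_y)

# Step 5: a single evaluation on the unseen test set.
test_mse = mse(final_model, test_x, test_y)
print(best, cv, test_mse)
```

The point of the layout is that the test data is split off first and used exactly once at the end, so the reported test score is not influenced by the model selection in steps 1 to 3.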


Thursday, June 30, 2022
