Cross-Validation-Wrapper.md

File metadata and controls

24 lines (21 loc) · 2 KB

CROSS-VALIDATION-WRAPPER

AIMA3e

function CROSS-VALIDATION-WRAPPER(Learner, k, examples) returns a hypothesis
local variables: errT, an array, indexed by size, storing training-set error rates
        errV, an array, indexed by size, storing validation-set error rates
for size = 1 to ∞ do
   errT[size], errV[size] ← CROSS-VALIDATION(Learner, size, k, examples)
   if errT has converged then do
     best_size ← the value of size with minimum errV[size]
     return Learner(best_size, examples)


function CROSS-VALIDATION(Learner, size, k, examples) returns two values: average training-set error rate, average validation-set error rate
fold_errT ← 0; fold_errV ← 0
for fold = 1 to k do
   training_set, validation_set ← PARTITION(examples, fold, k)
   h ← Learner(size, training_set)
   fold_errT ← fold_errT + ERROR-RATE(h, training_set)
   fold_errV ← fold_errV + ERROR-RATE(h, validation_set)
return fold_errT/k, fold_errV/k
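The CROSS-VALIDATION procedure can be sketched in runnable Python. The pseudocode leaves PARTITION, ERROR-RATE, and the Learner abstract, so the versions below are illustrative assumptions: examples are (x, label) pairs, the partition scheme sends every k-th example to the validation fold, and the toy learner in the usage lines simply predicts the majority label of its training set.

```python
def partition(examples, fold, k):
    """Split examples into (training_set, validation_set) for this fold.
    The validation set holds roughly N/k examples, and the split is
    different for each value of fold (here: indices i with i % k == fold-1)."""
    validation = [ex for i, ex in enumerate(examples) if i % k == fold - 1]
    training = [ex for i, ex in enumerate(examples) if i % k != fold - 1]
    return training, validation

def error_rate(h, examples):
    """Fraction of (x, y) examples that hypothesis h misclassifies."""
    return sum(1 for x, y in examples if h(x) != y) / len(examples)

def cross_validation(learner, size, k, examples):
    """Return (average training error, average validation error) over k folds."""
    fold_err_t = fold_err_v = 0.0
    for fold in range(1, k + 1):
        training_set, validation_set = partition(examples, fold, k)
        h = learner(size, training_set)            # h <- Learner(size, training_set)
        fold_err_t += error_rate(h, training_set)
        fold_err_v += error_rate(h, validation_set)
    return fold_err_t / k, fold_err_v / k

# Illustrative usage with a toy learner that ignores size and always
# predicts the majority label seen in its training set:
def majority_learner(size, training_set):
    labels = [y for _, y in training_set]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

examples = [(i, 0) for i in range(9)] + [(9, 1)]   # nine 0-labels, one 1-label
avg_err_t, avg_err_v = cross_validation(majority_learner, 1, 5, examples)
```

Note that the hypothesis is evaluated on both subsets in each fold: the averaged errT drives the wrapper's convergence test, while the averaged errV drives model selection.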


Figure ?? An algorithm to select the model with the lowest error rate on validation data: it builds models of increasing complexity and chooses the one with the best empirical error rate on the validation data. Here errT means error rate on the training data, and errV means error rate on the validation data. Learner(size, examples) returns a hypothesis whose complexity is set by the parameter size, and which is trained on the examples. PARTITION(examples, fold, k) splits examples into two subsets: a validation set of size N/k and a training set with all the other examples. The split is different for each value of fold.
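Putting the two procedures together, here is a minimal self-contained Python sketch of CROSS-VALIDATION-WRAPPER. The pseudocode's unbounded loop (size = 1 to ∞) and its "errT has converged" test are left abstract, so this sketch substitutes an assumed size cap (max_size) and a simple small-change test (tol) on successive errT values; the inlined partition, error_rate, and nearest-neighbour learner are likewise illustrative assumptions, not part of the pseudocode.

```python
def partition(examples, fold, k):
    """Validation set = every k-th example (offset by fold); rest is training."""
    validation = [ex for i, ex in enumerate(examples) if i % k == fold - 1]
    training = [ex for i, ex in enumerate(examples) if i % k != fold - 1]
    return training, validation

def error_rate(h, examples):
    return sum(1 for x, y in examples if h(x) != y) / len(examples)

def cross_validation(learner, size, k, examples):
    fold_err_t = fold_err_v = 0.0
    for fold in range(1, k + 1):
        training_set, validation_set = partition(examples, fold, k)
        h = learner(size, training_set)
        fold_err_t += error_rate(h, training_set)
        fold_err_v += error_rate(h, validation_set)
    return fold_err_t / k, fold_err_v / k

def cross_validation_wrapper(learner, k, examples, max_size=5, tol=1e-3):
    """Grow model complexity until errT stops changing (or max_size is hit),
    then retrain on all examples at the size with the lowest errV."""
    err_t, err_v = {}, {}
    for size in range(1, max_size + 1):
        err_t[size], err_v[size] = cross_validation(learner, size, k, examples)
        converged = size > 1 and abs(err_t[size] - err_t[size - 1]) < tol
        if converged or size == max_size:
            best_size = min(err_v, key=err_v.get)   # size minimising errV
            return learner(best_size, examples)

# Illustrative usage: size = number of neighbours in a toy nearest-neighbour
# learner on 1-D points labelled by the rule x >= 5.
def knn_learner(size, training_set):
    def h(x):
        nearest = sorted(training_set, key=lambda ex: abs(ex[0] - x))[:size]
        labels = [y for _, y in nearest]
        return max(set(labels), key=labels.count)
    return h

examples = [(x, int(x >= 5)) for x in range(10)]
h = cross_validation_wrapper(knn_learner, 5, examples)
```

The final call to learner(best_size, examples) mirrors the pseudocode's last line: once the best complexity is known, the model is retrained on all the examples rather than on any single training fold.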