
MATLAB : Partition the dataset into 3 groups: 80% for training and cross validat


Question

MATLAB: Partition the dataset into 3 groups: 80% for training and cross validation (to be split later) and 20% for testing. Hint: use cvpartition with the 'holdout' option, holding out 20% of the dataset. Call your variables dataTrain (173-by-4000), grpTrain (173-by-1), dataTest (43-by-4000), and grpTest (43-by-1).

Variable    Value
grp         216x1 cell
obs         216x4000 single

Here is what I have so far. I'm unsure if this is right, so any help would be appreciated. Keep in mind I'm using MATLAB.

holdoutCVP = cvpartition(grp,'holdout',56);

dataTrain = obs(holdoutCVP.training,:);

grpTrain = grp(holdoutCVP.training);

Explanation / Answer

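c = cvpartition(n,'HoldOut',p) randomly partitions n observations into a training set and a test set. If p is a scalar between 0 and 1, it is the fraction of observations held out for the test set (for example, 0.2 for 20%). If p is an integer, it is the number of observations held out for the test set. Passing the grouping variable grp in place of n makes the split stratified, so each group appears in roughly the same proportion in both sets.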
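As a minimal sketch of the two ways to specify p (it needs the Statistics and Machine Learning Toolbox), the snippet below builds one partition from a fraction and one from a count. The names cFrac and cCount and the rng call are illustrative additions, not part of the original answer.

```matlab
n = 216;                                % number of observations, as in the question
rng(0);                                 % assumption: fix the seed so the split is repeatable

cFrac  = cvpartition(n,'HoldOut',0.2);  % p as a fraction: about 20% of n held out
cCount = cvpartition(n,'HoldOut',43);   % p as an integer: 43 observations held out

% TrainSize and TestSize report how many observations fall in each set
fprintf('fraction: %d train, %d test\n', cFrac.TrainSize,  cFrac.TestSize);
fprintf('count:    %d train, %d test\n', cCount.TrainSize, cCount.TestSize);
```

Passing the count directly is a reasonable choice when the assignment fixes the exact set sizes, since it removes any doubt about how a fractional result is rounded.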

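Here, n = 216, and 20% of 216 is 43.2, so hold out 43 observations for testing and keep the remaining 173 for training. The 56 in the attempt above would hold out 56 observations (about 26% of the data) instead of 20%.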
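The arithmetic can be checked in a few lines of base MATLAB. This is only a sketch; the variable names nTest, nTrain, and fracAttempt are illustrative.

```matlab
n = 216;                  % total number of observations
nTest  = round(0.2 * n);  % 0.2 * 216 = 43.2, rounded to 43
nTrain = n - nTest;       % 216 - 43 = 173
fracAttempt = 56 / n;     % share held out by the 56 in the question's attempt, about 0.26
fprintf('%d train, %d test, attempt holds out %.1f%%\n', nTrain, nTest, 100*fracAttempt);
```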

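c = cvpartition(grp,'HoldOut',43);   % stratified holdout: 43 observations for testing
dataTrain = obs(training(c),:);      % rows of obs in the training set
grpTrain = grp(training(c));         % matching training labels
dataTest = obs(test(c),:);           % rows of obs in the test set
grpTest = grp(test(c));              % matching test labels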

Here, training(c) returns a logical vector with one element per observation: 1 (true) if the observation belongs to the training set and 0 (false) otherwise.

Similarly, test(c) returns the logical vector that marks the test observations.
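To see the whole split run end to end, here is a sketch that uses synthetic stand-ins for obs and grp, since the real dataset is not shown. The random data, the made-up labels 'A' and 'B', the names trainIdx and testIdx, and the rng call are all assumptions for illustration. It needs the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-ins (assumptions): the real obs and grp come from the question's dataset
rng(1);                                  % make the random split repeatable
obs = rand(216, 4000, 'single');         % same size and class as obs in the question
grp = repmat({'A'; 'B'}, 108, 1);        % 216-by-1 cell array of made-up group labels

c = cvpartition(grp, 'HoldOut', 43);     % hold out 43 observations for testing
trainIdx = training(c);                  % logical vector, true for training rows
testIdx  = test(c);                      % logical vector, true for test rows

dataTrain = obs(trainIdx, :);   grpTrain = grp(trainIdx);
dataTest  = obs(testIdx, :);    grpTest  = grp(testIdx);

% Print the sizes to compare against the 173 and 43 rows the question asks for
disp(size(dataTrain));
disp(size(dataTest));

% Every observation should land in exactly one of the two sets
assert(all(xor(trainIdx, testIdx)));
```

Storing the index vectors once and reusing them keeps the data and label subsets aligned row for row.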