So I\'m reading \"Introduction to Machine Learning\" 2nd edition, by Bishop, et.
ID: 649018 • Letter: S
Question
So I'm reading "Introduction to Machine Learning" 2nd edition, by Bishop, et. all. On page 27 they discuss the Vapnik-Chervonenkis Dimension which is,
"The maximum number of points that can be shattered by H [the hypothesis class] is called the Vapnik-Chervonenkis (VC) Dimension of H, is denoted VC(H) and measures the capacity of H."
Whereas "shatters" indicates a hypothesis h?H for a set of N data points such that it separates the positive examples from the negative. In such an example it is said that "H shatters N points".
So far I think I understand this. However, the authors lose me with the following:
"For example, four points on a line cannot be shattered by rectangles."
There must be some concept here I'm not fully understanding, because I cannot understand why this is the case. Can anyone explain this to me?
Explanation / Answer
The definition of "a set P can be shattered by rectangles" is that for every subset of P, there is a rectangle that contains precisely that subset and excludes the rest of P. Equivalently, every labeling of the points as positive and negative is consistent with at least one hypothesis in H.
Now consider four points p,q,r,s along a line in the plane. Since there is no rectangle that contains p and r but excludes q and s, these four points cannot be shattered by rectangles.
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.