In statistical learning theory, or sometimes computational learning theory, the VC dimension (for Vapnik–Chervonenkis dimension) is a measure of the capacity of a statistical classification algorithm, defined as the cardinality of the largest set of points that the algorithm can shatter. It is a core concept in Vapnik–Chervonenkis theory, and was originally defined by Vladimir Vapnik and Alexey Chervonenkis.
Informally, the capacity of a classification model is related to how complicated it can be. For example, consider the thresholding of a high-degree polynomial: if the polynomial evaluates above zero, that point is classified as positive, otherwise as negative. A high-degree polynomial can be wiggly, so it can fit a given set of training points well. But one can expect that the classifier will make errors on other points, because it is too wiggly. Such a polynomial has a high capacity. A much simpler alternative is to threshold a linear function. This function may not fit the training set well, because it has a low capacity. We make this notion of capacity more rigorous below.
Read more about VC Dimension: Shattering, Uses
Famous quotes containing the word dimension:
“God cannot be seen: he is too bright for sight; nor grasped: he is too pure for touch; nor measured: for he is beyond all sense, infinite, measureless, his dimension known to himself alone.”
—Marcus Minucius Felix (2nd or 3rd cen. A.D.)