In statistical learning theory, or sometimes computational learning theory, the VC dimension (for Vapnik–Chervonenkis dimension) is a measure of the capacity of a statistical classification algorithm, defined as the cardinality of the largest set of points that the algorithm can shatter. It is a core concept in Vapnik–Chervonenkis theory, and was originally defined by Vladimir Vapnik and Alexey Chervonenkis.
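To make the definition concrete, a small sketch can check shattering exhaustively for a very simple hypothesis class. The example below (a minimal illustration, not from the original text; the helper names are ours) uses one-dimensional threshold classifiers of the assumed form f_t(x) = 1 if x > t, else 0, and verifies that they shatter any single point but no pair of points, so their VC dimension is 1.

```python
from itertools import product

def labelings_by_threshold(points):
    """All labelings of the (sorted) points achievable by classifiers
    f_t(x) = 1 if x > t else 0, as the threshold t sweeps over the reals."""
    points = sorted(points)
    # Only thresholds below all points, between consecutive points, or above
    # all points produce distinct labelings.
    candidates = ([points[0] - 1.0]
                  + [(a + b) / 2.0 for a, b in zip(points, points[1:])]
                  + [points[-1] + 1.0])
    return {tuple(1 if x > t else 0 for x in points) for t in candidates}

def is_shattered(points):
    """True if every possible labeling of the points is achievable."""
    achievable = labelings_by_threshold(points)
    return all(lab in achievable for lab in product((0, 1), repeat=len(points)))

print(is_shattered([3.0]))        # True: a single point can receive either label
print(is_shattered([1.0, 2.0]))   # False: the labeling (1, 0) is unreachable
```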
Informally, the capacity of a classification model is related to how complicated it can be. For example, consider thresholding a high-degree polynomial: a point is classified as positive if the polynomial evaluates to a positive value at that point, and as negative otherwise. A high-degree polynomial can be wiggly, so it can fit a given set of training points well. But one can expect the classifier to make errors on other points, precisely because it is so wiggly. Such a polynomial has a high capacity. A much simpler alternative is to threshold a linear function. This function may fail to fit the training set well, because it has a low capacity. We make this notion of capacity more rigorous below.
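The contrast between the two thresholded models can be illustrated with a short sketch (again an illustration of ours, using NumPy; the function name poly_threshold_fits is hypothetical). A degree-3 polynomial can reproduce an alternating labeling of four points on the line, while a thresholded linear function cannot:

```python
import numpy as np

def poly_threshold_fits(xs, labels, degree):
    """Fit a polynomial of the given degree to targets +1/-1 derived from
    `labels`, then check whether sign(polynomial) reproduces every label."""
    targets = np.where(np.array(labels) == 1, 1.0, -1.0)
    coeffs = np.polyfit(xs, targets, degree)   # least-squares polynomial fit
    preds = np.sign(np.polyval(coeffs, xs))
    return np.array_equal(preds, targets)

xs = np.array([0.0, 1.0, 2.0, 3.0])
labels = [1, 0, 1, 0]                      # alternating labels
print(poly_threshold_fits(xs, labels, 3))  # True: a cubic wiggles through all four
print(poly_threshold_fits(xs, labels, 1))  # False: a line cannot alternate signs
```

The high-capacity cubic fits the training labels exactly; the low-capacity line does not, which is the trade-off the paragraph above describes.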