Nonlinear Classification
The original optimal hyperplane algorithm proposed by Vapnik in 1963 was a linear classifier. However, in 1992, Bernhard E. Boser, Isabelle M. Guyon and Vladimir N. Vapnik suggested a way to create nonlinear classifiers by applying the kernel trick (originally proposed by Aizerman et al.) to maximum-margin hyperplanes. The resulting algorithm is formally similar, except that every dot product is replaced by a nonlinear kernel function. This allows the algorithm to fit the maximum-margin hyperplane in a transformed feature space. The transformation may be nonlinear and the transformed space high-dimensional; although the classifier is a hyperplane in the transformed feature space, it may therefore be nonlinear in the original input space.
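As a small illustration of a hyperplane in a transformed space acting as a nonlinear classifier in the input space, consider the following sketch. The four XOR-style points, the feature map `phi`, and the weights `w` and `b` are all assumptions chosen by hand for the example; nothing here is trained, and no maximum-margin optimization is performed.

```python
import numpy as np

# Four XOR-style points: the label is +1 when both coordinates share a sign.
# No straight line in the 2-D input space separates these two classes.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]])
y = np.array([1, 1, -1, -1])

def phi(x):
    """Hypothetical explicit feature map: append the product x1 * x2."""
    return np.array([x[0], x[1], x[0] * x[1]])

# Hand-picked hyperplane w . z + b = 0 in the 3-D transformed space.
w = np.array([0.0, 0.0, 1.0])
b = 0.0

# Classify each point by the side of the hyperplane its image falls on.
predictions = np.array([np.sign(w @ phi(x) + b) for x in X]).astype(int)
print(predictions)
```

In the transformed space the decision surface is the plane where the third coordinate is zero; mapped back to the input space it becomes the curve x1 * x2 = 0, which is not a line.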
If the kernel used is a Gaussian radial basis function, the corresponding feature space is an infinite-dimensional Hilbert space. Maximum-margin classifiers are well regularized, so the infinite dimensionality does not spoil the results. Some common kernels include:
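- Polynomial (homogeneous): $k(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j)^d$
- Polynomial (inhomogeneous): $k(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j + 1)^d$
- Gaussian radial basis function: $k(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2)$, for $\gamma > 0$. Sometimes parametrized using $\gamma = 1/(2\sigma^2)$
- Hyperbolic tangent: $k(\mathbf{x}_i, \mathbf{x}_j) = \tanh(\kappa\, \mathbf{x}_i \cdot \mathbf{x}_j + c)$, for some (not every) $\kappa > 0$ and $c < 0$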
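The four kernels listed above can be written as plain functions. This is a minimal sketch assuming the standard textbook forms of each kernel; the function names, the default parameter values, and the sample vectors are illustrative choices, not part of any particular library.

```python
import numpy as np

def polynomial_homogeneous(xi, xj, d=2):
    """(xi . xj)^d"""
    return np.dot(xi, xj) ** d

def polynomial_inhomogeneous(xi, xj, d=2):
    """(xi . xj + 1)^d"""
    return (np.dot(xi, xj) + 1.0) ** d

def gaussian_rbf(xi, xj, gamma=0.5):
    """exp(-gamma * ||xi - xj||^2), with gamma > 0."""
    diff = np.asarray(xi) - np.asarray(xj)
    return np.exp(-gamma * np.dot(diff, diff))

def gamma_from_sigma(sigma):
    """Alternative parametrization: gamma = 1 / (2 * sigma^2)."""
    return 1.0 / (2.0 * sigma ** 2)

def hyperbolic_tangent(xi, xj, kappa=1.0, c=-1.0):
    """tanh(kappa * xi . xj + c); only some kappa > 0, c < 0 give a valid kernel."""
    return np.tanh(kappa * np.dot(xi, xj) + c)

# Arbitrary sample vectors, chosen only to exercise the functions.
xi = np.array([1.0, 2.0])
xj = np.array([3.0, 0.5])

for kernel in (polynomial_homogeneous, polynomial_inhomogeneous,
               gaussian_rbf, hyperbolic_tangent):
    print(kernel.__name__, kernel(xi, xj))
```

Each function takes two input vectors and returns a scalar, which is what allows it to stand in for the dot product in the linear algorithm.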
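The kernel is related to the transform $\varphi(\mathbf{x}_i)$ by the equation $k(\mathbf{x}_i, \mathbf{x}_j) = \varphi(\mathbf{x}_i) \cdot \varphi(\mathbf{x}_j)$. The value $\mathbf{w}$ is also in the transformed space, with $\mathbf{w} = \sum_i \alpha_i y_i \varphi(\mathbf{x}_i)$, where $y_i$ is the class label of training point $\mathbf{x}_i$ and $\alpha_i$ its coefficient in the dual problem. Dot products with $\mathbf{w}$ for classification can again be computed by the kernel trick, i.e. $\mathbf{w} \cdot \varphi(\mathbf{x}) = \sum_i \alpha_i y_i k(\mathbf{x}_i, \mathbf{x})$. However, there does not in general exist a value $\mathbf{w}'$ such that $\mathbf{w} \cdot \varphi(\mathbf{x}) = k(\mathbf{w}', \mathbf{x})$.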
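The relation between the kernel and the transform can be checked numerically for a case where the feature map is known in closed form. The sketch below assumes the homogeneous polynomial kernel of degree 2 on two-dimensional inputs, whose feature map is (x1^2, sqrt(2) x1 x2, x2^2). The training points, labels, and coefficients `alpha` are hypothetical values picked by hand, not the output of an SVM solver.

```python
import numpy as np

def kernel(xi, xj):
    """Homogeneous polynomial kernel of degree 2: (xi . xj)^2."""
    return np.dot(xi, xj) ** 2

def phi(x):
    """Explicit feature map for the degree-2 kernel on 2-D inputs."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

# Hypothetical training set and dual coefficients (not from a real solver).
X_train = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
y_train = np.array([1.0, -1.0, 1.0])
alpha = np.array([0.5, 0.25, 0.75])

# w lives in the transformed space: w = sum_i alpha_i * y_i * phi(x_i).
w = sum(a * yi * phi(xi) for a, yi, xi in zip(alpha, y_train, X_train))

x_new = np.array([2.0, -1.0])

# Explicit route: map x_new into the feature space and take the dot product.
explicit = np.dot(w, phi(x_new))

# Kernel-trick route: never form phi(x_new) or w, only kernel evaluations.
implicit = sum(a * yi * kernel(xi, x_new)
               for a, yi, xi in zip(alpha, y_train, X_train))

print(explicit, implicit)
```

Both routes give the same number up to floating-point rounding. The second route is the one that remains usable when the feature space is too large, or infinite-dimensional as with the Gaussian kernel, for `phi` to be written out.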