Support vector machine (SVM)
Saturday, June 28th, 2008

SVM is based on the structural risk minimization principle from statistical learning theory (Vapnik 1995; Burges 1998; Evgeniou and Pontil 2001). A compound is represented by a vector $\mathbf{x}_i$ of its molecular descriptors. In linearly separable cases, SVM constructs a hyperplane that separates the two classes of compounds with maximum margin. This is accomplished by finding a vector $\mathbf{w}$ and a parameter $b$ that minimize

$$\lVert \mathbf{w} \rVert^2$$

and satisfy the following conditions:

$$\mathbf{w} \cdot \mathbf{x}_i + b \ge +1 \quad \text{for } y_i = +1$$

$$\mathbf{w} \cdot \mathbf{x}_i + b \le -1 \quad \text{for } y_i = -1$$

where $y_i$ is the class index of compound $i$, $\mathbf{w}$ is a vector normal to the hyperplane, $|b| / \lVert \mathbf{w} \rVert$ is the perpendicular distance from the hyperplane to the origin and $\lVert \mathbf{w} \rVert$ is the Euclidean norm of $\mathbf{w}$. Once $\mathbf{w}$ and $b$ have been determined, a compound with vector $\mathbf{x}$ is classified by:

$$f(\mathbf{x}) = \operatorname{sign}(\mathbf{w} \cdot \mathbf{x} + b)$$

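The linear case can be illustrated with a minimal Python sketch. It assumes that w and b are already known and only checks the margin conditions and applies the sign rule; it does not solve the optimization problem. The two-descriptor toy data, the values of w and b, and the function names are hypothetical and chosen purely for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def classify_linear(w, b, x):
    """Return +1 or -1 according to the sign of w.x + b.

    A value of exactly zero (a point on the hyperplane) is assigned
    to the positive class here; that tie-break is an assumption.
    """
    return 1 if dot(w, x) + b >= 0 else -1


def satisfies_margin(w, b, xs, ys):
    """Check y_i * (w.x_i + b) >= 1 for every training compound."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in zip(xs, ys))


def margin_width(w):
    """Distance between the two margin hyperplanes, 2 / ||w||."""
    return 2.0 / math.sqrt(dot(w, w))


# Hypothetical toy data: two descriptors per compound.
xs = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
ys = [1, 1, -1, -1]
w, b = (0.5, 0.5), -1.0

print(satisfies_margin(w, b, xs, ys))
print(classify_linear(w, b, (4.0, 1.0)))
print(classify_linear(w, b, (0.0, 1.0)))
```

Both margin conditions are folded into the single test y_i (w · x_i + b) ≥ 1, which is equivalent to the two class-wise inequalities.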
In non-linearly separable cases, SVM maps the vectors into a higher dimensional feature space using a kernel function $K(\mathbf{x}_i, \mathbf{x}_j)$. The table below lists three commonly used kernel functions. The Gaussian radial basis function kernel has been used extensively in a number of studies with good results (Burbidge et al. 2001; Czerminski et al. 2001; Trotter et al. 2001).
Commonly used kernel functions
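| Kernel | Equation |
|---|---|
| Polynomial | $K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j + 1)^p$ |
| Gaussian radial basis function | $K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\lVert \mathbf{x}_j - \mathbf{x}_i \rVert^2 / (2\sigma^2)\right)$ |
| Sigmoidal | $K(\mathbf{x}_i, \mathbf{x}_j) = \tanh(\kappa\, \mathbf{x}_i \cdot \mathbf{x}_j - \delta)$ |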
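
The three kernels can be written as short Python functions. This is a minimal sketch that assumes the common textbook parameterizations: degree p for the polynomial kernel, width sigma for the Gaussian kernel, and kappa and delta for the sigmoidal kernel. The parameter names and default values are illustrative assumptions, not values prescribed by the studies cited above.

```python
import math


def dot(u, v):
    """Dot product of two equal-length descriptor vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(xi, xj, p=2):
    """Polynomial kernel: (xi . xj + 1) ** p."""
    return (dot(xi, xj) + 1.0) ** p


def gaussian_rbf_kernel(xi, xj, sigma=1.0):
    """Gaussian RBF kernel: exp(-||xj - xi||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigmoid_kernel(xi, xj, kappa=1.0, delta=0.0):
    """Sigmoidal kernel: tanh(kappa * (xi . xj) - delta)."""
    return math.tanh(kappa * dot(xi, xj) - delta)


# Hypothetical descriptor vectors for two compounds.
a = (1.0, 2.0)
c = (3.0, 4.0)
print(polynomial_kernel(a, c))
print(gaussian_rbf_kernel(a, c))
print(sigmoid_kernel(a, c, kappa=0.1))
```

The Gaussian kernel depends only on the distance between the two vectors, so it equals 1 for identical compounds and decays towards 0 as they move apart; sigma controls how quickly.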
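A linear support vector machine is then applied in this feature space, and the decision function is given by:

$$f(\mathbf{x}) = \operatorname{sign}\left(\sum_{i=1}^{l} \alpha_i^0 y_i K(\mathbf{x}, \mathbf{x}_i) + b\right)$$

where $l$ is the number of support vectors and the coefficients $\alpha_i^0$ and $b$ are determined by maximizing the following Lagrangian expression:

$$\sum_{i} \alpha_i - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)$$

under the following conditions:

$$0 \le \alpha_i \le C, \qquad \sum_{i} \alpha_i y_i = 0$$

where $C$ is a penalty for training errors. A positive or negative value of the decision function indicates that the compound with vector $\mathbf{x}$ belongs to the positive or negative class, respectively.
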
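The kernel decision function and the constraints on the coefficients can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a Gaussian kernel with width sigma, and hypothetical support vectors, coefficients and b that were picked by hand rather than obtained by maximizing the Lagrangian. A real application would obtain them from a quadratic programming solver or an SVM library.

```python
import math


def rbf_kernel(xi, xj, sigma=1.0):
    """Gaussian RBF kernel; sigma is an assumed width parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_value(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Sum over support vectors of alpha_i * y_i * K(x, x_i), plus b."""
    return sum(
        a * y * kernel(x, sv)
        for sv, y, a in zip(support_vectors, labels, alphas)
    ) + b


def classify(x, support_vectors, labels, alphas, b, kernel=rbf_kernel):
    """Return +1 or -1 from the sign of the decision value (zero maps to +1)."""
    value = decision_value(x, support_vectors, labels, alphas, b, kernel)
    return 1 if value >= 0 else -1


def dual_objective(points, labels, alphas, kernel=rbf_kernel):
    """Lagrangian to maximize: sum(alpha) - 0.5 * sum_ij a_i a_j y_i y_j K_ij."""
    n = len(points)
    pair_sum = sum(
        alphas[i] * alphas[j] * labels[i] * labels[j] * kernel(points[i], points[j])
        for i in range(n)
        for j in range(n)
    )
    return sum(alphas) - 0.5 * pair_sum


def satisfies_constraints(labels, alphas, C, tol=1e-9):
    """Check 0 <= alpha_i <= C and sum(alpha_i * y_i) == 0 within a tolerance."""
    in_box = all(-tol <= a <= C + tol for a in alphas)
    balanced = abs(sum(a * y for a, y in zip(alphas, labels))) <= tol
    return in_box and balanced


# Hypothetical values, chosen by hand for illustration only.
support_vectors = [(2.0, 2.0), (0.0, 0.0)]
labels = [1, -1]
alphas = [1.0, 1.0]
b = 0.0

print(satisfies_constraints(labels, alphas, C=10.0))
print(classify((2.0, 1.5), support_vectors, labels, alphas, b))
print(classify((0.5, 0.0), support_vectors, labels, alphas, b))
```

A larger C lets individual coefficients grow, which penalizes training errors more heavily; a smaller C tolerates more errors in exchange for a wider margin.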
References
- Burbidge R, Trotter M, Buxton B and Holden S (2001). Drug design by machine learning: support vector machines for pharmaceutical data analysis. Computers and Chemistry 26(1): 5-14.
- Burges CJC (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2(2): 127-167.
- Czerminski R, Yasri A and Hartsough D (2001). Use of support vector machine in pattern classification: Application to QSAR studies. Quantitative Structure-Activity Relationships 20(3): 227-240.
- Evgeniou T and Pontil M (2001). Support vector machines: theory and applications. In: Paliouras G, Karkaletsis V and Spyropoulos CD (eds). Machine learning and its applications: advanced lectures. New York, Springer: 249-257.
- Trotter MWB, Buxton BF and Holden SB (2001). Support vector machines in combinatorial chemistry. Measurement and Control 34(8): 235-239.
- Vapnik VN (1995). The nature of statistical learning theory. New York, Springer.






















