Performance evaluation of a QSAR/qSAR model
One of the objectives of QSAR/qSAR modeling is to predict the activities of compounds that have not been biologically tested. It is therefore important to determine how well a developed QSAR/qSAR model predicts the activities of compounds that are not present in the training set. Two methods are commonly used to determine the predictive capability of a QSAR/qSAR model (Wold et al. 1995): cross-validation and the use of an independent validation set. Cross-validation includes leave-one-out (LOO) and k-fold cross-validation. In LOO, one compound is left out of the training set and the remaining compounds are used to train the machine learning method. The derived QSAR/qSAR model is then used to predict the activity of the left-out compound. This process is repeated until every compound in the training set has been left out once. In k-fold cross-validation, the training set is randomly divided into k mutually exclusive subsets of approximately equal size. Of these, k - 1 subsets are combined to form a modeling training set for developing a QSAR/qSAR model, and the remaining subset is used as a modeling testing set to assess the predictive capability of that model. This process is repeated until k QSAR/qSAR models have been developed and each subset has been used as a modeling testing set once. LOO is thus the special case of k-fold cross-validation in which k equals the number of compounds in the training set.
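The cross-validation procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the single-descriptor least-squares model (`fit_line`), the toy data, and all function names are assumptions chosen to keep the example self-contained. The `q2` function computes the cross-validated coefficient of determination in its common form, 1 - PRESS/SS, which is also an assumption, since the text above does not define a specific statistic.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly split indices 0..n-1 into k mutually exclusive folds
    whose sizes differ by at most one."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Hypothetical stand-in model: ordinary least squares on one descriptor.
    Returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def cross_validate(xs, ys, k, seed=0):
    """Return one out-of-fold prediction per compound.
    Setting k = len(xs) gives leave-one-out."""
    n = len(xs)
    preds = [None] * n
    for fold in k_fold_indices(n, k, seed):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in fold:
            preds[i] = slope * xs[i] + intercept
    return preds

def q2(ys, preds):
    """Cross-validated R^2: 1 - PRESS / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / total

# Toy data (hypothetical): activity is an exact linear function of one descriptor.
descriptor = [float(i) for i in range(1, 11)]
activity = [2.0 * x + 1.0 for x in descriptor]

loo_preds = cross_validate(descriptor, activity, k=len(descriptor))
fivefold_preds = cross_validate(descriptor, activity, k=5)
```

Because every compound is predicted by a model that never saw it, `q2(activity, loo_preds)` summarizes performance on left-out compounds only. Real data sets would use many descriptors and a proper machine learning method in place of `fit_line`.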
Several studies have reported a lack of correlation between cross-validation results and the prediction capability of a QSAR/qSAR model (Golbraikh et al. 2002; Kozak et al. 2003; Reunanen 2003; Olsson et al. 2004). Moreover, cross-validation methods tend to underestimate the prediction capability of a QSAR/qSAR model, especially if important molecular features are present in only a minority of the compounds in the training set (Mosier et al. 2002; Hawkins et al. 2004). Thus a model with poor cross-validation results can still be quite predictive (Mosier et al. 2002). This has led some studies to suggest that an independent validation set may provide a more reliable estimate of the prediction capability of a QSAR/qSAR model (Wold et al. 1995; Golbraikh et al. 2002). Despite these disadvantages, cross-validation methods are still useful for assessing QSAR/qSAR models during optimization of the parameters of machine learning methods and during descriptor selection.
A validation set should ideally be obtained independently of the training set. However, validation sets are usually constructed by using statistical molecular design because of the limited availability of high-quality activity data. Regardless of the method used to obtain a validation set, a good validation set should be representative of the training set so that it can properly assess the prediction capabilities of the QSAR/qSAR model (Tropsha et al. 2003).
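One very simple way to get a first impression of whether a validation set is representative of the training set is to check that each validation compound lies within the descriptor ranges spanned by the training compounds. The sketch below illustrates only that idea. It is an assumption made for illustration, not the procedure of Tropsha et al. (2003) and not a statistical molecular design method, and the function names and descriptor values are hypothetical.

```python
def descriptor_ranges(train):
    """Per-descriptor (min, max) over the training compounds.
    `train` is a list of rows, one row of descriptor values per compound."""
    columns = list(zip(*train))
    return [(min(col), max(col)) for col in columns]

def outside_training_range(train, valid):
    """Indices of validation compounds that have at least one descriptor
    value outside the range covered by the training set."""
    ranges = descriptor_ranges(train)
    flagged = []
    for i, row in enumerate(valid):
        if any(v < lo or v > hi for v, (lo, hi) in zip(row, ranges)):
            flagged.append(i)
    return flagged

# Hypothetical descriptor values for three training and three validation compounds.
train = [[1.0, 0.2], [2.0, 0.5], [3.0, 0.9]]
valid = [[1.5, 0.4], [4.0, 0.3], [2.5, 1.2]]

flagged = outside_training_range(train, valid)
```

Here the second and third validation compounds are flagged, because one of their descriptor values falls outside the training range. A range check of this kind is only a coarse filter: compounds can sit inside every individual range and still be far from all training compounds in descriptor space.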
References
- Golbraikh A and Tropsha A (2002). Beware of q2! Journal of Molecular Graphics and Modelling 20(4): 269-276.
- Hawkins DM, Basak SC and Mills D (2004). Assessing model fit by cross-validation. Journal of Chemical Information and Computer Sciences 43(2): 579-586.
- Kozak A and Kozak R (2003). Does cross validation provide additional information in the evaluation of regression models? Canadian Journal of Forest Research 33(6): 976-987.
- Mosier PD and Jurs PC (2002). QSAR/QSPR studies using probabilistic neural networks and generalized regression neural networks. Journal of Chemical Information and Computer Sciences 42(6): 1460-1470.
- Olsson I-M, Gottfries J and Wold S (2004). D-optimal onion designs in statistical molecular design. Chemometrics and Intelligent Laboratory Systems 73(1): 37-46.
- Reunanen J (2003). Overfitting in making comparisons between variable selection methods. Journal of Machine Learning Research 3: 1371-1382.
- Tropsha A, Gramatica P and Gombar VK (2003). The importance of being earnest: Validation is the absolute essential for successful application and interpretation of QSPR models. QSAR & Combinatorial Science 22(1): 69-77.
- Wold S and Eriksson L (1995). Statistical validation of QSAR results. Chemometric methods in molecular design. van de Waterbeemd H. Weinheim; New York; Basel; Cambridge; Tokyo, VCH: 309-318.