In Silico Prediction of Major Drug Clearance Pathways by Support Vector Machines with Feature-Selected Descriptors
We previously established an in silico classification method (“CPathPred”) that predicts the major clearance pathways of drugs by an empirical decision based on only four physicochemical descriptors (charge, molecular weight, octanol-water distribution coefficient, and protein unbound fraction in plasma) using a rectangular method. In this study, we attempted to improve the prediction performance of the method by introducing a support vector machine (SVM) and increasing the number of descriptors. The data set consisted of 141 approved drugs whose major clearance pathways were classified into metabolism by CYP3A4, CYP2C9, or CYP2D6; organic anion transporting polypeptide-mediated hepatic uptake; or renal excretion. With the same four default descriptors as used in CPathPred, the SVM-based predictor (named “default descriptor SVM”) achieved higher prediction performance than the rectangular-based predictor, as judged by 10-fold cross-validation. Two further SVM-based predictors were established by adding descriptors as follows: 1) 881 descriptors predicted in silico from the chemical structures of the drugs, in addition to the 4 default descriptors (“885 descriptor SVM”); and 2) descriptors selected by a greedy feature selection algorithm, together with the default descriptors (“feature selection SVM”). The prediction accuracies of the rectangular-based predictor, default descriptor SVM, 885 descriptor SVM, and feature selection SVM were 0.49, 0.60, 0.72, and 0.91, respectively, and the overall precision values for these four methods were 0.72, 0.77, 0.86, and 0.98, respectively. In conclusion, we successfully constructed SVM-based predictors with limited numbers of descriptors that classify the major clearance pathways of drugs in humans with high prediction performance.
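As an illustration only, a greedy forward feature selection scored by k-fold cross-validation, of the general kind named above, might be sketched as follows. The abstract does not specify the details of the greedy algorithm, so the stopping rule (stop when no candidate improves cross-validated accuracy) is an assumption. To keep the sketch dependency-free, a nearest-centroid classifier stands in for the SVM; it is not the classifier used in the study. All function names and the toy data are hypothetical.

```python
from statistics import mean


def nearest_centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (NOT an SVM): pick the class with the closest centroid."""
    groups = {}
    for row, label in zip(train_X, train_y):
        groups.setdefault(label, []).append(row)
    centroids = {lab: [mean(col) for col in zip(*rows)] for lab, rows in groups.items()}

    def closest(row):
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])),
        )

    return [closest(row) for row in test_X]


def cv_accuracy(X, y, columns, k=10):
    """k-fold cross-validated accuracy using only the given descriptor columns."""
    sub = [[row[c] for c in columns] for row in X]
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(len(sub)) if i % k == fold]
        train_idx = [i for i in range(len(sub)) if i % k != fold]
        preds = nearest_centroid_predict(
            [sub[i] for i in train_idx],
            [y[i] for i in train_idx],
            [sub[i] for i in test_idx],
        )
        correct += sum(p == y[i] for p, i in zip(preds, test_idx))
    return correct / len(sub)


def greedy_select(X, y, default_cols, candidate_cols, k=10):
    """Start from the default descriptors; repeatedly add the single candidate
    that most improves cross-validated accuracy, until none improves it."""
    selected = list(default_cols)
    best = cv_accuracy(X, y, selected, k)
    remaining = [c for c in candidate_cols if c not in selected]
    while remaining:
        scored = [(cv_accuracy(X, y, selected + [c], k), c) for c in remaining]
        score, col = max(scored, key=lambda t: t[0])
        if score <= best:  # assumed stopping rule: no candidate helps
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best


# Toy data (hypothetical): column 0 is a weak "default" descriptor,
# column 1 separates the two classes, column 2 is noise.
X = [[(i * 7) % 5, 0.0 if i % 2 == 0 else 10.0, (i * 3) % 4] for i in range(20)]
y = ["renal" if i % 2 == 0 else "CYP3A4" for i in range(20)]
selected, best = greedy_select(X, y, default_cols=[0], candidate_cols=[1, 2])
print(selected, best)
```

In practice, the stand-in classifier would be replaced by an SVM from a machine learning library, and the candidate columns would be the 881 structure-derived descriptors.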