Results of a new classification algorithm combining K nearest neighbors and recursive partitioning

Authors: Miller DW
Publication: J Chem Inf Model
Software: MedChem Studio™

Abstract

We present results of a new computational learning algorithm that combines favorable elements of two well-known techniques: K nearest neighbors and recursive partitioning. Like K nearest neighbors, the method provides an independent prediction for each test sample under consideration; like recursive partitioning, it automatically selects the important input variables for model construction. The new method is applied to the problem of classifying a set of chemical data samples designated as either active or inactive in a biological screen. Training is performed at varying levels of intrinsic model complexity, and classification performance is compared to that of both K nearest neighbors and recursive partitioning models trained using the identical protocol. We find that, under cross-validation, the new method outperforms both of these standard techniques over a considerable range of user parameters. We discuss advantages and drawbacks of the new method, with particular emphasis on its parameter robustness, required training time, and performance with respect to chemical structural class.
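
The evaluation protocol mentioned in the abstract, cross-validated classification at several levels of model complexity, can be illustrated with a minimal sketch. It is not the new method, whose details the abstract does not give; it only shows a plain K nearest neighbors baseline scored by k-fold cross-validation at several values of K. The function names, the toy two-descriptor data set, and the parameter values are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training samples closest to x.

    Closeness is squared Euclidean distance over all descriptors.
    """
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_validated_accuracy(X, y, k, n_folds=5, seed=0):
    """Fraction of samples classified correctly when each is held out once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into n_folds disjoint folds.
    folds = [idx[f::n_folds] for f in range(n_folds)]
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
    return correct / len(X)


def make_toy_data(n_per_class=30, seed=1):
    """Hypothetical data: one informative descriptor, one pure-noise descriptor."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in (("inactive", 0.0), ("active", 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(0.0, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    # K acts as the complexity setting: small K gives a more flexible model.
    for k in (1, 3, 5, 9):
        print(k, round(cross_validated_accuracy(X, y, k), 3))
```

Fixing the shuffle seed keeps the fold assignment identical across values of K, so differences in accuracy reflect the complexity setting and not the split. A recursive partitioning baseline would be scored through the same loop to keep the protocol identical for both models.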