Quantitative Estimation Of Predictive Uncertainty

Conference: QSAR
Software: ADMET Predictor®
Division: Simulations Plus

Introduction

The performance of QSAR models has traditionally been evaluated in terms of aggregate statistics such as sensitivity, specificity, root mean square error (RMSE), and R², computed for a test set. More recently, the observation that models are generally more reliable for compounds similar to those in the training set than for dissimilar ones has led to the concept of an “applicability domain.” A simple binary categorization of a compound as being inside or outside a model’s applicability domain is often too coarse for regulatory purposes, however, and often for lead optimization as well. One way to address this shortcoming is to relate the degree of consensus among the individual predictions within an ensemble model to the degree of error expected for the consensus prediction, i.e., to estimate the predictive uncertainty. We have found that overdispersed distributions can be used to do this: beta binomial distributions for ensemble classification models and gamma distributions for ensemble regression models.

Artificial neural network ensemble (ANNE) classification and regression models were built in ADMET Modeler™ using 2D molecular property descriptors. All networks within an ANNE model share a common set of input descriptors and have the same number of neurons, but each ANN is trained on a different partition of the training pool into training and verification sets (an external test set was held out of training). Each network’s performance on its verification set was monitored to prevent overtraining. Predictions for ANNE classification models were obtained by counting the number of positive network “votes” and comparing that tally to a threshold. A beta binomial F was fitted to the cumulative distribution of errors across the tally k, i.e., the number of networks in the ensemble that cast a positive “vote” for a given prediction. A second beta binomial G was fitted to the cumulative distribution of predictions across the tally of positive votes.
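The vote tallying and beta binomial fitting described above can be illustrated with a minimal Python sketch. It is not the procedure implemented in ADMET Modeler™: the function names, the ">=" threshold rule, and the grid-search least-squares fit to a cumulative distribution are all assumptions made for illustration, and the counts in the usage example are synthetic rather than taken from any model.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, n, a, b):
    """Beta binomial probability of k positive votes out of n networks."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def betabinom_cdf(n, a, b):
    """Cumulative probabilities [P(K <= 0), ..., P(K <= n)]."""
    total, out = 0.0, []
    for k in range(n + 1):
        total += betabinom_pmf(k, n, a, b)
        out.append(total)
    return out


def empirical_cdf(counts):
    """Cumulative fraction of observations at each tally 0..n."""
    total, running, out = sum(counts), 0, []
    for c in counts:
        running += c
        out.append(running / total)
    return out


def fit_betabinom(counts, grid):
    """Pick (a, b) from a grid by least squares on the cumulative curve.

    counts[k] is the number of observations (errors for F, predictions
    for G) whose tally of positive votes equals k. A coarse grid search
    is an illustrative stand-in for a proper optimizer.
    """
    n = len(counts) - 1
    target = empirical_cdf(counts)
    best = None
    for a in grid:
        for b in grid:
            model = betabinom_cdf(n, a, b)
            sse = sum((m - t) ** 2 for m, t in zip(model, target))
            if best is None or sse < best[0]:
                best = (sse, a, b)
    return best[1], best[2]


def classify(votes, threshold):
    """Tally positive votes (1 = positive) and compare to a threshold.

    Assumption: a tally at or above the threshold is called positive.
    """
    k = sum(votes)
    return k, k >= threshold


if __name__ == "__main__":
    # Synthetic counts drawn from a known beta binomial, not real data.
    n_networks = 10
    counts = [round(1e6 * betabinom_pmf(k, n_networks, 2.0, 0.5))
              for k in range(n_networks + 1)]
    print(fit_betabinom(counts, grid=[0.5, 1.0, 2.0, 4.0]))
    print(classify([1, 1, 0, 1, 0], threshold=3))
```

In this sketch the same fitting routine would be applied twice, once to the per-tally counts of erroneous predictions (for F) and once to the per-tally counts of all predictions (for G).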

17th International Conference on QSAR in Environmental & Health Sciences, June 13-17, 2016, Miami Beach, FL

By Robert D. Clark, Marvin Waldman, Robert Fraczkiewicz