The performance of QSAR models has traditionally been evaluated in terms of aggregate statistics (sensitivity, specificity, root mean square error (RMSE), R², etc.) computed on a test set. More recently, the observation that models are generally more reliable for compounds similar to those in the training set than for dissimilar ones has led to the concept of an “applicability domain.” A simple binary categorization of a compound as inside or outside a model’s applicability domain is often too coarse for regulatory purposes, however, and often for lead optimization as well. One way to address this shortcoming is to relate the degree of consensus among the individual predictions within an ensemble model to the degree of error expected for the consensus prediction, i.e., to estimate the predictive uncertainty. We have found that overdispersed distributions can be used to do this: beta-binomial distributions for ensemble classification models and gamma distributions for ensemble regression models.
15th BioIT World Conference & Expo, May 23-25, 2017, Needham, MA
By Robert D Clark
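To make the beta-binomial idea in the abstract concrete, here is a minimal Python sketch. It is not the author's implementation. It assumes one plausible scheme: fit a separate beta-binomial distribution to the number of "positive" votes that known positives and known negatives receive from an ensemble of n models, then use Bayes' rule to turn a vote count into a probability that the compound is positive. The function names, the method-of-moments fit, the vote counts, and the 0.5 prior are all illustrative assumptions.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(k positive votes out of n) under a beta-binomial(n, alpha, beta)."""
    return comb(n, k) * exp(log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def fit_beta_binomial_moments(counts, n):
    """Method-of-moments estimate of (alpha, beta) from observed vote counts.

    Only valid when the counts are overdispersed relative to a plain binomial;
    otherwise the estimates are not positive and a ValueError is raised.
    """
    m1 = sum(counts) / len(counts)                 # first sample moment
    m2 = sum(c * c for c in counts) / len(counts)  # second sample moment
    if m1 <= 0:
        raise ValueError("all counts are zero; cannot fit")
    denom = n * (m2 / m1 - m1 - 1) + m1
    if denom <= 0:
        raise ValueError("counts are not overdispersed; moment fit fails")
    alpha = (n * m1 - m2) / denom
    beta = (n - m1) * (n - m2 / m1) / denom
    if alpha <= 0 or beta <= 0:
        raise ValueError("moment estimates are not positive")
    return alpha, beta


def prob_positive(k, n, pos_params, neg_params, prior_pos=0.5):
    """Bayes' rule: P(compound is positive | k of n models voted positive)."""
    like_pos = beta_binomial_pmf(k, n, *pos_params) * prior_pos
    like_neg = beta_binomial_pmf(k, n, *neg_params) * (1.0 - prior_pos)
    return like_pos / (like_pos + like_neg)


# Made-up vote counts (out of 10 ensemble members) for compounds of known class.
N_MODELS = 10
POS_VOTES = [10, 9, 10, 8, 6, 10, 7, 10, 9, 3]
NEG_VOTES = [0, 1, 0, 2, 4, 0, 3, 0, 1, 7]

pos_params = fit_beta_binomial_moments(POS_VOTES, N_MODELS)
neg_params = fit_beta_binomial_moments(NEG_VOTES, N_MODELS)

for k in range(N_MODELS + 1):
    p = prob_positive(k, N_MODELS, pos_params, neg_params)
    print(f"{k:2d}/{N_MODELS} positive votes -> P(positive) = {p:.3f}")
```

In this sketch a unanimous ensemble yields a probability near 0 or 1, while a split vote yields a value near the prior, which gives a graded alternative to a binary inside/outside call. A gamma-based analogue for regression ensembles would need its own assumptions about how the spread of predictions maps to expected error and is not shown here.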