Quantitative Estimation Of Predictive Uncertainty

Introduction

The performance of QSAR models has traditionally been evaluated in terms of aggregate statistics (sensitivity, specificity, root mean square error (RMSE), R², etc.) computed on a test set. More recently, the observation that models are generally more reliable for compounds similar to those in the training set than for dissimilar ones has led to the concept of “applicability domain.” A simple binary categorization of a compound as being inside or outside a model’s applicability domain is often too coarse for regulatory purposes, however, and often for lead optimization as well. One way to address this shortcoming is to relate the degree of consensus among the individual predictions within an ensemble model to the degree of error expected for the consensus prediction, i.e., to estimate the predictive uncertainty. We have found that overdispersed distributions can be used to do this: beta-binomial distributions for ensemble classification models and gamma distributions for ensemble regression models.

Artificial neural network ensemble (ANNE) classification and regression models were built in ADMET Modeler™ using 2D molecular property descriptors. All networks within an ANNE model share a common set of input descriptors and have the same number of neurons, but each network is trained on a separate partition of the training pool into training and verification sets (an external test set was held out of training). Each network’s performance on its verification set was monitored to prevent overtraining. Predictions for ANNE classification models were obtained by counting the number of positive network “votes” and comparing that vote tally to a threshold. A beta-binomial distribution F was fitted to the cumulative distribution of errors across the tally k, the number of networks in the ensemble that cast a positive “vote” for a given prediction. A second beta-binomial distribution G was fitted to the cumulative distribution of predictions across the tally of positive votes.
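
The vote tally and the beta-binomial fit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in ADMET Modeler™: the function names, the 0.5 output cutoff, the least-squares criterion, the grid search over shape parameters, and all numbers in the usage lines are hypothetical, since the abstract does not say how the distributions were fitted.

```python
import math


def tally_votes(network_outputs, cutoff=0.5):
    """Count the networks whose output exceeds `cutoff` (a positive vote)."""
    return sum(1 for y in network_outputs if y > cutoff)


def classify(k, vote_threshold):
    """Call the compound positive when the tally k reaches the threshold."""
    return k >= vote_threshold


def _log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) for a beta-binomial with n trials and shapes alpha, beta."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return math.exp(log_choose + _log_beta(k + alpha, n - k + beta)
                    - _log_beta(alpha, beta))


def beta_binomial_cdf(n, alpha, beta):
    """Cumulative probabilities P(K <= k) for k = 0..n."""
    total, cdf = 0.0, []
    for k in range(n + 1):
        total += beta_binomial_pmf(k, n, alpha, beta)
        cdf.append(total)
    return cdf


def empirical_cdf(counts):
    """Cumulative fraction of observations at or below each tally k."""
    grand_total = float(sum(counts))
    running, cdf = 0.0, []
    for c in counts:
        running += c
        cdf.append(running / grand_total)
    return cdf


def fit_beta_binomial(counts, shape_grid):
    """Grid search for (alpha, beta) minimizing the squared CDF error.

    Assumed fitting criterion; counts[k] is the number of errors (or of
    predictions) observed at tally k, for k = 0..n.
    """
    n = len(counts) - 1
    target = empirical_cdf(counts)
    best = None
    for alpha in shape_grid:
        for beta in shape_grid:
            model = beta_binomial_cdf(n, alpha, beta)
            sse = sum((m - t) ** 2 for m, t in zip(model, target))
            if best is None or sse < best[0]:
                best = (sse, alpha, beta)
    return best[1], best[2]


if __name__ == "__main__":
    outputs = [0.9, 0.2, 0.7, 0.6, 0.4]      # made-up network outputs
    k = tally_votes(outputs)
    print(k, classify(k, vote_threshold=3))
    errors_by_tally = [1, 3, 6, 8, 6, 2]     # made-up error counts, k = 0..5
    print(fit_beta_binomial(errors_by_tally, [0.5, 1.0, 2.0, 4.0, 8.0]))
```

The same `fit_beta_binomial` call would serve for both F (pass error counts per tally) and G (pass prediction counts per tally). A coarse grid keeps the sketch dependency-free; a numerical optimizer would be the more usual choice in practice.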

17th International Conference on QSAR in Environmental & Health Sciences, June 13-17, 2016, Miami Beach, FL

By Robert D. Clark, Marvin Waldman, Robert Fraczkiewicz
