When medicinal chemists need to improve oral bioavailability (%F) during lead optimization, they systematically modify compound properties, relying mainly on their own experience and general rules of thumb. However, at least a dozen properties can influence %F, and the difficulty of multi-parameter optimization for such complex non-linear processes grows combinatorially with the number of variables. Furthermore, strategies can conflict. For example, adding a polar or charged group will generally increase solubility but decrease permeability. Identifying the two or three properties that most influence %F for a given compound series would make %F optimization much more efficient. We previously reported an adaptation of physiologically-based pharmacokinetic (PBPK) simulations that predicts %F for a lead series from purely computational inputs to within a 2-fold average error. Here, we run thousands of such simulations to generate a comprehensive “bioavailability landscape” for the series. A key innovation was the recognition that the large and variable number of pKas in drug molecules could be replaced by just the two straddling the isoelectric point. Another was the use of the ZINC database to cull chemically inaccessible regions of property space. A quadratic partial least squares (PLS) regression accurately fits a continuous surface to these thousands of bioavailability predictions. The PLS coefficients indicate which compound properties are globally sensitive. The PLS surface also displays the %F landscape in these sensitive properties locally around compounds of particular interest. Finally, because it is quick to calculate, the PLS equation can be combined with models for activity and other properties for multi-objective lead optimization.
By Pankaj R Daga, Michael B Bolger, Robert D Clark
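The reduction of a molecule's pKas to the two straddling the isoelectric point can be illustrated with a minimal sketch. The abstract does not give the authors' procedure, so everything here is an assumption: ionizable sites are treated as independent Henderson-Hasselbalch sites, the isoelectric point is found by bisection on the net charge, and the function names (`net_charge`, `isoelectric_point`, `straddling_pkas`) are hypothetical. The example values are approximate textbook pKas for lysine, used only for illustration.

```python
def net_charge(ph, acidic_pkas, basic_pkas):
    """Net charge at a given pH, assuming independent ionizable sites."""
    # Protonated bases carry +1; deprotonated acids carry -1.
    positive = sum(1.0 / (1.0 + 10.0 ** (ph - pka)) for pka in basic_pkas)
    negative = sum(1.0 / (1.0 + 10.0 ** (pka - ph)) for pka in acidic_pkas)
    return positive - negative


def isoelectric_point(acidic_pkas, basic_pkas, lo=-5.0, hi=20.0, tol=1e-6):
    """pH at which the net charge is zero, located by bisection.

    The net charge decreases monotonically with pH, so a single root
    exists whenever the molecule has at least one acid and one base.
    """
    if not acidic_pkas or not basic_pkas:
        raise ValueError("need at least one acidic and one basic pKa")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_charge(mid, acidic_pkas, basic_pkas) > 0.0:
            lo = mid  # still net positive: the root lies at higher pH
        else:
            hi = mid
    return 0.5 * (lo + hi)


def straddling_pkas(acidic_pkas, basic_pkas):
    """Return (pI, nearest pKa at or below pI, nearest pKa above pI).

    Either neighbour is None if no pKa lies on that side of the pI.
    """
    pi = isoelectric_point(acidic_pkas, basic_pkas)
    all_pkas = list(acidic_pkas) + list(basic_pkas)
    below = max((p for p in all_pkas if p <= pi), default=None)
    above = min((p for p in all_pkas if p > pi), default=None)
    return pi, below, above


# Illustrative input: one acidic and two basic sites (lysine-like values).
pi, below, above = straddling_pkas([2.18], [8.95, 10.53])
print(f"pI = {pi:.2f}, straddling pKas = ({below}, {above})")
```

Under these assumptions, a molecule with any number of ionizable groups is summarized by two numbers, which keeps the dimensionality of the simulated property space fixed across a compound series.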