Rethinking Scientific Workflows

Developing a disease-drug model is an intensely creative and collaborative effort. It requires the ability to assemble available knowledge and data and to gain a collective appreciation of the important relationships. As this collaborative synthesis gets underway, pharmacometricians are charged with translating the ideas and hypotheses about disease biology and drug pharmacology into mathematical equations. The equations are then coded into the control streams that, along with the data, become the basis for investigating the feasibility of various hypotheses.

The investigative process has long been the workaday challenge for modelers. Strategies for organizing the various control streams, documenting the sequence of events that led to an insight, and accessing the computer processing power required to fit models to data are as varied as the modelers themselves.

Pharmacometrics is among the many scientific disciplines that have benefited from Moore’s Law. The exponential increase in processing power, together with declining costs, has made it possible to work with moderately complex disease-drug models and large datasets in a reasonable time frame. But access to processing power is only one component of a successful modeling program. In fact, the more complex the models and the more challenging the datasets, the harder it becomes to sustain and evolve a successful pharmacometrics organization within the R&D enterprise.

While we now have access to adequate computing power for most applications, we lack the tools required to properly manage the investigative process. We need tools to organize the results of the numerous model configurations, to sort through those results, and to compare alternative formulations of the model, so that we can determine reasonable next steps and decide when a particular model or group of models is adequate for the task at hand. This is where the idea of “workflow” comes to the forefront.

Briefly, workflows capture the data transformations and analysis steps, as well as the mechanisms used to carry them out, over the course of an analysis. The representation of the workflow contains the many details required to carry out each analysis step, including the data flows and the use of specific execution and storage resources in distributed environments. If properly constructed, this explicit representation of the computational processes allows the analysis to be better managed and automated. Importantly, workflows must also capture the provenance information necessary for scientific reproducibility, result publication, and result sharing among collaborators, both internal and external to the organization.
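
To make the idea concrete, here is a minimal sketch in Python of a workflow that records provenance for each step as it runs. It is an illustration only, not a description of any particular product: the `Workflow` class, the `fingerprint` helper, the step functions, and the small dataset are all hypothetical names and values invented for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List


def fingerprint(obj) -> str:
    """Return a short, stable hash of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class Workflow:
    """Runs named steps one at a time and logs provenance for each."""
    provenance: List[dict] = field(default_factory=list)

    def run_step(self, name: str, func: Callable, data):
        result = func(data)
        # Record what ran, on which input, and what it produced.
        self.provenance.append({
            "step": name,
            "input_hash": fingerprint(data),
            "output_hash": fingerprint(result),
            "finished_at": datetime.now(timezone.utc).isoformat(),
        })
        return result


# Hypothetical data transformations for a concentration dataset.
def drop_missing(records):
    return [r for r in records if r["conc"] is not None]


def add_dose_normalized(records):
    return [{**r, "conc_per_mg": r["conc"] / r["dose_mg"]} for r in records]


# Illustrative values only, not real study data.
raw = [
    {"id": 1, "dose_mg": 100, "conc": 2.5},
    {"id": 2, "dose_mg": 100, "conc": None},
    {"id": 3, "dose_mg": 50, "conc": 1.0},
]

wf = Workflow()
clean = wf.run_step("drop_missing", drop_missing, raw)
final = wf.run_step("add_dose_normalized", add_dose_normalized, clean)

for entry in wf.provenance:
    print(entry["step"], entry["input_hash"], "->", entry["output_hash"])
```

Because each step's output hash matches the next step's input hash, the log forms a verifiable chain from raw data to final dataset. A production system would also need to capture software versions, parameters, and execution resources, which this sketch leaves out.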

The implementation requirements for scientific workflows in pharmacometrics have evolved slowly, driven by growing model complexity and by the need to communicate results effectively to regulatory agencies. In the past, the submission of control streams and results for critical modeling steps was adequate. Now, regulators are demanding a greater level of detail, including comprehensive run records that trace the path from initial model formulation to final model specification.
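
One simple way to picture such a run record is as a set of entries in which each model run points back to the run it was derived from. The Python sketch below assumes that structure; the `RunRecord` fields, the run identifiers, the model descriptions, and the objective values are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RunRecord:
    run_id: str
    parent_id: Optional[str]   # run this model was derived from, if any
    description: str
    objective_value: float     # illustrative fit statistic, lower is better


def path_to(run_id: str, records: Dict[str, RunRecord]) -> List[RunRecord]:
    """Trace the lineage from the initial model to the given run."""
    path = []
    current = run_id
    while current is not None:
        record = records[current]
        path.append(record)
        current = record.parent_id
    path.reverse()  # oldest ancestor first
    return path


# Hypothetical model-development history, including one dead end (run003).
history = [
    RunRecord("run001", None, "one-compartment base model", 1520.4),
    RunRecord("run002", "run001", "two-compartment model", 1478.9),
    RunRecord("run003", "run001", "one-compartment, age on clearance", 1519.8),
    RunRecord("run004", "run002", "two-compartment, weight on clearance", 1451.2),
]
records = {r.run_id: r for r in history}

lineage = path_to("run004", records)
for r in lineage:
    print(f"{r.run_id}: {r.description} (objective {r.objective_value})")
```

Tracing from the final model backward yields only the runs that led to it, while the full history still preserves the alternatives that were tried and set aside.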

At Cognigen, we work on 30 to 40 different modeling projects each year, and the challenges of managing these projects have led to KIWI, a comprehensive, validated platform to efficiently and consistently organize, process, evaluate, and communicate the results of pharmacometric analyses. We systematically examined the requirements, constraints, and standard practices across our pharmacometrics, data management, administrative support, and IT departments in order to build critical functionality into an elegant framework that supports the emerging complexity of scientific workflows in pharmacometric data analysis. For more information on KIWI, or for KIWI licensing information, contact us.