October 6, 2020
Machine learning and artificial intelligence hold tremendous potential for the future of healthcare: accelerating the development of new and more effective therapies for patients, enabling more personalized treatment, and improving the efficiency of healthcare delivery. That potential will be realized only if the future healthcare workforce is capable of using these tools.
As part of an effort to help build a future workforce that is knowledgeable about machine learning, the Accelerating Therapeutics for Opportunities in Medicine (ATOM) consortium started a new, entirely virtual training experience for pharmacy students.
This summer, instead of completing a traditional hospital or store pharmacy internship, five Butler University Doctor of Pharmacy students delved into the world of data science and machine learning under the mentorship of experts from the ATOM consortium. Through a partnership with the Frederick National Laboratory for Cancer Research (FNL), a founding member of ATOM, each Butler student was paired with an ATOM mentor, worked with real-world, publicly available data, and helped curate datasets to support ATOM’s machine learning-driven drug discovery platform.
This innovative and collaborative training opportunity supports ATOM’s mission to accelerate drug discovery and helps build a future workforce with integrated expertise in data science, machine learning, and drug discovery—a critical need to transition the pharmaceutical and healthcare industry toward an AI-driven approach.
“AI and data science are transforming what is going on in the pharmaceutical industry,” said Eric Stahlberg, director of the FNL Biomedical Informatics and Data Science and ATOM co-lead. “This experience gave the students a sense of what that future will be like.”
Andrea de Souza, Senior Director, Data Sciences and Engineering, Lilly Information and Digital Solutions, who helped connect Butler University with ATOM, noted there is often a mismatch between the skills taught in traditional college curricula and the skills employers need, particularly in the drug discovery domain. She suggested that proficiency in R and Python will soon be expected in the workforce, just as proficiency in Microsoft Word and Excel is today.
The students learned how to apply data science tools to analyze and prepare chemical datasets for ATOM models. The training experience concluded with a virtual seminar on July 24, in which the students presented their projects to the ATOM team.
The students had little prior exposure to machine learning. One of the students, Laura Fischer, said she had no prior experience in computer science but welcomed this unique training opportunity to “improve my critical thinking skills and help me gain a deeper understanding of machine learning technology and how it impacts health care.” It was clear from her presentation that she had met this goal, and everyone attending the seminar (from ATOM, Butler, and the FNL) lauded all the students’ projects and presentations.
The five students and their projects are described below:
Paige Cowden completed the project “Data Curation for a Mitochondrial Membrane Potential Model” under the mentorship of Amanda Paulson, FNL.
Laura Fischer completed the project “Open Data and Model Fitting with AMPL” under the mentorship of Hiran Ranganathan, Lawrence Livermore National Laboratory (LLNL).
Connor Reyd Miller completed the project “Working with Open Datasets” under the mentorship of Yaru Fan, LLNL, and Ben Madej, FNL.
Logan Van Ravenswaay completed the project “Visualize Data: A Python Function to Generate Interactive Plots and Accelerate Exploratory Data Analysis” under the mentorship of Ben Madej, FNL.
Chris Zeheralis completed the project “Open Cancer and Infectious Disease Datasets” under the mentorship of George Zaki, FNL (with support from Ravi Ravichandran, FNL).
For their projects, the trainees all worked with publicly available data, applying their new computer science skills and existing pharmaceutical knowledge to characterize and curate the datasets. Curated datasets are critical for fitting predictive machine learning models for applications such as virtual screening and lead optimization with the ATOM platform.
Or as Connor Miller succinctly stated during his presentation, “More data equals better models.”
“This program reflects the trend that data science approaches are spreading across industries from pharmaceutical research and development to the healthcare industry,” said Ben Madej, a data scientist with FNL and ATOM. “The experiences the interns have had at ATOM will certainly transfer beyond their summer projects.”