MSU Researchers Use Machine Learning to Speed Drug Discovery for Liver Cancer and Lung Disease
Researchers at Michigan State University have demonstrated that machine learning can predict how chemical compounds will affect gene expression, potentially accelerating the search for new treatments. The team trained an algorithm on millions of published experimental measurements to identify promising compounds for hepatocellular carcinoma and idiopathic pulmonary fibrosis.
The work, published in Cell, represents a shift in how drug discovery operates. Instead of screening millions of chemicals against hundreds or thousands of genes manually, the model predicts gene effects based solely on a compound's chemical structure.
How the System Works
The researchers created a "Gene expression profile Predictor on chemical Structures," or GPS, trained on millions of published experimental measurements. The approach mirrors image classification, in which a neural network learns to tell cats from dogs, but applies the idea to biology. For a given compound, the model learns whether it will increase or decrease expression of a specific gene.
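The classification idea can be illustrated with a toy example. The sketch below is not the GPS model: it assumes each compound is encoded as a short list of 0/1 structural features (a hypothetical "fingerprint") and fits a small logistic regression that predicts whether one gene goes up or down. The function names, the four-bit fingerprints, and the labels are all invented for illustration.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_direction_classifier(fingerprints, labels, epochs=500, lr=0.5):
    """Fit a logistic regression by stochastic gradient descent.

    fingerprints: lists of 0/1 structural features (hypothetical encoding).
    labels: 1 if the compound increased the gene's expression, 0 if it decreased it.
    """
    weights = [0.0] * len(fingerprints[0])
    bias = 0.0
    for _ in range(epochs):
        for bits, label in zip(fingerprints, labels):
            p = sigmoid(bias + sum(w * b for w, b in zip(weights, bits)))
            err = p - label  # gradient of the log loss with respect to the score
            bias -= lr * err
            weights = [w - lr * err * b for w, b in zip(weights, bits)]
    return weights, bias


def predict_up_probability(model, bits):
    """Probability that a compound with these features raises expression."""
    weights, bias = model
    return sigmoid(bias + sum(w * b for w, b in zip(weights, bits)))


# Invented toy data: feature 0 tends to go with "up", feature 3 with "down".
TOY_FINGERPRINTS = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
TOY_LABELS = [1, 1, 0, 0]

model = train_direction_classifier(TOY_FINGERPRINTS, TOY_LABELS)
p_up = predict_up_probability(model, [1, 0, 0, 0])
p_down = predict_up_probability(model, [0, 0, 0, 1])
```

In this toy setup, a compound carrying only feature 0 gets a probability above one half of raising expression, and one carrying only feature 3 gets a probability below one half. A real system would work with far richer structural encodings and many genes at once.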
A key challenge was handling messy biological data. Jiayu Zhou, a senior author formerly at MSU and now at the University of Michigan, said the team developed methods to separate strong signals from weak ones. "Biological data are rarely clean," Zhou said. "Imagine trying to learn from a huge pile of examples where some are clear, some are fuzzy, and some may even be misleading."
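One simple way to picture the clear-versus-fuzzy distinction Zhou describes is to give each measurement a confidence weight. The sketch below is an assumption for illustration, not the team's method: it takes hypothetical z-scores for a gene's expression change, drops measurements below an invented "weak" cutoff, and gives the rest a weight that grows with signal strength up to an invented "strong" cap.

```python
def label_with_confidence(z_scores, weak=1.0, strong=3.0):
    """Turn expression-change z-scores into (label, weight) training pairs.

    The cutoffs `weak` and `strong` are illustrative values, not published ones.
    label: 1 for increased expression, 0 for decreased expression.
    weight: between weak/strong and 1.0, higher for clearer signals.
    """
    labeled = []
    for z in z_scores:
        magnitude = abs(z)
        if magnitude < weak:
            continue  # too fuzzy to trust in either direction
        weight = min(magnitude, strong) / strong  # cap so outliers do not dominate
        labeled.append((1 if z > 0 else 0, weight))
    return labeled


# Invented measurements: two clear, two moderate, two too weak to use.
examples = label_with_confidence([4.2, -3.5, 1.5, -0.3, 0.8, -1.2])
```

A training loop could then multiply each example's loss by its weight, so that clear measurements shape the model more than ambiguous ones.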
Testing on Two Diseases
The team selected two diseases with urgent clinical needs. Hepatocellular carcinoma is the third leading cause of cancer-related death worldwide. Idiopathic pulmonary fibrosis, a chronic lung disease, has a median survival of three years after diagnosis.
For liver cancer, researchers tested compounds in mice and identified two new candidates that reduced tumor size. For pulmonary fibrosis, the team found one repurposed drug and two new compounds that showed promise in mouse models and in human lung tissue samples.
Human tissue came from explants provided by Corewell Health's lung transplant program in Grand Rapids, the busiest in Michigan. The program has ample samples because pulmonary fibrosis is the leading reason patients require lung transplants.
The Role of Collaboration
The project involved over 20 researchers across multiple disciplines. Bin Chen, an associate professor at MSU's College of Human Medicine, said the interdisciplinary approach was essential. The team included computer scientists, bench scientists, and clinicians working together.
Edmund Ellsworth, director of MSU's Medicinal Chemistry Facility, emphasized that drug discovery requires diverse expertise. "Drug discovery is a team sport, and not for the faint of heart," Ellsworth said. "It's complicated, all sorts of things happen, and you need the diversity of experts to overcome and be successful."
Reda Girgis, a pulmonologist and medical director of the transplant program, said the study shows the value of clinicians working alongside biologists and computational researchers. "That is really key to advance research," Girgis said.
Next Steps
The compounds identified still require further validation before clinical trials. The team has made its code and a web portal available to other researchers, allowing them to use GPS for virtual compound screening on other diseases.
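The article does not spell out how the portal ranks compounds. As a rough illustration of what virtual screening can look like, the sketch below assumes one common approach: score each candidate by how strongly its predicted gene changes oppose a disease's gene signature. The gene names, compound names, and scoring function are all hypothetical and are not taken from GPS.

```python
def reversal_score(disease_signature, predicted_effects):
    """Higher scores mean the compound pushes more genes against the disease.

    Both arguments map gene name -> +1 (up) or -1 (down).
    Genes with no prediction for the compound contribute nothing.
    """
    score = 0
    for gene, disease_direction in disease_signature.items():
        compound_direction = predicted_effects.get(gene, 0)
        score -= disease_direction * compound_direction
    return score


def rank_compounds(disease_signature, library):
    """Return compound names sorted from most to least promising."""
    return sorted(
        library,
        key=lambda name: reversal_score(disease_signature, library[name]),
        reverse=True,
    )


# Invented disease signature and predicted compound effects.
SIGNATURE = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1}
LIBRARY = {
    "compound_1": {"GENE_A": -1, "GENE_B": -1, "GENE_C": 1},  # opposes all three
    "compound_2": {"GENE_A": 1, "GENE_B": -1},                # mixed effect
    "compound_3": {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1},   # mimics the disease
}

ranking = rank_compounds(SIGNATURE, LIBRARY)
```

Because the predictions come from chemical structure alone, a ranking like this could in principle be run over compounds that have never been tested in a lab, which is where the time savings would come from.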
Chen said the platform could accelerate discovery across multiple conditions. "I want people really to be able to use it to discover new therapeutics," he said.
The research received funding from the National Institutes of Health, the National Science Foundation, and Michigan State University, among other sources.