Penn Engineers Solve Hard Math Problems With Smaller, Faster AI Model
Researchers at the University of Pennsylvania developed a method called mollifier layers that makes artificial intelligence systems better at solving inverse partial differential equations (PDEs), a class of problems where scientists work backward from observations to find hidden causes.
The work, published in Transactions on Machine Learning Research and set to be presented at NeurIPS 2026, takes a different approach than the current trend toward larger models and more computing power. Instead, the team reworked a mathematical idea from the 1940s for use in physics-informed machine learning.
The Core Problem
Inverse PDEs are everywhere in science. Researchers in weather systems, biology, and materials science often have the visible result (a shifting pattern, a temperature field, a cellular structure) but not the hidden rules that produced it.
Vivek Shenoy, a materials science professor at Penn, described the challenge simply: "Solving an inverse problem is like looking at ripples in a pond and working backward to figure out where the pebble fell. You can see the effects clearly, but the real challenge is inferring the hidden cause."
Conventional AI systems handle these problems by calculating derivatives through recursive automatic differentiation, a process that repeatedly traces how values change through a neural network. This works until the equations involve higher-order derivatives and noisy data. Then the system becomes memory-hungry, slow, and unstable.
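The cost of repeated differentiation can be seen even in a toy setting. The sketch below is a minimal forward-mode automatic differentiation built on dual numbers, a hypothetical illustration rather than the networks or tooling used in the paper: each extra derivative order nests the bookkeeping one level deeper, hinting at why higher-order derivatives through a large network grow expensive.

```python
# Toy forward-mode automatic differentiation with dual numbers.
# Nesting Dual inside Dual yields higher-order derivatives, but every
# extra order deepens the bookkeeping; at network scale this repeated
# tracing is what drives up memory and training time.
class Dual:
    def __init__(self, val, eps=0.0):
        self.val, self.eps = val, eps

    def _wrap(self, o):
        return o if isinstance(o, Dual) else Dual(o)

    def __add__(self, o):
        o = self._wrap(o)
        return Dual(self.val + o.val, self.eps + o.eps)

    __radd__ = __add__

    def __mul__(self, o):
        o = self._wrap(o)
        # Product rule, carried alongside the value.
        return Dual(self.val * o.val, self.val * o.eps + self.eps * o.val)

    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x with a unit perturbation and read off df/dx."""
    return f(Dual(x, 1.0)).eps

cube = lambda x: x * x * x

first = derivative(cube, 3.0)                            # 3x^2 at x=3 -> 27.0
second = derivative(lambda x: derivative(cube, x), 3.0)  # 6x at x=3 -> 18.0
```

Computing the second derivative already requires running the whole evaluation inside another evaluation; each further order multiplies that nesting.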
In one test using physics-informed neural networks on a fourth-order reaction-diffusion problem, peak memory use climbed from 0.21 gigabytes to 2.70 gigabytes. Training time also ballooned. Accuracy suffered too: in another benchmark, a standard model recovered a mathematical derivative with a correlation of only 0.21 against the correct answer.
A Smoother Approach
The Penn team traced the bottleneck not to the neural network design itself, but to automatic differentiation. Their solution uses mollifiers, a mathematical smoothing technique described in the 1940s by mathematician Kurt Otto Friedrichs.
Instead of taking unstable derivatives directly from the network's output, the mollifier layer smooths the signal first, then calculates derivatives through fixed mathematical operations. This shifts the hardest differentiation work away from repeated gradient calculations through the full network.
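The smooth-then-differentiate idea can be illustrated with a short NumPy sketch. This is a minimal example under assumed parameters, not the paper's implementation: a noisy signal is convolved with a Gaussian kernel acting as the mollifier, and its derivative comes from convolving with the kernel's analytic derivative, using the identity that the derivative of a smoothed signal equals the signal convolved with the kernel's derivative. Both steps are fixed linear operations, so no gradients flow through a network.

```python
import numpy as np

# Noisy samples of f(x) = sin(x) on a uniform grid.
x = np.linspace(0.0, 2.0 * np.pi, 400)
dx = x[1] - x[0]
rng = np.random.default_rng(0)
noisy = np.sin(x) + 0.05 * rng.standard_normal(x.size)

# Gaussian mollifier kernel and its analytic derivative. The width eps
# trades noise suppression against loss of fine detail.
eps = 0.15
half = int(4 * eps / dx)
k = np.arange(-half, half + 1) * dx   # symmetric support around zero
phi = np.exp(-k**2 / (2 * eps**2))
phi /= phi.sum() * dx                 # normalise so the kernel integrates to 1
dphi = -k / eps**2 * phi              # d(phi)/dk, known in closed form

# Smoothing and differentiation are both fixed convolutions:
# (f * phi)' = f * phi', so differentiation never touches the raw noise.
smooth = np.convolve(noisy, phi, mode="same") * dx
deriv = np.convolve(noisy, dphi, mode="same") * dx

# Compare with the exact derivative cos(x), away from the boundaries,
# where convolution-based methods are weakest.
inner = slice(60, -60)
corr = np.corrcoef(deriv[inner], np.cos(x)[inner])[0, 1]
```

Away from the boundaries, the recovered derivative correlates closely with the true cos(x), even though a direct finite difference of the noisy samples would be dominated by noise.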
Vinayak Vinayak, a doctoral candidate and co-author, said: "Modern AI often advances by scaling up computation. But some scientific challenges require better mathematics, not just more compute."
What the Tests Showed
The team tested mollifier layers on three types of problems: a first-order equation, a second-order heat equation, and a fourth-order reaction-diffusion system. The hardest case-fourth-order derivatives-is where conventional methods typically struggle most.
For the first-order Langevin equation, the mollified model achieved a temporal correlation of 0.97, compared with 0.36 for a standard model, while using less memory (0.16 gigabytes versus 0.21) and less training time (1,615 seconds versus 2,138).
The gap widened in harder systems. For the second-order heat equation, the mollified model reached a spatial correlation of 0.99, while the standard model came in at 0.21. Peak memory dropped from 1.20 gigabytes to 0.24 gigabytes.
In the fourth-order reaction-diffusion benchmark, the mollified model cut training time from 3,386 seconds to 335 seconds and reduced peak memory from 2.75 gigabytes to 0.23 gigabytes. The accuracy for inferring hidden parameters rose from 0.44 to 0.99.
Across their experiments, mollifier layers reduced memory footprint and training time by factors of 6 to 10.
Application to Cell Biology
For Shenoy's lab, the biological payoff comes from chromatin-the mix of DNA and proteins that packages chromosomes inside the cell nucleus.
Tiny chromatin domains, about 100 nanometers in size, help regulate access to genetic material. That matters because accessibility influences gene expression, and gene expression shapes cell identity, function, aging, and disease.
The team had been studying how chemical reactions and physical interactions organize chromatin structure. What they needed was a more reliable way to infer the reaction rates behind those changes from what could actually be observed under a microscope.
Using mollifier layers, the researchers recovered spatially varying reaction rates with high accuracy from noisy microscope images, including super-resolution STORM images of human cell nuclei, capturing the underlying mathematical properties more faithfully than standard methods.
Vinayak said: "If we can track how these reaction rates evolve during aging, cancer or development, this creates the potential for new therapies. If reaction rates control chromatin organization and cell fate, then altering those rates could redirect cells to desired states."
Where Else This Could Help
Inverse PDEs appear in fields far beyond biology. Materials science, fluid mechanics, genetics, and weather modeling all involve estimating hidden quantities (diffusivity, conductance, reaction rates) from sparse or noisy measurements.
The Penn framework could make stable, efficient inference practical in any of these areas. The researchers also suggest the same principle might extend to forward models, operator learning, and neural ODE systems, all areas where accurate gradients matter.
Known Limitations
The method's performance depends on choosing the right smoothing kernel, which must balance noise suppression against the risk of losing important details. The current implementation also has weaknesses near boundaries and on certain types of grids.
The researchers say future work should explore adaptive kernels, boundary-aware formulations, and validation strategies for more complex grids. They are clear that the mathematical improvement helps, but does not eliminate the need for careful tuning on real datasets.
For researchers working with difficult scientific data, the practical benefit is clear: a way to extract hidden rules with fewer failures when data is noisy or derivatives are hard to compute. As Shenoy said, "If you understand the rules that govern a system, you now have the possibility of changing it."