AI That Learns Physics, Not Words, Jumps From Galaxies to Fluids

Polymathic AI's Walrus (fluids) and AION-1 (astronomy) learn physics from real data, so they generalize across fields. They boost sparse-data studies and speed up research.

Published on: Dec 10, 2025

AI Trained on Physics: Walrus and AION-1 Are Pushing Discovery Across Fields

Most AI models learn from text or photos. A new wave is learning from physics itself - and it's already paying off for research.

Scientists in the Polymathic AI collaboration introduced two foundation models trained on real scientific datasets: Walrus for fluidlike systems and AION-1 for astronomy. Because they learn general physical behavior, they can transfer what they learn in one area to make sense of another - from exploding stars to Wi-Fi signals to bacterial motion.

Why physics-trained foundation models matter

Unlike task-specific models, foundation models learn broad physical patterns across many datasets and experiments. That lets them generalize, speed up workflows, and hold up better when data is scarce or noisy.

  • Stronger performance in low-data and rare-event regimes
  • Faster turnaround from raw data to usable results
  • Transfer of physical knowledge across fields and instruments

"Maybe you have new physics in your scenario that your field isn't used to handling," says Walrus lead developer Michael McCabe, a research scientist at Polymathic AI. "Our hope is that training on these broader classes makes something that is both easier to use and has a better chance of generalizing."

AION-1: Foundation model for astronomy

AION-1 is trained on massive survey data: the Legacy Survey, HSC, SDSS, DESI and Gaia - over 200 million observations and roughly 100 TB. It learns from images, spectra and auxiliary measurements across stars, quasars and galaxies.

In practice, that means when you have a low-resolution image of a galaxy, AION-1 can infer far more about it by drawing on physics learned from millions of other galaxies. Liam Parker, a Ph.D. student at UC Berkeley and a lead researcher on AION-1, notes the approach is particularly helpful when data are sparse or rare events dominate.

Walrus: Fluids and fluidlike systems

Walrus targets fluids and systems that behave like fluids. It's trained on The Well - a 15 TB collection spanning 19 scenarios and 63 fields in fluid dynamics, with parameters such as density, velocity and pressure.

Use cases range widely: merging neutron stars, acoustic waves, stratified layers in Earth's atmosphere. The same learned physics lets Walrus model signals and biological flows too. As Polymathic AI's principal investigator Shirley Ho puts it, "It's like seeing many, many humans… when you meet a new friend, because you've met so many people before now, you are able to map… what this human is going to be like compared to all your friends before."

What this means for your research

With these models, you don't start from zero. You begin with a physics-aware embedding that preserves structure across instruments, resolutions and noise levels.

  • Start with a pretrained model instead of building a pipeline from scratch
  • Fine-tune on your instrument or simulation specifics
  • Leverage cross-domain priors when your data are limited
  • Use the models as fast surrogates for expensive simulations
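The "pretrained embedding plus small task head" pattern above can be sketched in a few lines of PyTorch. The encoder here is a toy stand-in - the article does not specify the actual Walrus or AION-1 interfaces, so every layer shape and name below is an assumption purely for illustration. The idea is to freeze the pretrained backbone and train only a lightweight head on your own (possibly scarce) labels.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained physics encoder; the real Walrus/AION-1
# APIs are not described in the article and will differ.
pretrained_encoder = nn.Sequential(
    nn.Linear(32, 64), nn.GELU(), nn.Linear(64, 16)
)

# Freeze the backbone: its physics-aware embedding is reused as-is.
for p in pretrained_encoder.parameters():
    p.requires_grad = False

# Small task-specific head, the only part trained on your data.
head = nn.Linear(16, 1)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(8, 32)  # a batch of observations (shape is illustrative)
y = torch.randn(8, 1)   # task labels

with torch.no_grad():
    z = pretrained_encoder(x)  # physics-aware embedding

loss = nn.functional.mse_loss(head(z), y)
loss.backward()
optimizer.step()
```

Because only the head receives gradients, this runs quickly even on modest hardware and avoids overwriting the broad physical priors the backbone learned during pretraining.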

"Our vision is that it enables anyone to start from a really powerful embedding of the data they're interested in … and still achieve state-of-the-art accuracy without having to build this whole pipeline from scratch," says Parker. Or as the AION-1 team wrote, multiple senses together give a fuller picture - if one is missing, the others can fill in the gaps.

Highlights from NeurIPS 2025

  • CosmoBench: A multiview, multiscale, multitask cosmology benchmark for geometric deep learning with 34,000+ point clouds and 25,000 directed trees. Its tasks pit approaches ranging from classical cosmological methods to linear models and graph neural networks against one another.
  • Lost in Latent Space: Latent diffusion modeling used to generate high-quality images at lower computational cost while preserving physical behavior - a practical route to cheaper, accurate surrogates.
  • Neurons as Detectors of Coherent Sets: Evidence that sensory neurons separate streams into "coherent sets" that either encode the recent past or predict near-future input, suggesting new directions for biologically inspired AI and potential treatments in mental health.
  • Predicting Partially Observable Dynamical Systems: A probabilistic, diffusion-based approach that incorporates distant past information to infer hidden solar processes and produce an ensemble of plausible futures for sunspot activity.

Practical next steps

  • Map your problem to the closest domain (fluids vs. sky surveys) and start from Walrus or AION-1.
  • Assemble a small, clean validation set to check transfer quality and bias before scaling.
  • Fine-tune only the last layers first; if needed, unfreeze more and add domain losses.
  • Compare against your baseline simulation or pipeline to quantify gains in accuracy, speed and cost.
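The staged fine-tuning in the third step can be sketched as follows. Again, the model below is a generic placeholder for either foundation model (actual layer structure and names are assumptions); the point is the schedule: freeze everything, unfreeze only the final layer, train, and only unfreeze earlier blocks if the validation set says you need to.

```python
import torch.nn as nn

# Placeholder standing in for a pretrained foundation model;
# the real architectures of Walrus/AION-1 will differ.
model = nn.Sequential(
    nn.Linear(32, 64), nn.GELU(),  # earlier layers: pretrained physics
    nn.Linear(64, 64), nn.GELU(),
    nn.Linear(64, 1),              # final layer: adapt this first
)

def set_trainable(module: nn.Module, flag: bool) -> None:
    """Toggle gradient updates for every parameter in a module."""
    for p in module.parameters():
        p.requires_grad = flag

# Stage 1: freeze the whole model, then unfreeze only the last layer.
set_trainable(model, False)
set_trainable(model[-1], True)

# ... train, then check transfer quality on your validation set ...

# Stage 2 (only if validation results warrant it): unfreeze the
# next block as well, optionally adding domain-specific losses.
set_trainable(model[2], True)
```

Keeping early layers frozen for as long as possible preserves the cross-domain priors while the cheap-to-train final layers adapt to your instrument or simulation.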

For preprints and related work, browse arXiv. Conference materials and schedules are available at NeurIPS.

If you're building AI skills for research teams, see our curated learning paths by role at Complete AI Training.

