AI makes the first star-by-star Milky Way simulation: 100 billion stars in months, not decades

An AI-assisted Milky Way simulation tracks more than 100 billion individual stars while preserving small-scale physics, running about 100x faster than previous top-tier models. It can cover a billion simulated years in roughly 115 days.

Published on: Nov 24, 2025

Milky Way, Star by Star: An AI-Assisted Simulation That Runs in Months

A team in Japan has built the first Milky Way simulation that tracks more than 100 billion individual stars while preserving key small-scale physics. By pairing a deep learning surrogate with standard N-body and hydrodynamics, they reached star-by-star resolution and finished runs over 100 times faster than previous top-tier models.

The work, led by researchers at the RIKEN Center for Interdisciplinary Theoretical and Mathematical Sciences (iTHEMS) with collaborators from The University of Tokyo and Universitat de Barcelona, was presented at SC '25. It marks a decisive step for galaxy modeling and for any field that needs to connect microphysics to global dynamics.

What's new

The simulation follows individual stars across a 10,000-year window per run and preserves feedback from supernovae and gas dynamics that typically force tiny timesteps. A deep learning surrogate model learned how gas expands for 100,000 years after a supernova, offloading the most costly microphysics while the main solver handles gravity and large-scale flow.

Results were checked against large-scale runs on RIKEN's Fugaku and The University of Tokyo's Miyabi systems, showing close agreement. This approach keeps fidelity while removing the usual timestep bottleneck.

  • Scale: >100 billion stars; 100x the detail of earlier models
  • Speed: 1 million years simulated in 2.78 hours (vs. 315 hours)
  • Throughput: 1 billion years in ~115 days (vs. ~36 years; arithmetic checked below)
  • Training target: 100,000-year post-supernova gas expansion learned by the surrogate
  • Validation: Benchmarks on Fugaku and Miyabi confirmed accuracy
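
These figures are mutually consistent. A quick back-of-the-envelope check using only the numbers quoted above (a sanity check, not a result from the paper):

```python
# Back-of-the-envelope check on the quoted performance figures.
hours_per_myr_new = 2.78    # AI-assisted: wall-clock hours per simulated Myr
hours_per_myr_old = 315.0   # conventional: wall-clock hours per simulated Myr

speedup = hours_per_myr_old / hours_per_myr_new
print(f"speedup: ~{speedup:.0f}x")  # ~113x, consistent with ">100x"

# Extrapolate to 1 billion simulated years (1,000 Myr).
days_new = hours_per_myr_new * 1_000 / 24           # ~116 days (~115 quoted)
years_old = hours_per_myr_old * 1_000 / (24 * 365)  # ~36 years
print(f"1 Gyr: ~{days_new:.0f} days vs ~{years_old:.0f} years")
```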

Why simulating every star is hard

Galaxy evolution couples processes that vary across orders of magnitude in time and space: gravity, fluid flow, supernova feedback, and element formation. To resolve individual stars and events like shock fronts, solvers need small timesteps, which drives up cost dramatically.
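
To make the timestep pressure concrete: explicit hydrodynamics solvers are bound by the Courant (CFL) stability condition, so resolving a fast supernova shock on a fine grid caps the allowable timestep. The values below are illustrative orders of magnitude, not numbers from the paper:

```python
# Illustrative CFL estimate of why resolved shocks force tiny timesteps.
# All values are placeholder orders of magnitude, not from the paper.
COURANT = 0.3          # typical CFL safety factor
CELL_PC = 0.1          # grid spacing needed to resolve a shock front, in parsecs
SHOCK_KM_S = 1_000.0   # young supernova-remnant shock speed, km/s

PC_KM = 3.086e13       # kilometers per parsec
YEAR_S = 3.156e7       # seconds per year

dt_years = COURANT * CELL_PC * PC_KM / SHOCK_KM_S / YEAR_S
steps_per_myr = 1e6 / dt_years
print(f"max stable timestep: ~{dt_years:.0f} yr")               # ~30 yr
print(f"global steps per simulated Myr: ~{steps_per_myr:.1e}")  # ~3e4 steps,
# each advancing >100 billion bodies -- before feedback shrinks dt further.
```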

Earlier galaxy-scale models capped out at about a billion solar masses with "star particles" that each stood in for ~100 stars. That averaging hides the fine-grained behavior researchers care about most.

How the method works

The team trained a deep learning surrogate on high-resolution supernova simulations so it could reproduce gas expansion and energy injection over 100,000 years without invoking the full microphysics at every step. The main code advances gravity and hydrodynamics, calling the surrogate where local conditions match its training domain.

This hybrid division keeps timesteps manageable while retaining the local fidelity needed to model feedback correctly. Cross-checks against full-physics tests on national supercomputers showed consistent large-scale structure and small-scale flow features.
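
In schematic form, the division of labor might look like the toy sketch below. The decay problem, the closed-form "surrogate," and all function names are illustrative stand-ins, not the team's code; the point is the dispatch pattern: a cheap learned model handles the stiff local process inside its training range, with a finely subcycled full integration as the fallback.

```python
import numpy as np

RATE = 1e-3                      # decay rate of a toy stiff "local process"
TRAIN_LO, TRAIN_HI = 0.1, 10.0   # range the surrogate was "trained" on

def full_physics(x, t_total, dt=1e-3):
    """Expensive path: many tiny explicit steps (stand-in for microphysics)."""
    for _ in range(int(t_total / dt)):
        x -= RATE * x * dt
    return x

def surrogate(x, t_total):
    """Cheap path: closed-form stand-in for a trained neural surrogate."""
    return x * np.exp(-RATE * t_total)

def local_update(x, t_total):
    """Dispatch: surrogate inside its training support, full physics outside."""
    if TRAIN_LO <= x <= TRAIN_HI:
        return surrogate(x, t_total)
    return full_physics(x, t_total)

for x0 in (5.0, 50.0):           # one in-domain and one out-of-domain state
    print(x0, "->", round(local_update(x0, t_total=100.0), 4))
```

The two paths agree to within the subcycled solver's truncation error, but the surrogate call costs one evaluation instead of 100,000 substeps, which is the essence of the reported speedup.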

Implications for research

For astrophysics, this unlocks star-by-star tests of galaxy assembly, stellar feedback, and the production of heavy elements, with outputs that can be compared against survey data. It also allows long-duration experiments at practical cost, so teams can iterate instead of waiting years for a single run.

The same template applies to other multi-scale problems. Weather, ocean, and climate models can slot in surrogates for expensive subgrid processes, then reserve the heavy numerics for global dynamics where they matter most.

Practical takeaways for HPC and AI teams

  • Use surrogates for the stiffest microphysics (e.g., feedback, turbulence closures) and let the main solver handle global dynamics.
  • Train on targeted, high-resolution patches that bracket the operating regime; validate with held-out scenarios.
  • Embed invariants and constraints (e.g., conservation laws) into training and runtime checks to prevent drift.
  • Trigger the surrogate conditionally: only where local states match the training support, with a fallback to full physics elsewhere (both guardrails are sketched after this list).
  • Measure energy and scaling, not just wall clock: adding cores does not scale cleanly when timesteps are tiny.
  • Plan for data throughput: surrogate calls are cheap, but diagnostics, I/O, and analysis can become the new limiters.
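
As referenced in the list, here is a minimal sketch of the two runtime guardrails: a training-support gate and a conservation check with fallback. The bounding-box support test, the `guarded_step` helper, and the toy surrogates are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
train_inputs = rng.uniform(0.0, 1.0, size=(10_000, 3))  # stand-in training set
lo, hi = train_inputs.min(axis=0), train_inputs.max(axis=0)

def in_training_support(x, margin=0.05):
    """Crude support check: per-feature bounding box plus a small margin."""
    return np.all(x >= lo - margin) and np.all(x <= hi + margin)

def mass_conserved(before, after, rtol=1e-3):
    """Runtime invariant: total 'mass' (sum of the state) must be preserved."""
    return np.isclose(before.sum(), after.sum(), rtol=rtol)

def guarded_step(state, surrogate_step, full_step):
    """Use the surrogate only in-support, and only if it conserves mass."""
    if in_training_support(state):
        candidate = surrogate_step(state)
        if mass_conserved(state, candidate):
            return candidate
    return full_step(state)  # fall back outside support or on drift

state = rng.uniform(0.2, 0.8, size=3)
leaky = lambda s: s * 1.01        # toy surrogate that injects 1% spurious mass
exact = lambda s: np.roll(s, 1)   # toy full step; a permutation conserves the sum
print(guarded_step(state, leaky, exact))  # the check rejects the leaky surrogate
```

In production codes the support test would typically be more sophisticated (e.g., density estimation over the training inputs), but even a cheap gate like this prevents the surrogate from silently extrapolating.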

Quote

"I believe that integrating AI with high-performance computing marks a fundamental shift in how we tackle multi-scale, multi-physics problems across the computational sciences. This achievement also shows that AI-accelerated simulations can move beyond pattern recognition to become a genuine tool for scientific discovery-helping us trace how the elements that formed life itself emerged within our galaxy."

Reference

The First Star-by-star N-body/Hydrodynamics Simulation of Our Galaxy Coupling with a Surrogate Model, SC '25: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. DOI: 10.1145/3712285.3759866.
