AI "WarEngine" concept: Air Force targets super real-time wargaming
The Air Force Futures directorate is scouting AI-driven modeling and simulation that can run wargames at super real-time speeds - up to 10,000x faster than real time. The aim: deliver Decision Superiority and a coherent Force Design by replacing disconnected, vendor-locked tools with a unified, scientific approach.
This RFI points to a digital ecosystem where humans and AI explore millions of branches, stress-test plans, and turn insights into tasking orders quickly. For government leaders, program managers, and developers, this is a clear signal: speed, scale, and explainability are now baseline requirements.
What the Air Force is asking for
- Super real-time simulation: up to 10,000x real-time to compress days into minutes and iterate fast.
- Reinforcement learning agents: AI that learns realistic adversary behavior and can compete with humans, teams, and other agents.
- Event-driven, agent-based simulation: every entity acts autonomously across all domains, reacting to events as they unfold (a minimal kernel sketch follows this list).
- Scale and access: hundreds of users, tens of thousands of entities, and operations across Unclassified, Secret, and TS/SCI/SAP.
- Physics-based adjudication: engagements and effects grounded in validated physics, not guesswork.
- Secondary effects: model EW and cyber friction, degraded comms, and logistics impacts that ripple through the fight.
- COA generation and ranking: neuro-symbolic methods for scoring options by risk, resources, and mission tradeoffs.
- Decision support: pause at key inflection points; show outcome fields, risk corridors, and visual clusters from millions of branches.
- LLMs for execution and analysis: auto-generate tasking orders to joint standards; real-time transcription and diarization of commander discussions.
- "WarEngine" platform: modular, cloud-based integration hub with transparent, repeatable APIs; interoperate with AFSIM, NGTS, and JICM.
Why this matters for government, IT, and development teams
Traditional wargames are slow, expensive, and hard to repeat. A digital-first approach lets teams iterate COAs, test assumptions under stress, and quantify risk with evidence instead of intuition.
For program offices, this means faster analysis cycles and clearer traceability from model inputs to decisions. For engineering teams, it means building systems that can explain their outcomes, operate across classification boundaries, and integrate with legacy M&S tools without lock-in.
Practical build considerations
- Architecture: event-driven microservices with deterministic simulation kernels and reproducible runs; strong time management and scheduling.
- Agents: RL pipelines that support self-play, curriculum learning, and safety constraints; robust baselining against human tactics (a toy self-play skeleton follows this list).
- Adjudication: validated physics libraries; tunable fidelity for speed vs. accuracy; versioned models for auditability.
- Secondary effects: composable EW/cyber/logistics modules that inject friction into C2, ISR, and sustainment.
- Decision tooling: interactive pause/resume with branching visualization; uncertainty quantification and confidence intervals (see the bootstrap example after this list).
- Explainability and governance: neuro-symbolic scoring with rationale, data lineage, model cards, and red-team test suites.
- Security and cross-domain: plan for IL2-IL6 deployment footprints, RMF/STIG alignment, and consistent behavior across enclaves.
- Interoperability: clean APIs, message schemas, and translators for AFSIM/NGTS/JICM; automated data import/export and scenario replay.
- Scalability: hundreds of concurrent users, tens of thousands of entities; predictable performance at high concurrency.
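For the agents bullet, a toy self-play skeleton in which everything beyond the pattern itself is an assumption: a three-action matrix game stands in for the wargame environment, and an action mask stands in for a safety (e.g., rules-of-engagement) constraint. Real pipelines would swap in a full simulation environment, curriculum schedules, and human-tactics baselines.

```python
import random

# Toy self-play sketch: two epsilon-greedy agents learn against each other
# in a zero-sum matrix game, with a safety mask filtering legal actions.

ACTIONS = ["hold", "probe", "strike"]
PAYOFF = {  # blue agent's payoff for (blue action, red action)
    ("hold", "hold"): 0, ("hold", "probe"): -1, ("hold", "strike"): -2,
    ("probe", "hold"): 1, ("probe", "probe"): 0, ("probe", "strike"): -1,
    ("strike", "hold"): 2, ("strike", "probe"): 1, ("strike", "strike"): -3,
}

def allowed(actions, escalation_limited: bool):
    # Safety constraint: when escalation is limited, mask "strike".
    return [a for a in actions if not (escalation_limited and a == "strike")]

class Agent:
    def __init__(self, rng, epsilon=0.1, lr=0.1):
        self.q = {a: 0.0 for a in ACTIONS}
        self.rng, self.epsilon, self.lr = rng, epsilon, lr

    def act(self, legal):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(legal)
        return max(legal, key=lambda a: self.q[a])

    def learn(self, action, reward):
        self.q[action] += self.lr * (reward - self.q[action])

rng = random.Random(0)
blue, red = Agent(rng), Agent(rng)
for episode in range(5_000):                 # self-play: both sides learn
    limited = episode % 2 == 0               # alternate ROE regimes
    a_blue = blue.act(allowed(ACTIONS, limited))
    a_red = red.act(allowed(ACTIONS, limited))
    r = PAYOFF[(a_blue, a_red)]
    blue.learn(a_blue, r)
    red.learn(a_red, -r)                     # zero-sum opponent
print("blue Q-values:", {a: round(v, 2) for a, v in blue.q.items()})
```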
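And for the decision-tooling bullet, a small example of reporting uncertainty rather than a point estimate: a bootstrap confidence interval over per-branch mission scores. The score distribution here is synthetic; only the reporting pattern is the point.

```python
import random
import statistics

# Sketch only: given mission scores from many simulated branches, report a
# bootstrap confidence interval instead of a single number.

rng = random.Random(7)
branch_scores = [rng.gauss(0.62, 0.15) for _ in range(2_000)]  # one score per branch

def bootstrap_ci(samples, n_resamples=1_000, alpha=0.05):
    # Resample with replacement, collect means, take the central quantiles.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=len(samples)))
        for _ in range(n_resamples)
    )
    return (means[int(alpha / 2 * n_resamples)],
            means[int((1 - alpha / 2) * n_resamples) - 1])

low, high = bootstrap_ci(branch_scores)
print(f"mean {statistics.fmean(branch_scores):.3f}, "
      f"95% CI [{low:.3f}, {high:.3f}]")
```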
The "WarEngine" integration hub
The requested platform acts as the connective tissue for digital wargaming. It should make complex simulations usable by operators while giving analysts transparent, repeatable workflows.
Think straightforward scenario setup, modular plug-ins, standardized outputs, and simple ways to compare COAs across runs. Consistency and reproducibility matter as much as raw speed.
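One hypothetical shape such standardized output could take - field names are assumptions, not an RFI schema - is a fixed, seed-stamped record per run, so COAs can be diffed and ranked across runs:

```python
import json
from dataclasses import dataclass, asdict

# Illustrative only: a stable per-run record makes COA comparison a simple
# sort or diff, and the stored seed ties each result to a reproducible run.

@dataclass(frozen=True)
class RunRecord:
    scenario_id: str
    coa_id: str
    seed: int                     # same seed + inputs -> same run
    mission_score: float          # 0..1, higher is better
    attrition_pct: float
    time_to_objective_hrs: float

runs = [
    RunRecord("pac-01", "coa-a", 42, 0.71, 12.0, 36.5),
    RunRecord("pac-01", "coa-b", 42, 0.64, 8.5, 48.0),
]
for r in sorted(runs, key=lambda r: r.mission_score, reverse=True):
    print(json.dumps(asdict(r)))
```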
LLMs in the loop
Language models are expected to turn selected COAs into joint-standard tasking orders and capture qualitative data during events. That makes prompt safety, strict output formatting, and guardrails against hallucination essential.
Teams will need structured templates, schema validation, and human-in-the-loop review at key points - especially where LLM output flows into operational products.
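A minimal sketch of that guardrail pattern, assuming a JSON tasking-order draft: validate the model's output against a strict schema, reject anything malformed or extra, and hold the result for human review. The field names and the `call_llm` stub are illustrative, not a joint standard.

```python
import json

# Guardrail sketch: LLM output must pass schema validation before it can
# reach an operational product, and even then it stays a draft for review.

REQUIRED = {"task_id": str, "unit": str, "action": str, "start_time": str}

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call; returns a canned tasking-order draft.
    return json.dumps({"task_id": "T-001", "unit": "13th FS",
                       "action": "DCA patrol", "start_time": "2025-01-09T06:00Z"})

def validate_order(raw: str) -> dict:
    order = json.loads(raw)                   # rejects non-JSON output outright
    for name, ftype in REQUIRED.items():
        if not isinstance(order.get(name), ftype):
            raise ValueError(f"missing or malformed field: {name}")
    if set(order) - set(REQUIRED):
        raise ValueError("unexpected fields - possible hallucination")
    return order

draft = validate_order(call_llm("Generate a tasking order for COA A."))
print("DRAFT (pending human review):", draft)  # never auto-release
```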
Timeline and next steps
Responses to the RFI are due Jan. 9. Vendors should map capabilities directly to the requirements above, show quantitative benchmarks (speedup vs. real time, user concurrency, entity counts), and present evidence of physics validation and explainability.
Expect questions on classification posture, integration with AFSIM/NGTS/JICM, and how your APIs support transparency and repeatability.
Upskilling for teams building this
If your roadmap includes RL agents, neuro-symbolic decision tools, and LLM-backed workflows, aligning your team's skills will pay off quickly. See curated AI learning paths by role here: Complete AI Training - Courses by Job.