DOE plans public-private data-curation consortium for science and engineering AI
The Department of Energy is planning a public-private consortium to aggregate scientific data across national laboratories and train "self-improving" AI models for science and engineering. The agency issued a new request for information (RFI) asking how to structure the effort and make models available via cloud resources to government, academia, and industry.
The goal is straightforward: unify high-value datasets, modernize data preparation, and enable model access that accelerates discovery and engineering workflows. The RFI also invites responses from think tanks, investors, research organizations, and AI developers with advanced model capabilities.
What DOE is asking for
The RFI seeks practical guidance on how to build and run the consortium so it actually delivers usable models and data at scale. It centers on mobilizing the labs, fixing data quality at the source, and streamlining access.
- How to mobilize national labs to partner with industry without slowing research momentum.
- How to ensure data is structured, cleaned, and preprocessed for training and evaluation.
- How to design a consortium that supports many scientific and technical disciplines.
- How to provide AI models to the scientific community using cloud programs and infrastructure.
The RFI also asks for recommendations on developing leading-edge models that use DOE data, facilities, and expertise, plus a call for interested partners that can contribute proven capabilities.
Policy backdrop and scope
The administration's AI Action Plan, released in July, emphasized energy-focused initiatives, national lab collaboration, and a nationwide buildout of AI-ready data centers. It directs DOE, NSF, NIST, and other federal partners to invest in automated cloud-enabled labs and to encourage researchers to release more high-quality datasets.
That aligns with the RFI's emphasis on shared infrastructure, open access where possible, and clear incentives for data contribution. A recent DOE request also sought proposals to expand data center capacity and energy infrastructure at Oak Ridge National Laboratory, signaling the compute and power footprint this effort will demand.
Why this matters for scientists and engineers
Unified data and shared models reduce duplicated effort and make cross-domain research faster. Standardized preprocessing and metadata improve reproducibility and downstream integration with lab workflows.
Cloud access lowers the barrier for multidisciplinary teams, including smaller labs that lack on-prem resources. If done well, the consortium could set common benchmarks, streamline model evaluation, and shorten the path from raw data to publishable results or deployable engineering outputs.
How to prepare a high-value response
- Map your data assets: Inventory datasets, modalities, sizes, formats, and known data quality issues. Highlight unique or hard-to-recreate data.
- Commit to FAIR: Propose metadata schemas, identifiers, and provenance that support findability and reuse. The FAIR principles are a useful baseline.
- Define cleaning and preprocessing: Show how you will structure, normalize, and label data. Specify file formats (e.g., HDF5, NetCDF), ontologies, and versioning.
- Data governance and security: Classify data; address export controls, controlled unclassified information, and privacy. Propose access tiers and audit mechanisms.
- Licensing and IP: Recommend licenses for datasets and models that enable research use while respecting contributor rights.
- Model development and evaluation: Outline training plans, baselines, domain-specific metrics, and reporting (e.g., model cards, documentation for reproducibility).
- Compute and storage: Estimate training and inference needs; discuss cloud-HPC integration, data locality, and cost controls.
- Interoperability: Plan APIs and formats that support multi-lab, multi-discipline use without rigid coupling.
- Sustainability: Propose lifecycle plans for dataset refresh, model updates, and deprecation policies.
- Consortium operations: Suggest governance, partner onboarding, publication policies, and incentives for data contribution.
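As a concrete illustration of the data-mapping and FAIR items above, the sketch below emits a sidecar metadata record for a dataset file, carrying an identifier, creator, license, and a checksum for provenance. This is a minimal, hypothetical example: the field names, the `build_metadata` helper, and the CC-BY license choice are assumptions for illustration, not a DOE schema.

```python
import hashlib
import json
from pathlib import Path

def build_metadata(data_path: Path, identifier: str, creator: str) -> dict:
    """Build a FAIR-style sidecar record for one data file (illustrative fields)."""
    digest = hashlib.sha256(data_path.read_bytes()).hexdigest()
    return {
        "identifier": identifier,        # e.g. a DOI or lab accession ID
        "creator": creator,
        "license": "CC-BY-4.0",          # assumed research-friendly license
        "format": data_path.suffix.lstrip("."),
        "sha256": digest,                # integrity / provenance anchor
        "schema_version": "1.0.0",       # version the metadata layout itself
    }

# Synthetic measurement file standing in for a real lab dataset.
data = Path("run_0042.csv")
data.write_text("t_s,temp_K\n0,295.1\n1,295.3\n")

record = build_metadata(data, "doi:10.9999/example.0042", "Example Lab")
Path("run_0042.meta.json").write_text(json.dumps(record, indent=2))
```

Keeping the checksum and schema version alongside each file is one simple way to make "structured, cleaned, and preprocessed" auditable: downstream consumers can verify integrity and know which metadata layout to parse.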
Access and infrastructure
DOE intends to provide models through cloud programs to speed up experimentation and collaboration. Expect tight coordination with national lab HPC systems, data center expansions, and energy planning to support training and large-scale inference.
Standards and risk management
For research-grade deployments, tie your approach to recognized frameworks for documentation, safety, and assurance. NIST's AI Risk Management Framework is a solid reference point for methodical risk treatment across the AI lifecycle.
What to do next
- Assemble a cross-functional team (PI, data engineer, security lead, and program manager) to draft responses.
- Prioritize one or two high-impact use cases with clear datasets and measurable outcomes.
- Prepare short, testable pilots you can scale within the consortium.
- Monitor DOE channels for submission deadlines and technical annexes tied to the RFI.