IBM, Nvidia and Red Hat launch open standard for AI-native document format

IBM, Nvidia, and Red Hat are backing DocLang, a new open standard for AI-readable documents under the Linux Foundation. It aims to replace PDFs and Word files in AI pipelines with a structured format that cuts preprocessing costs and complexity.

Published on: Jun 10, 2026
IBM, Nvidia, and Red Hat Back New AI-Native Document Format

A working group hosted by the Linux Foundation is developing DocLang, an open specification for documents designed specifically for AI systems rather than human readers. IBM, Nvidia, and Red Hat founded the effort, with ABBYY and Human Signal contributing to development.

The specification addresses a practical problem: current document formats like PDFs and Word files were built for human consumption, forcing AI systems to waste computational effort extracting meaning. DocLang defines a structured, machine-readable format, much as JSON standardizes data interchange, so that any tool can implement it and any pipeline can consume it.

The working group builds on Docling, an existing toolkit that converts human-readable documents into structured data. DocLang extends that work into a vendor-neutral standard for enterprise use.

Why This Matters for IT Teams

Organizations increasingly rely on generative AI and agentic systems to process business documents. The current fragmented approach of handling PDFs, JPEGs, spreadsheets, and other formats introduces complexity, raises costs, and reduces reliability when extracting information at scale.

A standard format would let teams automate document preprocessing. When a user uploads a document to an AI agent, a preprocessing skill could convert it to DocLang format automatically, reducing token consumption and improving efficiency.
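The preprocessing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not describe DocLang's actual schema or any API, so the `Block` and `StructuredDoc` types, the `CONVERTERS` registry, and the `preprocess_upload` function are all hypothetical names, and the heading heuristic is a toy stand-in for real document analysis.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


# Hypothetical structure: DocLang's real schema is not given in the article.
@dataclass
class Block:
    kind: str  # "heading" or "paragraph" in this toy example
    text: str


@dataclass
class StructuredDoc:
    source_name: str
    blocks: list[Block] = field(default_factory=list)


def convert_plain_text(name: str, raw: str) -> StructuredDoc:
    """Toy converter: each blank-line-separated chunk becomes one block."""
    doc = StructuredDoc(source_name=name)
    for chunk in raw.split("\n\n"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Assumption: a short single line without a trailing period is a heading.
        is_heading = (
            "\n" not in chunk and len(chunk) < 60 and not chunk.endswith(".")
        )
        kind = "heading" if is_heading else "paragraph"
        # Collapse internal line breaks and repeated spaces into single spaces.
        doc.blocks.append(Block(kind, " ".join(chunk.split())))
    return doc


# Hypothetical registry: one converter per uploaded file type.
CONVERTERS = {".txt": convert_plain_text}


def preprocess_upload(name: str, raw: str) -> str:
    """Convert an upload to structured JSON before it reaches the agent."""
    suffix = Path(name).suffix.lower()
    converter = CONVERTERS.get(suffix)
    if converter is None:
        raise ValueError(f"no converter registered for {suffix!r}")
    return json.dumps(asdict(converter(name, raw)), indent=2)


if __name__ == "__main__":
    sample = "Quarterly Report\n\nRevenue grew in the third quarter.\nCosts were flat."
    print(preprocess_upload("report.txt", sample))
```

The registry keeps format-specific parsing in one place, so the agent downstream only ever sees a single structured representation regardless of what the user uploaded.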

The approach also supports exporting AI-generated outputs, such as visualizations or structured data, back to formats humans can use outside AI tools.

Standards Need to Evolve

Existing document standards served their purpose for decades but weren't designed for AI workflows. Carmi Levy, an independent technology analyst, said documents in the AI era are more iterative and dynamic than static file formats allow.

"DocLang represents an early, best hope of achieving some kind of foundational baseline for document standards, one that will hopefully allow more intelligent, more efficient, lower-risk workflows than is currently the case," Levy said.

Taking an open-source, vendor-agnostic approach mirrors how earlier standards for networking, documentation, the web, and cloud computing enabled broad digital collaboration rather than locking users into proprietary systems.

Governance Questions Remain

Jason Andersen, principal analyst at Moor Insights & Strategy, supports automated preprocessing but warns the standard must preserve user choice. "These standards need to preserve the fact that humans can still do what they want, and do not need to know any coding to be proficient," he said.

Yaz Palanichamy, senior research analyst at Info-Tech Research Group, flagged a different concern: organizations adopting DocLang will need to implement and review controls to scale its use securely and maintain accountability. The governance framework around how documents flow through AI systems remains undefined.

The specification is still in development, and the working group is accepting additional contributors. For IT and development professionals, understanding DocLang's role in document processing pipelines will become relevant as adoption grows in enterprise environments.

