Gallagher Re warns current AI evaluation methods are not fit for insurance underwriting

Gallagher Re warned June 12, 2026, that current AI benchmarks fail to measure underwriting risks. The firm urges failure-focused testing to prevent inflated premiums.


Published on: Jun 13, 2026

Gallagher Re warned on June 12, 2026, that current artificial intelligence evaluation methods are unfit for underwriting and called for a shift toward failure-focused testing so that AI risk can be priced accurately. Without this change, insurers risk pricing uncertainty rather than actual risk, which could inflate premiums and stall market development.

The limits of current benchmarks

In a new report titled "Anthropic's Fourth Way: Why Restricted AI Models Are a Challenge for Insurers," the global reinsurance firm argues that standard benchmarks focus on capability rather than failure. These standardized tests measure performance under controlled conditions, leaving blind spots for ambiguous, real-world inputs. A model scoring highly on fixed tasks can still hallucinate or make inconsistent decisions in deployment.

Ed Pocock, global head of cyber security at Gallagher Re, emphasized the gap between test scores and underwriting needs. "They indicate what a model can do under controlled [conditions], but insurers are concerned with how models fail, how often they fail, and whether those failures could be correlated across a portfolio," Pocock said. This evaluation gap directly affects any insurer weighing AI exposure, including captives considering how to underwrite or retain risks from internal AI deployment.

The threat of concentration risk

The report highlights benchmark contamination as a growing problem, where models are increasingly shaped by the very tests used to evaluate them. This dynamic inflates published scores and reduces their value as a guide to real-world reliability. Furthermore, efforts to reduce failure rates and boost test performance can increase model homogeneity. "This risks erasing useful differentiation between systems and increasing concentration risk," Pocock said.

Concentration risk becomes acute when widely shared foundation models fail. If multiple insureds rely on the same underlying technology, a single flaw could trigger correlated losses across an entire portfolio. The reinsurance market can actively influence which models are deployed through underwriting requirements, pricing signals, and coverage design.
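
To make the correlated-loss point concrete, here is a minimal sketch that is not taken from the Gallagher Re report. It assumes a hypothetical portfolio of 100 insureds, each with a 2% chance of an AI-driven failure in a given period, and compares two stylized cases: failures that are fully independent, and failures that all stem from one shared model, so a single flaw hits every insured at once. The function names, the portfolio size, and the probabilities are illustrative assumptions only.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k of n insureds fail) when failures are independent (binomial)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def tail_probabilities(n: int, p: float, k: int) -> tuple[float, float]:
    """Chance of k or more simultaneous claims: independent vs. one shared model."""
    independent = prob_at_least(k, n, p)
    # Stylized shared-model case: with probability p the common model fails
    # and every insured claims together, so any threshold k >= 1 is breached.
    shared = p if k >= 1 else 1.0
    return independent, shared


# Hypothetical portfolio: 100 insureds, 2% failure probability per period.
n, p = 100, 0.02
independent, shared = tail_probabilities(n, p, k=20)

print(f"Expected claims in both cases: {n * p:.1f}")
print(f"P(20+ claims), independent models: {independent:.3e}")
print(f"P(20+ claims), one shared model:   {shared:.3e}")
```

Under these assumptions the expected number of claims is the same in both cases, but the chance of a large simultaneous loss is far higher when every insured depends on one model. Real portfolios sit somewhere between the two extremes.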

The challenge of restricted models

Gallagher Re also identified restricted-distribution AI as a new, fourth category of frontier model, joining open source, open weight, and proprietary systems. The firm pointed to Anthropic's Mythos model, released under Project Glasswing, which is available only to a vetted group of partners. While the UK AI Security Institute has analyzed Mythos, Gallagher Re argues that insurers need access to independent, third-party evaluations to price risk accurately.

"If a model cannot be independently evaluated, it cannot be meaningfully priced," Pocock said. "Insurers could end up loading for uncertainty rather than reflecting actual risk. That raises costs for everyone and slows the market's development." The firm calls for evaluation methods that test AI systems as they operate, using real-world inputs under adversarial conditions over time. Organizations tracking these shifts in risk management should monitor developments in AI for Finance, where similar demands for transparent, auditable systems are driving market standards.

Why this matters for insurance professionals

Underwriters and risk managers must demand evaluation metrics that measure hallucination rates, decision consistency, and correlated failure potential. Relying on vendor-provided benchmark scores leaves portfolios exposed to hidden, systemic vulnerabilities. As Pocock noted, "Better evaluation gives the market the tools to reward transparency and robustness. Without it, we risk defaulting to scale and brand as proxies for safety, which could amplify the concentration risks we'll need to manage."
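
As an illustration of what two of these metrics could look like in practice, the sketch below computes a hallucination rate and a decision-consistency score from a log of repeated model runs. The record layout (`case_id`, `decision`, `hallucinated`), the function names, and the sample data are all invented for this example and do not come from the report. Consistency is defined here, as one possible choice, as the average share of runs per case that agree with the most common decision for that case.

```python
from collections import Counter, defaultdict


def hallucination_rate(runs: list[dict]) -> float:
    """Fraction of runs flagged as containing a hallucination."""
    return sum(1 for r in runs if r["hallucinated"]) / len(runs)


def decision_consistency(runs: list[dict]) -> float:
    """Average, over cases, of the share of runs matching the modal decision."""
    by_case = defaultdict(list)
    for r in runs:
        by_case[r["case_id"]].append(r["decision"])
    shares = [
        Counter(decisions).most_common(1)[0][1] / len(decisions)
        for decisions in by_case.values()
    ]
    return sum(shares) / len(shares)


# Invented sample log: the same two cases submitted three times each.
runs = [
    {"case_id": "A", "decision": "approve", "hallucinated": False},
    {"case_id": "A", "decision": "approve", "hallucinated": False},
    {"case_id": "A", "decision": "decline", "hallucinated": True},
    {"case_id": "B", "decision": "decline", "hallucinated": False},
    {"case_id": "B", "decision": "decline", "hallucinated": False},
    {"case_id": "B", "decision": "decline", "hallucinated": False},
]

print(f"Hallucination rate:   {hallucination_rate(runs):.3f}")
print(f"Decision consistency: {decision_consistency(runs):.3f}")
```

A vendor benchmark score would not surface either number. Both depend on replaying realistic inputs several times and recording how the answers vary.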
