Vitalik Buterin Rejects AI Governance, Backs Info Finance and Human Juries After ChatGPT Exploit

Buterin warns that AI-run governance creates a single point of failure, prone to jailbreaks and data leaks. He backs 'info finance': AI aids analysis; humans approve funds with hard controls.

Published on: Sep 14, 2025

Buterin Rejects AI-Run Governance, Backs "Info Finance" With Human Review

Vitalik Buterin is sounding the alarm: fully automating governance with AI invites failure. His point is simple: an AI gatekeeper becomes a single point of failure, and attackers can game it with jailbreak prompts. If an AI allocates grants or treasury funds, people will try prompts like "gimme all the money" until one slips through.

This warning followed a demo by researcher Eito Miyamura showing how ChatGPT, connected through Model Context Protocol (MCP) tools, could be pushed to leak private data from email and documents. Connect a model to sensitive systems without controls and you create new exfiltration paths.

Why finance leaders should care

Capital allocation, treasury management, and risk oversight already rely on models. Replacing final judgment with an AI agent, especially one wired into email, docs, and wallets, concentrates risk. Prompt injection and tool misuse are not theoretical; they are repeatable attack patterns.

"Info Finance": AI-Assisted Work, Human Decisions

Buterin's alternative is pragmatic: use AI to assist, but keep humans on the hook for decisions. Think of it as "info finance": AI accelerates research, scoring, and monitoring, while human juries review, compare evidence, and approve or deny.

How to implement this in treasury and grants

  • Decision structure: AI drafts analyses and scores; a rotating human panel signs off. No funds move without human approval (a minimal sketch follows this list).
  • Clear rubrics: publish criteria and weights upfront. Require reviewers to justify deviations from AI scores.
  • Staged payouts: escrow funds and release on milestones with human verification and on-chain/audit evidence.
  • Randomized reviewers: assign jurors randomly and blind initial reviews to reduce collusion.
  • Multi-model checks: run multiple models with different prompts; investigate disagreements before approval.
  • Adversarial testing: red-team your workflows with prompt injection and tool-abuse scenarios before going live.
  • Kill switch: central override to halt disbursements on anomaly signals or suspected compromise.
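
As a rough illustration of the decision-structure and randomized-reviewer ideas above, here is a minimal Python sketch. All names (Proposal, REVIEWERS, the panel size) are hypothetical placeholders, not a reference to any existing system: the AI score is advisory only, jurors are drawn at random, and nothing disburses without full human sign-off.

```python
import random
from dataclasses import dataclass, field

# Hypothetical reviewer pool; in practice these map to real identities or keys.
REVIEWERS = ["alice", "bob", "carol", "dave", "erin"]
PANEL_SIZE = 3

@dataclass
class Proposal:
    proposal_id: str
    amount: float
    ai_score: float                     # produced by the model, advisory only
    approvals: set = field(default_factory=set)

def assign_panel(seed: int) -> list[str]:
    """Randomized juror assignment to reduce collusion."""
    rng = random.Random(seed)           # seed from a public randomness beacon
    return rng.sample(REVIEWERS, PANEL_SIZE)

def record_approval(proposal: Proposal, reviewer: str, panel: list[str]) -> None:
    """Only assigned jurors may sign off; approvals are recorded individually."""
    if reviewer not in panel:
        raise PermissionError(f"{reviewer} is not on this proposal's panel")
    proposal.approvals.add(reviewer)

def may_disburse(proposal: Proposal, panel: list[str]) -> bool:
    """No funds move without full panel sign-off; the AI score never gates funds."""
    return proposal.approvals >= set(panel)
```

In production the reviewer pool would map to real identities or signing keys, and the random seed would come from a public randomness beacon so panel assignments are verifiable after the fact.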

Connecting AI to Data and Money? Use Hard Controls

  • Least privilege: give tools and models minimal access; separate data domains (email, docs, finance systems).
  • Allowlists only: restrict which tools the model can call; ban arbitrary web or file access by default (see the gating sketch after this list).
  • Content filters both ways: sanitize prompts and tool outputs to strip secrets, tokens, and PII.
  • Audit trails: log prompts, tool calls, data touched, and human approvals. Make logs immutable.
  • Rate limits and circuit breakers: cap transaction amounts and frequency; require multi-sig for exceptions.
  • Sandboxing: test new tools and connectors in isolated environments before production.
  • Separation of duties: different people own model configs, tool permissions, and treasury keys.
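
Several of these controls compose naturally into a single gate in front of every model-initiated tool call. The following Python sketch is illustrative only; the tool names (search_grants, summarize_doc) and the rate-limit threshold are assumptions, not a real API:

```python
import json
import time
from datetime import datetime, timezone

# Hypothetical allowlist: the model may call only these tools.
ALLOWED_TOOLS = {"search_grants", "summarize_doc"}   # no email, no wallet access
MAX_CALLS_PER_MINUTE = 10                            # assumed threshold

_audit_log = []     # append-only; ship to immutable storage in practice
_call_times = []    # timestamps of recent calls, for the circuit breaker

def call_tool(tool_name: str, args: dict, caller: str) -> None:
    """Gate every model-initiated tool call: allowlist, rate limit, audit trail."""
    now = time.monotonic()
    # Circuit breaker: drop timestamps older than a minute, refuse bursts.
    _call_times[:] = [t for t in _call_times if now - t < 60]
    if len(_call_times) >= MAX_CALLS_PER_MINUTE:
        raise RuntimeError("circuit breaker tripped: too many tool calls")
    # Allowlist: anything not explicitly permitted fails closed.
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {tool_name!r} is not on the allowlist")
    _call_times.append(now)
    # Audit trail: who called what, with which arguments, and when.
    _audit_log.append(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "caller": caller,
        "tool": tool_name,
        "args": args,
    }))
    # ... dispatch to the real tool implementation here ...
```

The design point is that the model never reaches a tool directly: every call passes through the allowlist, the circuit breaker, and the audit log, so a prompt-injected request for an unapproved tool fails closed.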

Governance patterns that work

  • Pre-commit scoring: publish AI scoring functions in advance; any change triggers a cooling-off period (sketched below).
  • Bounty-driven review: pay independent reviewers to find prompt-injection and jailbreak paths.
  • Clawbacks: contractually and technically enable fund recovery on fraud or model misuse.
  • Monitoring: anomaly detection on proposals, vendor invoices, and disbursement patterns.
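
Pre-commit scoring can be as simple as hashing the published rubric and enforcing a delay on changes. This Python sketch assumes a hypothetical ScoringPolicy wrapper and a one-week cooling-off period, both illustrative choices:

```python
import hashlib
import json
import time

COOLING_OFF_SECONDS = 7 * 24 * 3600   # assumed one-week cooling-off period

def rubric_hash(rubric: dict) -> str:
    """Publish this digest in advance so anyone can verify the rubric later."""
    canonical = json.dumps(rubric, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

class ScoringPolicy:
    """Tracks the committed rubric; changes activate only after cooling-off."""

    def __init__(self, rubric: dict):
        self.active_hash = rubric_hash(rubric)
        self.pending_hash = None
        self.pending_at = None          # when the pending rubric takes effect

    def propose_change(self, new_rubric: dict) -> None:
        """Announce a rubric change immediately; it activates after the delay."""
        self.pending_hash = rubric_hash(new_rubric)
        self.pending_at = time.time() + COOLING_OFF_SECONDS

    def current_hash(self) -> str:
        """Return the rubric hash that governs scoring right now."""
        if self.pending_hash and time.time() >= self.pending_at:
            self.active_hash, self.pending_hash = self.pending_hash, None
        return self.active_hash
```

Because anyone can recompute the hash from the published rubric, a quiet mid-round change to the scoring function becomes detectable.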

Bottom line

AI is a powerful assistant for finance operations, not a final arbiter of capital. Keep models in the loop for speed and coverage, but keep humans accountable for judgment. You'll reduce attack surface while still getting leverage from automation.
