Sparse Circuits, Big Insight: How LLMs Do Theory of Mind (and Why It's Inefficient)
Researchers at Stevens Institute of Technology found that large language models perform Theory-of-Mind (ToM) reasoning using a small, specialized subset of parameters, while still activating the entire network every time. That mismatch is the headline: selective internal circuitry doing the work, wrapped inside a compute-hungry process.
The team also shows that positional encoding, especially rotary positional encoding (RoPE), is central to how models track beliefs and perspectives. In other words, the way a model encodes word positions quietly steers its social reasoning.
A quick mental model
Think of the classic false-belief setup: someone hides a chocolate bar in a box, then another person moves it to a drawer. You know the bar is in the drawer. You also know the first person will look in the box. Humans do this in seconds, using only a small slice of neural resources.
LLMs can do something similar, but they light up almost their entire network to produce the answer, whether the prompt is trivial or complex. That's a serious efficiency gap.
Key findings
- Sparse circuits: ToM relies on tiny clusters of parameters. Perturbing as little as 0.001% of these ToM-sensitive parameters causes a measurable drop in ToM performance and harms contextual processing (a minimal perturbation sketch follows this list).
- Crucial encoding: RoPE strongly influences how models represent beliefs and perspectives. Changes here alter attention geometry (e.g., the angle between queries and keys) and disrupt dominant frequency channels tied to context.
- Efficiency gap: Humans recruit a small neural subset for social reasoning; LLMs light up almost everything. Understanding these sparse circuits points the way to selective, energy-efficient computation.
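To make the perturbation finding concrete, here is a minimal PyTorch sketch of a micro-ablation experiment. It assumes a standard causal LM with named parameters; the layer path, the random selection of entries, and the `run_tom_benchmark` scorer are hypothetical stand-ins, not the paper's localization procedure.

```python
# Micro-perturbation sketch: add noise to a tiny fraction of one weight tensor
# and re-run a ToM eval. `run_tom_benchmark` and the layer path are hypothetical.
import torch

def perturb_fraction(model, param_name, fraction=1e-5, noise_scale=0.1, seed=0):
    """Add Gaussian noise to a random `fraction` of one weight tensor's entries."""
    torch.manual_seed(seed)
    with torch.no_grad():
        weight = dict(model.named_parameters())[param_name]
        flat = weight.view(-1)                    # view shares storage with the weight
        n = max(1, int(fraction * flat.numel()))  # fraction=1e-5 is 0.001% of entries
        idx = torch.randperm(flat.numel())[:n]
        flat[idx] += noise_scale * torch.randn(n, device=flat.device, dtype=flat.dtype)

# Illustrative usage:
# baseline = run_tom_benchmark(model)
# perturb_fraction(model, "model.layers.12.self_attn.q_proj.weight")
# print("ToM accuracy drop:", baseline - run_tom_benchmark(model))
```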
Why this matters for your roadmap
If you build or evaluate LLM systems, this is a blueprint for lowering inference cost without sacrificing capability. The study suggests future models can toggle relevant parameter subsets on demand, similar to how the brain recruits specialized regions for a task.
That means fewer wasted FLOPs, smaller energy bills, and better latency for workloads that include social reasoning, dialogue safety, and multi-agent simulations.
Under the hood: what's actually happening
- Parameter sparsity with global activation: Only a small internal cluster is critical for ToM, yet the full network still runs. This is avoidable overhead.
- RoPE as a control dial: The positional encoding routine, specifically RoPE, modulates angles between queries and keys and emphasizes frequency bands that help the model localize beliefs across context (see the toy sketch after this list).
- Fragility reveals function: Micro-perturbations to ToM-sensitive parameters degrade ToM and general language localization, indicating these parameters sit at a structural choke point for social inference.
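As a toy illustration of the RoPE point above, the sketch below rotates a 2-D query and key by position-dependent angles and shows that their dot product depends only on the relative offset between positions. This is standard RoPE math on a single frequency channel, not the paper's analysis code.

```python
# Toy RoPE illustration: the attention logit between a rotated query and key
# depends only on their relative position, which is how attention can track
# where in the context a belief was stated.
import math
import torch

def rope_rotate(vec, position, freq=1.0):
    """Rotate a 2-D vector by angle = position * freq (one RoPE frequency channel)."""
    theta = position * freq
    rot = torch.tensor([[math.cos(theta), -math.sin(theta)],
                        [math.sin(theta),  math.cos(theta)]])
    return rot @ vec

q = torch.tensor([1.0, 0.0])   # toy query vector
k = torch.tensor([1.0, 0.0])   # toy key vector

# Pairs with the same offset (7-5 and 12-10) produce identical logits.
for m, n in [(5, 5), (7, 5), (12, 10), (20, 5)]:
    logit = torch.dot(rope_rotate(q, m), rope_rotate(k, n))
    print(f"q at {m:2d}, k at {n:2d} -> logit {logit:.3f}")
```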
What to build next
- Conditional computation: Introduce routing that activates only task-relevant experts or blocks for ToM-like workloads. Pair with confidence gating to keep quality stable.
- Position-aware gating: Use RoPE-driven signals to trigger selective activation. If ToM cues are detected (e.g., belief states, perspective shifts), route to sparse ToM modules; a routing sketch follows this list.
- Targeted pruning and quantization: Preserve ToM-sensitive clusters while trimming less relevant paths. Validate with ToM benchmarks.
- Instrumentation: Track attention geometry and frequency activations tied to ToM during training. Use controlled ablations to verify causal pathways.
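To make the routing idea concrete, here is a control-flow sketch under loose assumptions: `tom_module`, `default_module`, and `confidence_gate` are hypothetical callables, and a production system would gate on learned signals (for example, RoPE-derived features) rather than a keyword list.

```python
# Conditional-routing sketch with a confidence gate. All module and gate names
# are hypothetical; the cue list only stands in for a learned ToM detector.
from typing import Callable

TOM_CUES = ("believes", "thinks", "doesn't know", "expects", "assumes")

def route(prompt: str,
          tom_module: Callable[[str], str],
          default_module: Callable[[str], str],
          confidence_gate: Callable[[str], float],
          threshold: float = 0.5) -> str:
    """Activate the sparse ToM path only when belief/perspective cues appear."""
    has_cue = any(cue in prompt.lower() for cue in TOM_CUES)
    if has_cue and confidence_gate(prompt) >= threshold:
        return tom_module(prompt)      # small, ToM-specialized subnetwork
    return default_module(prompt)      # standard dense forward pass
```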
For teams in science and research
- ML engineers: Add ToM probes to your eval suite; log query-key angle shifts under RoPE. Test micro-ablations to map sensitive parameter sets (see the hook sketch after this list).
- Product leads: Expect inference savings from conditional compute. Prioritize features that help the model "use less to do more."
- Neuroscience collaborators: This is a clean bridge to brain-inspired selective activation. Plan joint studies around belief tracking and perspective-taking.
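For the query-key logging mentioned in the ML-engineer bullet, a forward-hook sketch is shown below. It assumes a LLaMA-style decoder whose attention blocks expose `q_proj`/`k_proj` linear layers, and it compares full projection outputs per position, ignoring head splitting; treat it as a rough probe, not the paper's protocol.

```python
# Sketch: log the mean query-key angle for one attention layer via forward
# hooks. Assumes a LLaMA-style layout (model.model.layers[i].self_attn with
# q_proj/k_proj); adjust the layer path for your architecture.
import torch

q_out, k_out = [], []

def save_to(store):
    def hook(module, inputs, output):
        store.append(output.detach())
    return hook

# attn = model.model.layers[12].self_attn           # illustrative layer choice
# attn.q_proj.register_forward_hook(save_to(q_out))
# attn.k_proj.register_forward_hook(save_to(k_out))

def mean_qk_angle(q: torch.Tensor, k: torch.Tensor) -> float:
    """Average angle (radians) between per-position query and key projections."""
    cos = torch.nn.functional.cosine_similarity(q, k, dim=-1)
    return torch.acos(cos.clamp(-1.0, 1.0)).mean().item()

# After forward passes on a ToM prompt vs. a control prompt:
# print("mean q-k angle:", mean_qk_angle(q_out[-1], k_out[-1]))
```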
Key questions answered
- What did researchers discover about AI social reasoning? LLMs rely on a small, specialized set of internal connections and positional encoding patterns to perform Theory-of-Mind reasoning.
- Why does this matter for AI efficiency? Current models activate most parameters for every task; mapping sparse ToM circuits enables selective activation and lower energy use.
- What's next for LLM design? Build models that activate only task-specific parameters, more like the brain, to cut compute and improve throughput.
Source and research
Source: Stevens Institute of Technology
Original Research: "How large language models encode theory-of-mind: a study on sparse parameter patterns," published in npj Artificial Intelligence.
Learn more
If you're upskilling your team on LLM systems, interpretability, or efficient inference, explore practical training paths here: Latest AI courses.