AI helps explain covert attention - and points to new neuron types
Shifting attention without moving your eyes feels automatic. You glance at the road, but your mind "checks" the side mirror. You read a room without breaking eye contact. That behavior - covert attention - now has a clearer mechanistic picture thanks to convolutional neural networks (CNNs) and a team at UC Santa Barbara: Sudhanshu Srivastava, Miguel Eckstein, and William Wang.
By training CNNs on classic visual detection tasks and then inspecting every unit, they found that attention-like behaviors can emerge without a dedicated attention module. More striking, the networks expressed neuron types that map to recordings in mouse brains - and predicted types researchers hadn't highlighted before. Their study appears in the Proceedings of the National Academy of Sciences.
Why this matters for scientists and research teams
The result reframes attention as a property that can emerge from distributed computation optimized for detection - not a switch flipped by a single "attention center." For experimenters, it suggests new cell classes and interactions to look for, beyond standard excitatory effects. For modelers, it validates using straightforward CNNs as hypothesis engines for systems neuroscience.
What the team did
- Trained 10 CNNs (about 180k units each; ~1.8M artificial "neurons" total) on target detection tasks modeled after cueing experiments.
- Used Posner cueing, where an arrow or box briefly precedes a target, to test accuracy and speed benefits when cues match target locations.
- Included no built-in attention mechanism; any attentional signatures had to arise from learning the task.
- Characterized every unit's response to cue and target - something impractical in vivo at this scale.
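The task setup above can be sketched as a minimal reconstruction. Everything here is illustrative: image size, cue shape, target contrast, and trial structure are assumptions for demonstration, not the study's actual stimuli or training configuration.

```python
import numpy as np

def make_posner_trial(rng, size=32, cue_valid=True, target_present=True,
                      noise=0.1):
    """Build one Posner-cueing-style image: a box-outline cue at a left
    or right location, an optional faint target patch, plus pixel noise.
    All parameter values are illustrative, not the study's."""
    img = rng.normal(0.0, noise, (size, size))
    locs = {0: (size // 2, size // 4), 1: (size // 2, 3 * size // 4)}
    cue_loc = int(rng.integers(0, 2))
    # Draw the cue as a box outline around the cued location.
    r, c = locs[cue_loc]
    img[r - 4:r + 4, c - 4] += 0.5
    img[r - 4:r + 4, c + 3] += 0.5
    img[r - 4, c - 4:c + 4] += 0.5
    img[r + 3, c - 4:c + 4] += 0.5
    if target_present:
        # Valid trials place the target at the cued location,
        # invalid trials at the other location.
        tgt_loc = cue_loc if cue_valid else 1 - cue_loc
        tr, tc = locs[tgt_loc]
        img[tr - 2:tr + 2, tc - 2:tc + 2] += 0.3  # faint target patch
    return img, int(target_present)

rng = np.random.default_rng(0)
trials = [make_posner_trial(rng, cue_valid=(i % 4 != 0),
                            target_present=(i % 2 == 0)) for i in range(64)]
X = np.stack([t[0] for t in trials])
y = np.array([t[1] for t in trials])
print(X.shape, y.mean())
```

A detection network trained on batches like `X`/`y` would then be probed unit by unit, which is the "characterize every unit's response" step the list describes.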
Key findings
- Emergent covert attention: CNNs reproduced behavioral hallmarks of covert attention without an explicit orienting module (echoing the group's 2024 result with 200k-1M unit networks).
- Known-like neuron responses: Some units mirrored patterns reported in primate and mouse studies, despite no attention circuitry "hard-coded."
- New neuron types:
- Cue-inhibitory: Responses decreased in the presence of the cue.
- Location-opponent: Excited when the cue and target appear at one location, suppressed when they appear at other locations (a push-pull pattern).
- Location-summation: Responses summed across locations under certain cue-target arrangements.
- Cue-opponent + target-summation: Present in CNNs but not observed in mouse data - a clue to biological constraints the AI doesn't share.
Location opponency is common elsewhere in vision (e.g., color-opponent cells); seeing a related motif in attention points to a broader computational theme: amplify what's likely relevant, tamp down the rest.
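The "amplify what's relevant, suppress the rest" motif can be illustrated with a toy opponent unit. This is a conceptual sketch with hand-picked weights, not the networks' learned responses:

```python
import numpy as np

def opponent_response(activity, preferred):
    """Toy location-opponent unit: positive weight on its preferred
    location, negative weight spread over all other locations
    (push-pull). Weights are illustrative, not fitted to any data."""
    activity = np.asarray(activity, dtype=float)
    weights = -np.ones_like(activity) / (len(activity) - 1)
    weights[preferred] = 1.0
    return float(np.dot(weights, activity))

# Drive at the preferred location excites the unit;
# the same drive at a non-preferred location suppresses it.
drive = np.array([0.0, 1.0, 0.0, 0.0])
print(opponent_response(drive, preferred=1))  # positive
print(opponent_response(drive, preferred=2))  # negative
```

Note the design choice: the negative weights sum to exactly the positive weight, so uniform drive across all locations cancels out and only spatially selective input moves the unit, which is the push-pull behavior described above.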
Biological confirmation (and limits)
To test correspondence, the team examined mouse neural data from cueing tasks. They found location-opponent cells - along with other previously unreported types, such as cue-inhibitory and location-summation cells - in the superior colliculus. That's notable: for decades, covert attention was associated mainly with primates and the parietal cortex. The midbrain now enters the picture with concrete signatures to probe.
One CNN-derived type didn't appear in mice. That mismatch is productive. It points to anatomical or wiring limits in biology and offers a direct target for future recordings in other species or regions.
How to apply this in your lab or team
- Broaden your attention metrics: Don't stop at firing-rate increases. Track suppression and opponency across locations, especially under cue congruency.
- Interrogate the superior colliculus: During cueing tasks, look for push-pull patterns tied to expected vs. unexpected locations.
- Use simple CNNs as hypothesis generators: Train task-optimized models without attention modules; then characterize unit responses to cue-target pairs.
- Design perturbations that test opponency: Manipulate cue validity and spatial uncertainty to separate summation, inhibition, and opponent dynamics.
- Cross-species checks: Given evidence in fish, mice, and bees, test whether opponent-like attention motifs generalize across simpler circuits.
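The "characterize unit responses" step can be sketched as a rough classifier over a unit's mean responses. The thresholds, inputs, and category criteria here are illustrative assumptions, not the study's actual analysis:

```python
import numpy as np

def classify_unit(resp_cue_only, resp_baseline, resp_by_loc, tol=0.05):
    """Rough taxonomy of a unit from mean responses, loosely following
    the categories described above. Thresholds and criteria are
    illustrative, not the study's.

    resp_cue_only: mean response when only the cue is shown.
    resp_baseline: mean response with no cue or target.
    resp_by_loc:   mean response per location when cue and target
                   share that location."""
    labels = []
    if resp_cue_only < resp_baseline - tol:
        labels.append("cue-inhibitory")  # cue reduces the response
    r = np.asarray(resp_by_loc, dtype=float)
    if r.max() > tol and r.min() < -tol:
        labels.append("location-opponent")   # push-pull across locations
    elif np.all(r > tol):
        labels.append("location-summation")  # excitation at every location
    return labels or ["unclassified"]

print(classify_unit(0.2, 0.5, [0.8, -0.6]))  # inhibitory and opponent
print(classify_unit(0.5, 0.5, [0.3, 0.4]))   # summation-like
```

In practice, this kind of per-unit screen is what a CNN makes cheap: it can be run over every unit in the model, then the resulting taxonomy used to design targeted physiology experiments.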
Context that shifts the narrative
Covert attention has long been framed as a near-instant capability tied to primate parietal cortex and, sometimes, to consciousness. Yet similar behaviors appear in species with simpler brains. If attention improves detection, evolution may arrive at comparable strategies - excite what matters here, suppress what doesn't there - using different substrates.
As Eckstein put it, the study avoids treating CNNs as black boxes: "In the CNN, we could characterize every single unit, and that might guide our understanding of the real neurons in brains." Srivastava was blunt about the impact: "It fundamentally changed how we think about attention."
Who's behind the work
- Sudhanshu Srivastava: Co-led the modeling and analysis underpinning the emergent attention findings.
- Miguel Eckstein: Professor of Psychological & Brain Sciences at UCSB; senior author focused on visual attention and human performance.
- William Wang: Founding Director of UCSB's Center for Responsible Machine Learning; works across ML for speech, vision, and information extraction.
Together they also lead UCSB's Mind & Machine Intelligence Initiative, bridging AI and cognitive science.
What to watch next
- Human recordings: Do location-opponent attention signals show up in human EEG/MEG patterns or intracranial data?
- Circuit mechanisms: What inhibitory and excitatory motifs implement opponency in superior colliculus and cortex?
- Model constraints: Which biological priors prune out CNN-only neuron types, and which priors are essential for matching data?
In the researchers' words, "This is a clear case of AI advancing neuroscience, cognitive sciences and psychology." The takeaway for researchers is practical: pair simple, task-optimized models with targeted physiology to surface mechanisms that classic analyses miss.
Resources
- Proceedings of the National Academy of Sciences for the published study.
- Posner cueing paradigm overview for task background.
Further skill-building
If your team prototypes CNNs for hypothesis generation or needs structured upskilling in model analysis, explore curated training for researchers at Complete AI Training: Courses by Job.