Mimicking the brain can improve AI performance
Researchers at the University of Surrey report a simple idea with big upside: wire artificial networks more like the human brain. In a study published in Neurocomputing, they show that limiting connections to nearby or related neurons can boost performance while reducing energy use across generative systems and large models such as ChatGPT.
They call the approach Topographical Sparse Mapping (TSM). The premise is straightforward: keep connections local and meaningful, cut the rest.
"Our work shows that intelligent systems can be built far more efficiently, cutting energy demands without sacrificing performance," said Dr Roman Bauer, senior lecturer. The team found that eliminating vast numbers of unnecessary connections improves efficiency without a hit to accuracy.
Energy is the pressure point. "Training many of today's popular large AI models can consume over a million kilowatt-hours of electricity. That simply isn't sustainable at the rate AI continues to grow," Bauer added.
What is Topographical Sparse Mapping?
TSM connects each neuron mainly to nearby or semantically related neurons. This mirrors how the brain organizes information: locality first, everything else if needed.
The result is a network with far fewer long-range links, lower compute cost, and better memory locality, which helps both training and inference. For teams running large generative systems, that means less overhead for comparable output quality.
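The Neurocomputing paper's code isn't reproduced here, so treat the following as a minimal PyTorch sketch of the idea rather than the authors' implementation: a linear layer whose weights are masked so each output unit connects only to input units within a fixed neighborhood. The class name `LocalLinear`, the 1-D layout, and the `radius` parameter are illustrative assumptions.

```python
import torch
import torch.nn as nn


class LocalLinear(nn.Module):
    """Linear layer masked so each output unit connects only to input
    units within a fixed 1-D neighborhood (illustrative sketch; the
    layout and radius are assumptions, not the paper's design)."""

    def __init__(self, in_features: int, out_features: int, radius: int = 8):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Assign each output unit a position along the input axis and keep
        # only connections within `radius` of that position.
        out_pos = torch.linspace(0, in_features - 1, out_features).unsqueeze(1)
        in_pos = torch.arange(in_features).unsqueeze(0)
        self.register_buffer("mask", ((out_pos - in_pos).abs() <= radius).float())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-apply the mask on every forward pass so gradient updates can
        # never reintroduce long-range connections.
        return nn.functional.linear(x, self.linear.weight * self.mask, self.linear.bias)


layer = LocalLinear(in_features=256, out_features=128, radius=8)
print(f"active connections: {int(layer.mask.sum())} / {layer.mask.numel()}")
```

With a small radius, the layer keeps only a narrow band of connections around each unit, a fraction of what a dense layer would use; that is where the compute and memory savings come from.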
Enhanced TSM: pruning while you train
An extended variant, Enhanced Topographical Sparse Mapping, adds a biologically inspired pruning process during training. As the model learns, weak or redundant connections are removed, similar to how the brain refines synapses over time.
The benefits compound: fewer parameters to update, sparser activations, and better throughput on sparse-friendly hardware.
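The paper's exact pruning rule isn't given here; as one hedged sketch, a magnitude-based criterion applied to a masked layer could look like the helper below. The function name `prune_smallest`, the 10% fraction, and the choice of weight magnitude as the saliency signal are assumptions for illustration.

```python
import torch


@torch.no_grad()
def prune_smallest(weight: torch.Tensor, mask: torch.Tensor, fraction: float = 0.1) -> None:
    """Zero out the weakest `fraction` of still-active connections in
    `mask`, judged by weight magnitude (illustrative criterion only)."""
    magnitudes = (weight * mask).abs()
    active = mask.bool()
    k = int(fraction * active.sum())
    if k == 0:
        return
    # Threshold = k-th smallest magnitude among currently active connections.
    threshold = magnitudes[active].kthvalue(k).values
    mask[active & (magnitudes <= threshold)] = 0.0


# Example: prune 10% of a layer's surviving connections, then fine-tune.
w = torch.randn(128, 256)
m = torch.ones_like(w)
prune_smallest(w, m, fraction=0.1)
print(f"remaining connections: {int(m.sum())} / {m.numel()}")
```

Called on a schedule during training, each pass trims the weakest edges and the network keeps learning with whatever remains, loosely analogous to synaptic pruning.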
How to try this approach in practice
- Impose locality: restrict each unit to its K nearest neighbors, by spatial position (CNNs/VAEs) or by feature similarity (embeddings/MLPs).
- Structured sparsity: use block- or pattern-sparse matrices so libraries can apply efficient sparse kernels.
- Prune during training: start dense within local neighborhoods, prune low-magnitude or low-saliency edges on a schedule, then briefly fine-tune (see the sketch after this list).
- Measure energy, not just accuracy: track FLOPs, activation sparsity, memory movement, and kWh alongside validation metrics.
- Target deployment early: design for on-device or data-center inference with sparse execution paths and cache-friendly layouts.
- Test on generative workloads: diffusion, transformers, and autoregressive models benefit from locality constraints in attention and feedforward layers.
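To make the workflow concrete, here is a toy loop that combines the locality mask and scheduled pruning and reports sparsity next to the training loss, reusing the `LocalLinear` layer and `prune_smallest` helper sketched above. The layer sizes, the every-five-epochs schedule, and the synthetic data are placeholders, not settings from the study.

```python
import torch
import torch.nn as nn

# Small model with a locality-constrained first layer (see LocalLinear above).
model = nn.Sequential(LocalLinear(256, 128, radius=8), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(512, 256)              # stand-in dataset
y = torch.randint(0, 10, (512,))

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

    # Prune every 5 epochs, then keep training so the network adapts.
    if epoch % 5 == 4:
        prune_smallest(model[0].linear.weight, model[0].mask, fraction=0.1)

    mask = model[0].mask
    sparsity = 1.0 - mask.sum().item() / mask.numel()
    print(f"epoch {epoch:2d}  loss {loss.item():.3f}  sparsity {sparsity:.2%}")
```

Tracking sparsity (and, in a real run, FLOPs and energy per step) alongside the validation metric is what tells you whether the pruned network is actually cheaper without being worse.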
Where this could matter
Generative pipelines, large language models, and multimodal systems stand to gain the most. Training budgets fall, inference latency improves, and memory footprints shrink, which is useful for both cloud-scale and edge deployments.
The team is also exploring neuromorphic applications, where brain-inspired wiring maps naturally onto event-driven computation. Related updates are available from the University of Surrey.
Bottom line
Make networks local first, prune as you learn, and keep the useful connections. You'll cut energy and cost without giving up accuracy.