How Children Learn Language May Explain AI Model Behavior
Research from the University of the Witwatersrand shows that the way children absorb language mirrors how artificial intelligence language models process and structure information across generations of learning.
The study, published in the Proceedings of the National Academy of Sciences, examined "iterated learning," the theory that language evolves as each generation of learners absorbs, adapts, and passes it forward. The researchers found that this process naturally makes language more structured because simpler patterns survive while disorganized elements disappear.
Dr Devon Jarvis, lead author and Lecturer in the School of Computer Science and Applied Mathematics at Wits, built computational models with learning characteristics similar to children's brains. He then fed them data with properties found in human language and observed how successive generations learned from it.
"Computer brains find the structure in the data in the same way that children favor certain properties of language in learning," Jarvis said. "The dataset becomes more structured over generations because it makes learning easier."
How Children Learn in Stages
Children learn progressively, moving from basic categories to more complex distinctions. A child might first learn that plants and animals are different, then distinguish between animal types, before refining that understanding further.
Over-generalization drives this refinement. A child learns that birds have wings and fly, then discovers that penguins cannot fly but can swim. These "mistakes" aren't random; they reflect how learners extract patterns and apply them broadly.
When children pass language to their own children, transmission errors occur. But Jarvis found these errors follow predictable patterns: easy language elements get retained and reused, while unstructured portions are forgotten. Communication pressure reveals the depth of individual learning capability.
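The retain-and-forget pattern described above can be illustrated with a toy transmission chain. This is a minimal sketch and not the study's model: the verbs, the `learn` function, and the assumption that each child hears only part of the parent's language are all hypothetical choices made for illustration.

```python
from collections import Counter

def learn(parent_language, heard):
    """Toy learner: memorise heard forms, apply the dominant pattern elsewhere."""
    # The child memorises only the verbs it actually heard (assumed bottleneck).
    memorised = {verb: parent_language[verb] for verb in heard}
    # Extract the most common ending among heard forms that extend their verb.
    suffixes = Counter(
        past[len(verb):] for verb, past in memorised.items() if past.startswith(verb)
    )
    rule = suffixes.most_common(1)[0][0]
    # Over-generalise: every unheard verb receives the dominant ending.
    return {verb: memorised.get(verb, verb + rule) for verb in parent_language}

# Hypothetical starting language: three regular verbs and two irregular ones.
language = {
    "walk": "walked",
    "jump": "jumped",
    "play": "played",
    "go": "went",
    "slay": "slew",
}
# Assumption: each generation hears only these verbs from its parents.
heard = {"walk", "jump", "go"}

for generation in range(3):
    language = learn(language, heard)

print(language)
```

In this toy setting the unheard regular form "played" is rebuilt from the rule, the unheard irregular "slew" is replaced by "slayed", and "went" survives only because it was heard and memorised.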
Network Depth Matters
The researchers used deep linear neural networks, mathematical models that mimic how brains process information, to test their hypothesis. They discovered that iterated learning worked best when the networks had sufficient depth, meaning multiple processing layers, and the language was sufficiently complex.
Shallow networks with fewer layers could not capture the structured regularities that make language learnable. This finding has direct relevance to generative AI systems such as large language models (LLMs), which depend on scale and layered processing for their capabilities.
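The role of depth can be sketched with a simplified calculation. The example below is an illustration under stated assumptions, not the researchers' code: it treats a language as a list of independent "mode strengths" (a strong, structured pattern and a weak, unstructured one), a simplification often used when analysing deep linear networks. The function names, learning rate, step count, and initial weights are all hypothetical.

```python
def train_shallow(strength, steps, lr=0.05):
    """One-layer linear learner: a single weight fits the mode directly."""
    w = 0.0
    for _ in range(steps):
        w += lr * (strength - w)  # gradient step on 0.5 * (strength - w) ** 2
    return w

def train_deep(strength, steps, lr=0.05, init=0.01):
    """Two-layer linear learner: the mode is represented as the product a * b."""
    a = b = init  # small initial weights (assumption)
    for _ in range(steps):
        err = strength - a * b
        # Gradient step on 0.5 * (strength - a * b) ** 2 for both weights.
        a, b = a + lr * b * err, b + lr * a * err
    return a * b

def iterate(language, train, generations=3, steps=40):
    """Each generation learns, for a limited time, from the previous one's output."""
    for _ in range(generations):
        language = [train(strength, steps) for strength in language]
    return language

# Hypothetical language: one strong (structured) mode, one weak (unstructured) mode.
language = [4.0, 0.5]

shallow_result = iterate(language, train_shallow)
deep_result = iterate(language, train_deep)

print("shallow:", shallow_result)
print("deep:   ", deep_result)
```

In this toy setting the shallow learner shrinks both modes by the same proportion, so it shows no preference for structure, while the two-layer learner keeps the strong mode almost intact and lets the weak one fade.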
Implications for AI Research
The design of any learning system, whether biological or artificial, and the richness of its environment determine how language structure is absorbed and transmitted. Jarvis noted that while deep linear networks and iterated learning theory have existed separately for years, their combination reveals something fundamental.
"Language evolves to become learnable based on the very specific nature of how children learn in stages and favor reusing information over learning new things," he said.
The findings suggest that principles underlying child development may explain how large-scale AI models behave. For educators and professionals studying AI research, this work demonstrates how insights from cognitive science and linguistics inform modern AI design.
The research was co-authored by Professor Richard Klein, Head of the School of Computer Science and Applied Mathematics; Professor Benjamin Rosman, Director of the Wits Machine Intelligence and Neural Discovery Institute; and Professor Andrew Saxe of University College London.