2,400-year-old math riddle challenges ChatGPT, revealing improvisation and limits

Researchers used Plato's square-doubling puzzle to test if ChatGPT reasons or recalls. A rectangle variant exposed a confident error, urging guided prompts and proof checks.


Published on: Sep 28, 2025

Can ChatGPT "learn" like a human? A 2,400-year-old math test offers clues

Plato recorded Socrates testing a student with the "doubling the square" problem: how to construct a square with twice the area of a given one. The solution is not obvious until you realize that each side of the new square equals the original square's diagonal, which scales the side length by a factor of √2.
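A short worked derivation shows why the diagonal does the job. The symbols s (side of the original square) and d (its diagonal) are introduced here for illustration and do not come from the study:

```latex
\begin{align*}
d &= \sqrt{s^2 + s^2} = s\sqrt{2} && \text{(Pythagorean theorem)} \\
d^2 &= \left(s\sqrt{2}\right)^2 = 2s^2 && \text{(area of the square built on the diagonal)}
\end{align*}
```

The square built on the diagonal therefore has exactly twice the area of the original.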

Researchers from the University of Cambridge and the Hebrew University of Jerusalem used this classic puzzle to probe a modern question: does an LLM derive new mathematical reasoning, or simply retrieve patterns from text? Because models like ChatGPT are trained primarily on language, they chose a geometry problem whose exact solution is unlikely to appear verbatim in training data.

The twist came when they extended the task. In a study published Sept. 17 in the International Journal of Mathematical Education in Science and Technology, the team asked ChatGPT to double the area of a rectangle using similar reasoning. The model incorrectly asserted there was no geometric solution via the rectangle's diagonal, even though one exists, suggesting it improvised from prior discussion of the square rather than recalling a known method.

"When we face a new problem, our instinct is often to try things out based on our past experience," said visiting scholar Nadav Marco. "In our experiment, ChatGPT seemed to do something similar. Like a learner or scholar, it appeared to come up with its own hypotheses and solutions."

The authors connect this behavior to the zone of proximal development (ZPD), the gap between what's currently known and what can be reached with guidance. With the right prompts, the model appeared to push into that zone, sometimes productively and sometimes by making confident mistakes.

This is a reminder that AI reasoning remains a black box and its "proofs" are not guaranteed to be valid. As Professor Andreas Stylianides noted, students and researchers must evaluate AI-generated arguments rather than accept them at face value.

Practical takeaways for scientists, mathematicians, and educators

Treat AI outputs as hypotheses. Require definitions, lemmas, and a checkable proof outline. Verify steps independently.
Prompt for process over answers: "Let's explore this together. Propose two distinct approaches, state assumptions, and test for counterexamples."
Cross-modality helps. Pair LLMs with dynamic geometry systems or formal theorem provers to validate constructions and claims.
Probe generalization. After a solution, modify constraints (e.g., from square to rectangle) and ask the model to reconcile differences.
Demand citations or source attributions when the model claims known results. Flag anything uncited for manual review.
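
One way to act on "treat AI outputs as hypotheses" is to search numerically for counterexamples before investing in a proof. The sketch below is a minimal illustration, not the researchers' method: the helper names `find_counterexample` and `diagonal_square_doubles_area` are hypothetical, and the claim being tested (that a square built on a rectangle's diagonal has twice the rectangle's area) is chosen only as an example of a square-based idea that might be over-generalized.

```python
import math
import random
from typing import Callable, Optional, Tuple


def find_counterexample(
    claim: Callable[[float, float], bool],
    trials: int = 1000,
    seed: int = 0,
) -> Optional[Tuple[float, float]]:
    """Return side lengths (a, b) for which the claim fails, or None if none is found."""
    rng = random.Random(seed)  # fixed seed so the search is reproducible
    for _ in range(trials):
        a = rng.uniform(0.1, 10.0)
        b = rng.uniform(0.1, 10.0)
        if not claim(a, b):
            return (a, b)
    return None


def diagonal_square_doubles_area(a: float, b: float) -> bool:
    """Hypothetical claim: the square on the diagonal of an a-by-b rectangle has area 2ab."""
    diagonal_squared = a * a + b * b  # area of the square built on the diagonal
    return math.isclose(diagonal_squared, 2 * a * b, rel_tol=1e-9)


if __name__ == "__main__":
    # For general rectangles the claim fails, so a counterexample is reported.
    print(find_counterexample(diagonal_square_doubles_area))
    # Restricted to squares (a == b) the claim holds, so the search returns None.
    print(find_counterexample(lambda a, b: diagonal_square_doubles_area(a, a)))
```

A failed search is not a proof, but a single counterexample is enough to reject a claim, which makes this a cheap first filter before formal verification.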

Why the rectangle example matters

Doubling area is a scaling problem: to double a shape's area while keeping its proportions, multiply every linear dimension by √2. There are geometric constructions that achieve this for rectangles, too. The model's confident refusal signals analogy-driven improvisation rather than grounded retrieval, which is useful for research only if you keep verification in the loop.
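
The scaling argument can be checked with a few lines of Python. This is a sketch of one possible construction, not necessarily the one discussed in the study: each new side is taken as the diagonal of a square built on the corresponding old side, which is a√2 for a side of length a. The function name `double_rectangle` is hypothetical.

```python
import math
from typing import Tuple


def double_rectangle(a: float, b: float) -> Tuple[float, float]:
    """Return the sides of a similar rectangle with twice the area of an a-by-b rectangle.

    Each new side is the diagonal of a square built on the old side,
    so both sides are scaled by sqrt(2).
    """
    new_a = math.hypot(a, a)  # diagonal of an a-by-a square, equal to a * sqrt(2)
    new_b = math.hypot(b, b)  # diagonal of a b-by-b square, equal to b * sqrt(2)
    return new_a, new_b


if __name__ == "__main__":
    a, b = 3.0, 4.0
    new_a, new_b = double_rectangle(a, b)
    # The area doubles and the aspect ratio is unchanged.
    print(math.isclose(new_a * new_b, 2 * a * b))
    print(math.isclose(new_a / new_b, a / b))
```

Because both diagonals can be drawn with a straightedge and compass, the numerical check mirrors a construction that stays within classical geometry.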

Research directions

Benchmark newer models on out-of-distribution geometry tasks and compare against curated training corpora to separate retrieval from synthesis.
Integrate LLMs with theorem provers and dynamic geometry tools to produce verifiable, constructible outputs in STEM settings.
Develop classroom protocols for evaluating AI-generated proofs as a core skill in mathematics education.

Study reference: International Journal of Mathematical Education in Science and Technology.

