Plato's Puzzle Shows ChatGPT Learning With Guidance, Not Just Recall
ChatGPT-4 solved Plato's square-doubling puzzle with algebra, resisted wrong hints, and, when prompted, gave the classic geometric construction, suggesting a "Chat's ZPD." A small study, but useful for teaching.

AI That Acts Like a Learner? Lessons from Plato's Square
Researchers found that ChatGPT-4 solved Plato's square-doubling puzzle with algebra before offering the classical geometric construction. It resisted wrong suggestions, explained why shortcuts failed, and improved with guidance. The authors describe this as a "Chat's Zone of Proximal Development": tasks the model cannot solve alone but can solve with timely prompts. The study is exploratory and based on a single conversation, but it raises practical questions for education and research.
What the Researchers Did
Nadav Marco and Andreas J. Stylianides revisited the "Meno" puzzle: double the area of a square. Instead of citing the well-known diagonal construction, ChatGPT first produced an algebraic solution, a method invented centuries after Plato. When pushed toward the classic mistake (doubling the side length), the model refused and explained that the area would quadruple, not double.
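For readers who want the arithmetic spelled out, the mistake and the classical fix come down to two lines (standard geometry, not quoted from the paper):

```latex
% Original square of side s has area A = s^2.
% The classic mistake, doubling the side, quadruples the area:
(2s)^2 = 4s^2 \neq 2s^2
% The diagonal is the side of the doubled square:
d = s\sqrt{2} \quad\Longrightarrow\quad d^2 = 2s^2 = 2A
```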
The team then shifted to rectangles. ChatGPT clarified that the square's diagonal trick does not transfer: "the diagonal does not offer a straightforward new dimension" for rectangles. This showed sensitivity to problem structure, a common stumbling block for learners.
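A quick check shows why the trick fails there (our gloss, standard algebra rather than the paper's wording): the square built on a rectangle's diagonal overshoots double the area unless the rectangle is already a square.

```latex
% Rectangle a x b with area A = ab and diagonal d:
d = \sqrt{a^2 + b^2} \quad\Longrightarrow\quad d^2 = a^2 + b^2
% Since (a - b)^2 \geq 0, we have a^2 + b^2 \geq 2ab = 2A,
% with equality only when a = b, i.e., only for a square.
```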
How It Handled Variations
When prompted for an "elegant and exact" solution, ChatGPT provided the geometric construction it had skipped earlier and acknowledged it should have emphasized that approach first. The model's meta-comments about its own process, however, were inconsistent. The researchers caution against reading these reflections as a window into actual mechanisms.
The "Chat's ZPD": Guidance Extends Capability
Borrowing from Vygotsky's Zone of Proximal Development, the authors identified problems ChatGPT could not solve alone but could solve with well-timed hints. Some outputs looked like retrieval; others looked like stepwise problem solving and resistance to error. Under the right prompts, the system adapted to new constraints and self-corrected, much like a student refining a method.
Why This Matters for Education and Research
If AI can act like a learner, it can serve as a partner in exploration. Prompt type shapes behavior: requests for collaboration and reasoning yielded different outputs from requests for sourced summaries. That sensitivity lets teachers, tutors, and researchers externalize strategies, surface misconceptions, and test transfer across related tasks.
Practical Prompt Patterns You Can Use
- Reasoning-first: "Think step by step. Explain your approach before any equations. Then give a final answer."
- Error probing: "Here's a common mistake: doubling the side doubles the area. Is this correct? Why or why not?"
- Transfer test: "You solved the square. Does the same idea work for a rectangle? If not, propose an alternative."
- Minimal hints: "Offer one hint that reduces uncertainty without giving the answer. Then pause."
- Self-check: "Re-evaluate your solution. Offer a simpler or more exact method if one exists."
- Constraint shift: "Solve without geometry. Now solve using a geometric construction only."
- Reflection for teaching: "Explain what misconception this solution guards against and how to surface it in class."
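To try these patterns programmatically rather than in a chat window, the sketch below chains the reasoning-first and self-check patterns in one conversation. It assumes the OpenAI Python client; the model name is a placeholder, and any chat API that accepts a message history would work the same way.

```python
# Chain two patterns from the list above: reasoning-first, then self-check.
# Assumes the OpenAI Python client (pip install openai) and OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # placeholder; the study used ChatGPT-4 via the web interface

messages = [{
    "role": "user",
    "content": (
        "Think step by step. Explain your approach before any equations. "
        "Then give a final answer: construct a square with exactly twice "
        "the area of a given square."
    ),
}]
first = client.chat.completions.create(model=MODEL, messages=messages)
messages.append({"role": "assistant",
                 "content": first.choices[0].message.content})

# Self-check turn: ask the model to re-evaluate its own solution.
messages.append({
    "role": "user",
    "content": ("Re-evaluate your solution. Offer a simpler or more exact "
                "method if one exists."),
})
second = client.chat.completions.create(model=MODEL, messages=messages)
print(second.choices[0].message.content)
```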
Sample Prompts for Class or Lab
- "Model a Socratic dialogue that leads a student to double a square's area without stating the answer upfront."
- "Generate a pair of counterexamples that would reveal the misconception that doubling a side doubles area."
- "Propose a short assessment to test whether a student can transfer the square method to rectangles, and justify each item."
- "Given this failed approach, suggest the smallest hint that would redirect thinking without giving away the method."
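For repeated lab use, a sample prompt can be wrapped as a small helper. The sketch below packages the "smallest hint" prompt; the function name and wrapper are ours, not the paper's, and it reuses the client assumptions from the previous sketch.

```python
# Wrap the "smallest hint" sample prompt as a reusable helper for lab sessions.
# Hypothetical wrapper; assumes the same OpenAI client setup as above.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # placeholder

def smallest_hint(failed_approach: str) -> str:
    """Request one redirecting hint that does not reveal the method."""
    prompt = (
        "Given this failed approach, suggest the smallest hint that would "
        "redirect thinking without giving away the method.\n\n"
        f"Failed approach: {failed_approach}"
    )
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(smallest_hint("I doubled the side of the square to double its area."))
```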
Limitations and Open Questions
- Single conversation with one model (ChatGPT-4, February 2024); results may differ across versions or systems.
- No predefined analytical framework; interpretations distinguish "recollection" vs. "generation" post hoc.
- Model self-explanations conflicted, so they should not be treated as process evidence.
Open questions: How stable is the "Chat's ZPD" across tasks and models? What prompt scaffolds produce reliable gains? Can we quantify prompt difficulty and measure transfer in controlled studies?
Implications for Study Design
- Protocol: Fix prompt templates, vary hint timing, and pre-register criteria for "assisted" vs. "independent" success (a minimal logging sketch follows this list).
- Metrics: Track error resistance, method shifts under constraints, and cross-task transfer.
- Controls: Compare against retrieval-only settings (e.g., "cite sources") and against human think-aloud baselines.
- Ethics: Avoid treating model self-talk as ground truth about internal mechanisms.
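One way to make the protocol concrete is a pre-registered trial record. The sketch below is our own illustration, not the paper's instrument; the field names and outcome labels are assumptions matching the criteria listed above.

```python
# Illustrative trial record for the protocol above (not the paper's instrument).
# Requires Python 3.10+ for the "int | None" annotation.
from dataclasses import dataclass

@dataclass
class TrialRecord:
    prompt_template: str   # fixed template ID, e.g., "reasoning_first"
    hint_turn: int | None  # turn at which the hint was given; None = no hint
    outcome: str           # pre-registered: "independent", "assisted", "failed"
    resisted_error: bool   # rejected the planted misconception?
    method_shift: bool     # changed method under a new constraint?
    transferred: bool      # carried the method to a related task?
    notes: str = ""

trials = [TrialRecord(
    prompt_template="transfer_test",
    hint_turn=2,
    outcome="assisted",
    resisted_error=True,
    method_shift=True,
    transferred=False,
    notes="Rejected square method for rectangle; proposed algebraic alternative.",
)]

# Assisted successes approximate the "Chat's ZPD" band.
assisted = sum(t.outcome == "assisted" for t in trials)
print(f"assisted successes: {assisted}/{len(trials)}")
```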
Citation
Marco, N., & Stylianides, A. J. (2025). An exploration into the nature of ChatGPT's mathematical knowledge. International Journal of Mathematical Education in Science and Technology. https://doi.org/10.1080/0020739X.2025.2543817
Further Reading
Prompts and Training
If you want to develop stronger prompting strategies for classroom or lab settings, see our prompt-engineering resources: Prompt Engineering.