Researchers Assess AI Trustworthiness Through Sudoku Challenges
Large language models (LLMs) like OpenAI's ChatGPT and Google's Gemini have demonstrated impressive capabilities, from offering advice to generating complex text. But how reliable are they at tasks demanding logical reasoning, such as solving sudoku puzzles?
A team of computer scientists at the University of Colorado Boulder tackled this question by creating nearly 2,300 original sudoku puzzles. They tested various AI models on these puzzles, aiming to evaluate both their problem-solving accuracy and their ability to explain their reasoning.
Mixed Success in Solving and Explaining Sudoku
While some AI models managed to solve easy sudoku puzzles, even the top-performing ones struggled to provide clear, correct explanations of their solutions. The explanations were often confusing, inaccurate, or off-topic. This raises concerns about trusting AI-generated outputs, especially when clarity and transparency are critical.
Maria Pacheco, assistant professor in the Department of Computer Science, highlighted the issue: “Most LLMs still fall short, particularly in producing explanations that are usable for humans. They don’t clearly articulate the steps taken to reach a solution.”
Why Sudoku as a Test Case?
Sudoku puzzles require applying explicit logical rules — for example, not placing the same number twice in a row or column. This kind of reasoning is challenging for LLMs because they primarily generate responses by predicting likely word sequences based on vast text data, rather than through rule-based logic.
OpenAI’s ChatGPT, for instance, was trained on extensive internet text, focusing on predicting the next word in a sequence rather than solving problems through structured reasoning. This can result in plausible-sounding but incorrect explanations.
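To make those rules concrete, below is a minimal Python sketch, not taken from the study, of the kind of explicit constraint check a rule-based solver applies directly and an LLM does not. The grid layout, function name, and the two-by-three box size used for the six-by-six variant are illustrative assumptions.

```python
# Minimal sketch (not from the study): explicit rule checks for a 6x6 sudoku
# grid split into 2x3 boxes. A rule-based solver applies checks like these
# directly; an LLM instead predicts likely text and has no built-in notion
# of these constraints.

def is_valid_placement(grid, row, col, value):
    """Return True if `value` can be placed at (row, col) without
    repeating in its row, its column, or its 2x3 box."""
    if value in grid[row]:                             # no repeat in the row
        return False
    if any(grid[r][col] == value for r in range(6)):   # no repeat in the column
        return False
    box_row, box_col = 2 * (row // 2), 3 * (col // 3)  # top-left cell of the box
    for r in range(box_row, box_row + 2):
        for c in range(box_col, box_col + 3):
            if grid[r][c] == value:                    # no repeat in the box
                return False
    return True

# Example: on an empty grid (0 = blank), any digit 1-6 is a legal first move.
empty_grid = [[0] * 6 for _ in range(6)]
assert is_valid_placement(empty_grid, 0, 0, 3)
```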
Bridging Logic and Language Models
The research team is part of a broader effort to combine the pattern-recognition strengths of LLMs with explicit logical reasoning — a field known as neurosymbolic AI. The goal is to create AI systems capable of both accurate problem-solving and transparent, human-understandable explanations.
Testing with a Simpler Sudoku Variant
The study used six-by-six sudoku grids, a smaller and simpler format than the standard nine-by-nine puzzles, to evaluate AI performance. Among the models tested, OpenAI's o1 reasoning model solved about 65% of the puzzles correctly.
However, when asked to explain their solutions, the AI models often faltered. Ashutosh Trivedi, associate professor of computer science, noted that some explanations included fabricated facts, such as claiming a number was present in a row when it was not.
In one striking instance, an AI model responded to a sudoku explanation prompt with a weather forecast, revealing confusion and a breakdown in logical consistency.
Looking Ahead: More Reliable AI Problem Solvers
The researchers aim to develop AI systems that can both solve complex puzzles and explain their reasoning clearly. Their next focus is a different logic puzzle called hitori, which, like sudoku, involves working within a grid of numbers.
Fabio Somenzi, professor in Electrical, Computer, and Energy Engineering, emphasized the broader implications: “If AI prepares your taxes, you need to explain its decisions to the IRS. Puzzles provide a manageable way to study and improve AI decision-making.”
These findings highlight the need for caution in relying on AI-generated information without explainability. Progress in neurosymbolic approaches may lead to AI that can reason more like humans and communicate its logic effectively.
Reference: Anirudh Maiya et al., "Explaining Puzzle Solutions in Natural Language: An Exploratory Study on 6x6 Sudoku," Findings of the Association for Computational Linguistics (2025).