AI Chatbots in University Law Classes: Promises, Pitfalls, and Persistent Errors

AI chatbots like SmartTest show promise in law education but often give inaccurate feedback, causing confusion. Students value quick responses but still prefer human tutors for trust and clarity.

Published on: Jun 01, 2025

AI Chatbots in University Law Classes: What Happens When They Fall Short?

Artificial intelligence tutors often promise to transform education by offering personalized, immediate feedback and guiding students through problems step-by-step. The idea is appealing—AI adapting to teaching styles, encouraging critical thinking without giving away answers, and identifying individual learning gaps. Yet, there’s limited evidence on how well AI performs in structured university courses, especially complex ones like law.

The Experiment with SmartTest

In 2023, a chatbot named SmartTest was developed and tested in a criminal law course at the University of Wollongong. Unlike general-purpose chatbots, SmartTest was built specifically for educators to embed questions, model answers, and prompts. It was programmed to use the Socratic method, pushing students to think critically rather than just handing out answers.
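The article does not publish SmartTest's implementation, but the setup it describes, an educator-supplied question, model answer, and Socratic instructions wrapped around a general-purpose model, can be sketched roughly as follows. All names and the prompt wording here are illustrative assumptions, not the actual SmartTest code.

```python
from dataclasses import dataclass

@dataclass
class TutorItem:
    """One educator-authored item: question, model answer, extra guidance."""
    question: str
    model_answer: str
    guidance: str = ""

def build_system_prompt(item: TutorItem) -> str:
    """Assemble the hidden system prompt for the underlying LLM.

    The model answer is included so the model can judge student replies,
    but the instructions tell it never to reveal that answer directly --
    the Socratic constraint the article describes.
    """
    return (
        "You are a Socratic law tutor. Guide the student with questions; "
        "never state the model answer outright.\n"
        f"Question for the student:\n{item.question}\n"
        f"Model answer (for your evaluation only, do not reveal):\n{item.model_answer}\n"
        f"Additional guidance: {item.guidance}"
    )

# Hypothetical item; the 25% figure is a placeholder, not legal advice.
item = TutorItem(
    question="What is the maximum sentencing discount for an early guilty plea?",
    model_answer="25% (placeholder)",
    guidance="If the student is wrong, ask what the relevant statute says.",
)
prompt = build_system_prompt(item)
```

The design point is that the "teaching" lives entirely in this educator-authored prompt; the underlying model is unchanged, which is also why its errors leak through.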

Over five testing cycles, an average of 35 students voluntarily interacted with SmartTest during tutorials. The first three cycles involved short hypothetical scenarios, such as determining guilt in a theft case. The final two cycles shifted to straightforward short-answer questions, like the maximum sentencing discount for a guilty plea. Conversations were recorded, and students were surveyed after the trials.

Key Findings: Promising but Flawed

SmartTest showed potential in highlighting gaps in students’ knowledge. However, the early cycles with scenario-based questions revealed a high rate of errors—between 40% and 54% of conversations included at least one instance of inaccurate or misleading feedback. When the questions became simpler and more direct, error rates dropped significantly to between 6% and 27%, but mistakes still occurred.

One recurring problem was that SmartTest sometimes confirmed incorrect answers before correcting them, which could confuse students rather than clarify concepts.

The Hidden Effort Behind the Scenes

Contrary to the expectation that AI tools save time, setting up SmartTest required extensive work. Educators had to engage in detailed prompt engineering and spend hours manually checking the chatbot’s output. This level of effort raises questions about the practical value of AI tutors for teaching staff already pressed for time.

Unpredictability Undermines Trust

The chatbot’s inconsistency was another major issue. Identical questions sometimes received accurate feedback and other times confusing or incorrect responses. Reliability is crucial in education, and this unpredictability poses a challenge.
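The article does not say how consistency was measured, but a basic repeat-probe harness, asking the same question several times and tallying distinct answers, might look like this sketch. The `ask_model` stub is an assumption standing in for a real chatbot call and is rigged to vary, mimicking the non-determinism described above.

```python
from collections import Counter

def ask_model(question: str, trial: int) -> str:
    """Stub for a real LLM call; a production harness would query the
    deployed tutor here. Varies by trial to mimic non-determinism."""
    return "25%" if trial % 3 else "10%"

def consistency_report(question: str, trials: int = 9) -> Counter:
    """Ask the same question repeatedly and count distinct answers.
    A reliable tutor should produce one dominant answer."""
    return Counter(ask_model(question, t) for t in range(trials))

report = consistency_report("Maximum discount for an early guilty plea?")
# Two distinct answers to one fixed question signals the kind of
# unpredictability the article reports.
```

Even a crude tally like this makes the reliability problem visible before students ever see the tool.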

To see whether newer AI models performed better, the team swapped the underlying engine from GPT-4 to GPT-4.5 (released in 2025). Surprisingly, the newer model did not consistently improve feedback quality; in some cases it was less accurate, showing that advances in the underlying model don't automatically translate into better teaching tools.

Implications for Students and Educators

Generative AI may have a place in low-stakes, formative assessments where immediate feedback can support learning. Students appreciated SmartTest’s conversational style and quick responses, with some noting it reduced anxiety and made them more comfortable admitting uncertainty.

However, the risk remains that incorrect or misleading answers could reinforce misunderstandings. When asked about their preferences, 76% of students valued having SmartTest as a practice option. Yet only 27% preferred instant AI feedback over human tutor feedback, even when the latter took days to arrive. Nearly half favored human feedback despite the wait, highlighting a trust gap.

Proceeding with Caution

These results suggest AI chatbots in education are still experimental tools. They can support learning but are not yet reliable enough to replace human educators or serve in high-stakes settings. Without careful oversight, AI feedback could do more harm than good.

For those interested in how AI can be integrated effectively into educational environments, ongoing research and development efforts are essential. Meanwhile, educators should weigh the benefits against the limitations and maintain a critical eye on AI’s role in teaching.


