AI can explain finance, but it stumbles on the math, SQU study finds
Artificial intelligence can clarify financial concepts, but it still falls short on numbers. A new study led by Sultan Qaboos University (with LUT Business School in Finland) tested whether tools like ChatGPT-4 can match student performance in corporate finance. The short answer: useful for theory, unreliable for calculations.
The experiment
Researchers pulled 60 multiple-choice questions from a standard corporate finance textbook, covering net present value, internal rate of return, time value of money, and financial ratios. The questions came from actual exams sat by 67 undergraduates at SQU. The same questions were posed to ChatGPT-4 in three separate runs to check consistency.
Results: strong on theory, weak on calculations
- Descriptive, theory-focused questions: ChatGPT averaged 87%; students averaged 75%.
- Calculation-based questions: ChatGPT scored 32%; students averaged 82%.
The gap wasn't random. The model often misapplied formulas, made arithmetic errors, or misread the structure of a problem. As the lead researcher put it, finance "requires logical sequencing, judgment and precision," which remain largely human strengths.
Why this matters for finance teams
Our work mixes narrative with math. You can let an AI draft a memo on WACC or explain IRR, but if it botches a cash flow schedule or mis-keys a rate, the decision that follows is exposed. Capital budgeting, valuation, and credit work all depend on clean logic, clean data, and clean math.
Use AI to speed comprehension and communication. Keep the numbers in systems you can audit.
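Keeping the numbers in auditable code can be as simple as a few lines you control end to end. Here is a minimal Python sketch of NPV and IRR; the project cash flows and the 10% discount rate are hypothetical, and the bisection-based IRR assumes a conventional project with a single sign change in its cash flows.

```python
def npv(rate, cash_flows):
    """Discount a list of cash flows (t = 0, 1, 2, ...) at a flat rate."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-9):
    """Find the rate where NPV crosses zero via bisection.

    Assumes exactly one sign change in NPV over [lo, hi]; projects with
    non-conventional cash flow patterns can have multiple IRRs.
    """
    f_lo = npv(lo, cash_flows)
    for _ in range(200):
        mid = (lo + hi) / 2
        f_mid = npv(mid, cash_flows)
        if abs(f_mid) < tol:
            return mid
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return (lo + hi) / 2

# Hypothetical project: a 1,000 outlay followed by four inflows of 350.
flows = [-1000, 350, 350, 350, 350]
print(round(npv(0.10, flows), 2))  # NPV at a 10% discount rate
print(round(irr(flows), 4))        # internal rate of return
```

Ask the AI to outline the method if you like, but run the arithmetic in code like this, where every figure can be traced and re-run.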
Practical ways to use AI without tripping over the math
- Use it for quick refreshers, policy drafts, meeting notes, and first-pass commentary on ratios or trends.
- Do calculations in spreadsheets, Python, or R. Ask AI for a method outline, then implement the math in your model.
- Feed structured inputs (cash flows, timing, discount rates, assumptions). Require step-by-step working and reconcile outputs to your templates.
- Validate with independent checks: unit consistency, sign tests, order-of-magnitude sanity checks, and edge cases.
- Set thresholds for manual review on high-impact items (capex approvals, covenant headroom, valuation memos). Log prompts and outputs for audit trails.
- Train teams on common AI error patterns: formula drift, rounding cascades, misinterpreted problem framing.
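The independent checks above (sign tests, unit consistency, order-of-magnitude sanity, edge cases) can be sketched as plain assertions before any AI-assisted figure reaches a model. The function below is illustrative, not a standard library routine, and every threshold in it is a hypothetical default you would tune to your own templates.

```python
def check_cash_flow_schedule(flows, rate):
    """Run independent sanity checks on a cash flow schedule and discount rate.

    Returns a list of error strings; an empty list means the inputs passed.
    All thresholds are illustrative and should be tuned to your own models.
    """
    errors = []
    # Edge case: an empty or single-entry schedule can't support NPV/IRR.
    if len(flows) < 2:
        errors.append("need at least an outlay and one subsequent cash flow")
        return errors
    # Sign test: a conventional project starts with an outflow at t=0.
    if flows[0] >= 0:
        errors.append("expected an initial outflow at t=0")
    # Unit consistency: a rate of 8 almost certainly means 8% mis-keyed as 8.
    if not 0 < rate < 1:
        errors.append(f"rate {rate} outside (0, 1); was a percentage passed as-is?")
    # Order-of-magnitude check: inflows dwarfing the outlay deserve review.
    outlay = abs(flows[0]) or 1
    if any(abs(cf) > 100 * outlay for cf in flows[1:]):
        errors.append("a cash flow exceeds 100x the initial outlay")
    return errors

print(check_cash_flow_schedule([-1000, 350, 350, 350, 350], 0.10))  # []
print(check_cash_flow_schedule([1000, 350], 8))
```

Checks like these catch exactly the error patterns the study observed: mis-keyed rates, inverted signs, and misread problem structure.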
Implications for education and hiring
The research team urged a stronger focus on critical thinking, reasoning, and context. Case work, flipped classrooms, and applied learning help preserve those skills. Soft skills such as communication, teamwork, and problem-solving matter too, and AI can't replicate them.
For hiring, prioritize case interviews with real numbers. Ask candidates to show their working, defend assumptions, and explain trade-offs. You're testing sequencing, judgment, and precision, not just the final answer.
What's next
Models will improve, including on numerical reasoning, but capability varies and drifts over time. The researchers called for ongoing comparisons across models to track progress. For now, the takeaway is simple: AI is a helpful aid; genuine learning-and sound financial decisions-still rely on human effort, judgment, and understanding.
Sources and further reading
Journal of Educational Technology Systems
Explore AI options for finance
If you're evaluating tools for your stack, here's a curated list: AI tools for finance.