Study finds top AI models struggle badly at sports analysis, scoring just 5% on post-game reasoning tasks

A new study found leading AI models, including ChatGPT and Gemini, achieved just 5% accuracy on sports broadcast analysis tasks. They're decent at describing action but fail at explaining why plays happen or predicting what comes next.

Published on: Jun 08, 2026
Study: Top AI Models Fail at Sports Analysis

Researchers at the University of North Carolina at Chapel Hill and Northeastern University tested how well leading AI systems perform at analyzing professional sports. The findings are clear: they're terrible at it.

The study examined four core capabilities - perception, reasoning, simulation, and decision-making - using a dataset called SVI-bench. The researchers compiled 35,000 hours of basketball, soccer, and hockey footage, 15 million annotated plays, 15,000 hours of professional commentary, 23,000 post-game reports, and 103,000 statistical records.

Where AI Performs Best - and Still Fails

Perception was the models' strongest area. ChatGPT, Google's Gemini, and the open-source model Qwen correctly identified which player performed which action roughly 74 percent of the time.

A professional announcer with a 74 percent accuracy rate would be fired.

Reasoning and Prediction Collapse

Performance dropped sharply when researchers asked the models to explain why plays happened the way they did. Success rates fell to around 40 percent on average.

In one test, researchers showed ChatGPT a Cody Martin three-pointer that bounced off the top of the backboard before going in. The model said the unusual part was that it was "his first made three of the game." What was actually unusual was the shot's trajectory.

Simulation tasks - predicting where a player would move based on their current trajectory - went worse. Even the best-performing model scored at roughly the level of random chance when asked to predict a player's next position. Accuracy dropped further when researchers asked the models to predict longer sequences of movement toward a goal.
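
To make "roughly the level of random chance" concrete, here is a minimal sketch. It assumes the task is scored as a multiple-choice question with a fixed number of candidate positions. The article does not describe the study's scoring format, so the function names, the four-option setup, and the sample answers are all hypothetical and are not data from the study.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that exactly match the correct labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def chance_baseline(num_options):
    """Expected accuracy of uniform random guessing among num_options choices."""
    if num_options < 1:
        raise ValueError("num_options must be at least 1")
    return 1.0 / num_options


def margin_over_chance(predictions, labels, num_options):
    """How far a model's accuracy sits above (or below) random guessing."""
    return accuracy(predictions, labels) - chance_baseline(num_options)


# Hypothetical example: four candidate next positions per question (assumed).
labels = ["left", "right", "forward", "back"]
predictions = ["left", "forward", "left", "right"]  # made-up model answers

print(accuracy(predictions, labels))               # 1 of 4 correct
print(chance_baseline(4))                          # uniform guessing over 4 options
print(margin_over_chance(predictions, labels, 4))  # 0.0 means no better than chance
```

Under this assumed setup, a margin near zero is what performing at chance level would look like: the model's answers carry no more information than a uniform guess.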

Complex Analysis Nearly Impossible

When asked to perform the work of a human broadcaster - analyzing post-game statistics and trends to draw conclusions - the models achieved just 5 percent accuracy.

Lorenzo Torresani, a computer science researcher at Northeastern and study co-author, said the gap reveals a fundamental limitation: "AI cannot tell you why things happen, and it cannot tell you what's gonna happen next."

A competent sportscaster does more than describe what's visible. They explain why a play worked, anticipate what comes next, and identify which moments matter. The research shows AI handles description reasonably well but fails at everything else.

Implications Beyond Sports

The findings extend far beyond broadcasting. Torresani noted that the same limitation appears in any job where value comes from understanding causation, making predictions, determining significance, and recommending action - rather than simply describing what's visible.

For knowledge workers concerned about AI automation, the study offers a concrete example of where AI systems hit a wall. The gap isn't narrow. It's fundamental.
