Amazon drops AI usage leaderboard after employees inflate token counts to game rankings

Amazon shut down its internal AI leaderboard after employees gamed it by running pointless tasks to inflate token counts. The company now tracks working code shipped to production instead.

Amazon Kills Internal AI Leaderboard After Staff Gamed the System

Amazon has removed an internal AI usage leaderboard after employees inflated token consumption on low-value tasks, forcing the company to rethink how it measures AI adoption. The initiative, called Kirorank, ranked staff based on how often they used AI tools. Instead of driving meaningful adoption, it created perverse incentives that wasted computing resources.

Dave Treadwell, Senior Vice President at Amazon, told staff the leaderboard encouraged "tokenmaxxing" - inflating AI token consumption regardless of business value. "Please do not use AI just for the sake of using AI," he said.

Tokens are the chunks of text an AI model reads and generates, and they are the usual unit for metering usage. Each meaningless task consumes computing capacity Amazon must pay for, turning the scoreboard into overhead rather than a signal of useful work.

How the Incentive Failed

Employees reportedly assigned AI agents - autonomous bots that act on a user's behalf - to pointless tasks to climb the rankings. Some used internal tools like Kiro and MeshClaw to generate additional AI activity without shipping real code or delivering customer value.

Amazon had set targets requiring more than 80% of developers to use AI weekly. Without clear links to business outcomes, the metric invited activity that looked productive but created little value.

The stakes are high. Amazon expects to spend roughly $200 billion in capital expenditure, mostly on AI and data centre infrastructure, and rising compute costs make token waste expensive. Anthropic, whose models Amazon uses extensively, shifted from flat monthly fees to metered usage, increasing bills for heavy users.

Meta employees reportedly tried to game internal ranking tables in a similar way.

The Replacement: Code Over Tokens

Amazon now tracks "normalised deployments" - evidence that engineers regularly use AI to create useful code that ships to production. Treadwell instructed staff to focus on building better products and shipping improvements customers notice, not on burning tokens.

Other leaders are adopting the same view. Ravi Kumar S, CEO at Cognizant, called token consumption a "vanity metric," saying the company measures results over usage.

Measuring deployments rewards teams for merging AI-assisted code into production rather than running experiments that never ship. It encourages thoughtful integration of AI into the software lifecycle.
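To make the contrast concrete, here is a minimal Python sketch of how the two kinds of metric can rank the same people in opposite order. Everything in it is hypothetical: the names, the figures, and the fields `tokens_used` and `ai_assisted_deploys` are invented for illustration. The article does not say how Amazon computes "normalised deployments", so the sketch uses a plain count of shipped changes and does not attempt to reproduce Amazon's metric.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EngineerWeek:
    """One engineer's week. All names and numbers are made up."""
    name: str
    tokens_used: int          # AI tokens consumed (activity)
    ai_assisted_deploys: int  # AI-assisted changes that reached production (outcome)


def rank_by_tokens(rows: List[EngineerWeek]) -> List[str]:
    """Leaderboard-style ranking: highest token consumption first."""
    ordered = sorted(rows, key=lambda r: r.tokens_used, reverse=True)
    return [r.name for r in ordered]


def rank_by_deploys(rows: List[EngineerWeek]) -> List[str]:
    """Outcome-style ranking: most shipped AI-assisted changes first."""
    ordered = sorted(rows, key=lambda r: r.ai_assisted_deploys, reverse=True)
    return [r.name for r in ordered]


def tokens_per_deploy(row: EngineerWeek) -> Optional[float]:
    """Tokens spent per shipped change, or None when nothing shipped."""
    if row.ai_assisted_deploys == 0:
        return None
    return row.tokens_used / row.ai_assisted_deploys


week = [
    EngineerWeek("alice", tokens_used=9_000_000, ai_assisted_deploys=0),
    EngineerWeek("bob", tokens_used=400_000, ai_assisted_deploys=4),
    EngineerWeek("carol", tokens_used=1_200_000, ai_assisted_deploys=3),
]

print(rank_by_tokens(week))   # ['alice', 'carol', 'bob']
print(rank_by_deploys(week))  # ['bob', 'carol', 'alice']
```

In this invented data, the heaviest token user tops the usage ranking while shipping nothing, which is the pattern the leaderboard reportedly rewarded. A count of deployments can be gamed too, for example by splitting work into trivial changes, which is presumably why a real metric would need some form of normalisation and review.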

What This Means for Managers and HR Leaders

This is a textbook incentive-design failure. Mandate a behaviour, attach a public scoreboard, and people will deliver the number whether or not it creates value.

The fix is straightforward: measure the outcome the business actually wants. In this case, that meant working code rather than tokens burned. When incentives align with real business goals, there is far less reason to game the system.

Three principles emerge for management teams adopting new tools:

Define high-value use cases. Be specific about where AI should solve problems, not just where it can be used.
Align targets with delivery milestones. Tie adoption metrics to code shipped, features launched, or problems solved - not activity metrics.
Validate impact after deployment. Measure whether AI-assisted work actually improves speed, quality, or customer outcomes.

Clear ownership matters too. If dashboards drive behaviour, they must be approved, audited, and linked to goals that affect customers and the bottom line. Unofficial tools can drift from leadership intent, as Amazon's experience shows.

Human resources teams should note the broader lesson: adoption quality beats adoption quantity. A smaller group using AI thoughtfully on high-impact work creates more value than widespread usage on marginal tasks.

Amazon's shift from Kirorank to deployment tracking reflects a wider industry move toward outcomes-based measurement. As AI becomes infrastructure rather than novelty, the pressure to adopt matters less than the discipline to adopt well.
