AI energy use: New tools show which model consumes the most electricity, and why
February 23, 2026
AI's energy bill is no longer guesswork. Open-source software and an online leaderboard from the University of Michigan now let users and developers measure the electricity different AI models consume for common tasks: chat, image and video generation, problem solving, and coding.
Teams can run the software on their own hardware to evaluate private and open-weight models. While it can't measure queries sent to proprietary services inside private data centers, it enables apples-to-apples comparisons for open-weight models where parameters are publicly available.
Why this matters
Most of AI's electricity use, roughly 80% to 90%, happens during inference, not training. As models get larger and usage grows, the strain scales with it. In 2024, U.S. data centers consumed about 4% of the nation's electricity, with demand projected to roughly double by 2030.
Despite the stakes, popular AI benchmarks don't report energy use. Instead, rough "envelope" estimates multiply a GPU's maximum power draw by the number of GPUs, which is useful for a ceiling but disconnected from how models actually run.
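To see why such a ceiling overstates real consumption, here is a minimal sketch of the envelope calculation; the GPU count, power rating, and runtime below are hypothetical, not figures from the article:

```python
def envelope_estimate_kwh(gpu_max_watts: float, num_gpus: int, hours: float) -> float:
    """Theoretical ceiling: assumes every GPU draws maximum power the whole time."""
    return gpu_max_watts * num_gpus * hours / 1000.0

# Hypothetical cluster: 8 GPUs rated at 700 W each, running for 1 hour.
ceiling = envelope_estimate_kwh(700, 8, 1.0)
print(ceiling)  # 5.6 kWh -- an upper bound, not what the workload actually drew
```

Real workloads idle between requests and rarely saturate every GPU, which is why measured numbers can sit far below this bound.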
"If you want to optimize energy efficiency and minimize environmental impact, knowing the energy requirements of the models is critical, but popular benchmarks for assessing AI ignore this aspect of performance," said Mosharaf Chowdhury, associate professor of computer science and engineering.
What the Michigan team built
The group developed open-source measurement tools and an online leaderboard that captures model-by-model energy use on real tasks. Their latest leaderboard update surfaced wide gaps in consumption: for some tasks, open-weight models differ by up to 300x in energy required.
The team has also produced tutorials to help practitioners measure and reduce energy costs, including material presented at the NeurIPS Conference.
Key finding: tokens drive the bill
A core driver of energy use is the number of generated tokens. Large language models that generate wordier outputs burn more electricity than concise ones. Reasoning-focused models also consume more because they produce longer "chains of thought," often 10-100x more tokens per request.
How a model is run matters too. Batching queries reduces total data center energy, though larger batches increase latency. Even the choice of memory allocation software can change a model's energy footprint.
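The batching trade-off can be made concrete with a toy cost model, assuming a fixed per-batch overhead that gets amortized across requests while requests wait for the whole batch; all constants are made up for illustration:

```python
def batch_tradeoff(batch_size: int,
                   fixed_wh_per_batch: float = 1.0,
                   wh_per_request: float = 0.05,
                   secs_per_request: float = 0.1):
    """Toy model: fixed batch overhead amortizes, but latency grows with batch size."""
    energy_per_request = fixed_wh_per_batch / batch_size + wh_per_request
    latency = batch_size * secs_per_request  # last request waits for the full batch
    return energy_per_request, latency

for b in (1, 8, 32):
    e, t = batch_tradeoff(b)
    print(f"batch={b:2d}  energy/req={e:.3f} Wh  latency={t:.2f} s")
```

Larger batches push energy per request down and latency up, which is why the right setting depends on the latency budget rather than energy alone.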
"There are many ways to deploy AI and translate what the model wants to do into computations on the hardware," said Jae-Won Chung, doctoral student and first author. "Our tool can automate the search through that parameter space and find the most efficient set of parameters based on the user's needs."
From estimates to measurements
"A lot of people are concerned about AI's growing energy use, which is fair," said Chowdhury. "However, many who worry can be overly pessimistic, and those who want more data centers are often overly optimistic. The reality is not black and white, and there's a lot we don't know because nobody is making direct measurements of AI power use available. Our tool can provide more accurate data for better decision-making."
In other words: move away from theoretical ceilings and measure what your stack actually does under load.
Practical steps for researchers and developers
- Measure on your hardware: Use energy measurement tools to profile your models on representative workloads. Capture per-request kWh across chatting, coding, and generation tasks.
- Control tokens: Cap max tokens and prefer concise decoding strategies where feasible. If your use case doesn't need long reasoning traces, limit chain-of-thought generation.
- Tune batching intentionally: Increase batch size to reduce energy per request when latency budgets allow. Validate the throughput/latency/energy trade-off with real traffic.
- Test deployment parameters: Evaluate different memory allocators and runtime settings. Small scheduling and memory decisions can shift energy use meaningfully.
- Report energy alongside accuracy: When benchmarking, include energy per task and per token. Track changes over time as models, prompts, and configs evolve.
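The first and last steps above can be sketched as a minimal power-trace integrator and a per-token report. In practice the samples would come from a hardware counter (for example, NVML's GPU power readings); here the trace and token count are hypothetical:

```python
def energy_wh(power_samples_w: list[float], interval_s: float) -> float:
    """Integrate evenly spaced power samples (watts) into watt-hours."""
    joules = sum(power_samples_w) * interval_s  # rectangle rule
    return joules / 3600.0

# Hypothetical power trace sampled every 0.5 s during one batch of requests.
trace = [250, 310, 305, 298, 260]
per_batch = energy_wh(trace, 0.5)

tokens_generated = 1200  # hypothetical token count for the batch
print(f"{per_batch:.4f} Wh/batch, "
      f"{per_batch / tokens_generated * 1000:.4f} mWh/token")
```

Logging the per-token figure next to accuracy for each config change makes regressions in energy as visible as regressions in quality.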
What we still don't know
The picture remains incomplete for proprietary models running in private data centers, where direct energy data isn't reported. As demand grows, better transparency and standardized reporting will help calibrate policy, procurement, and infrastructure planning.
People and place
The work is led by Mosharaf Chowdhury and his team, including Jae-Won Chung, at U-M's Michigan Academic Computing Center, a two-megawatt facility used for academic research in Ann Arbor, Michigan.
Funding and support
The project received partial support from the National Science Foundation, with additional grants and gifts from VMware, the Mozilla Foundation, Cisco, Ford, GitHub, Salesforce, Google, and the Kwanjeong Educational Foundation.