AMD Unveils MI350 Series GPUs, Helios Rack Architecture, and ROCm 7 to Challenge NVIDIA in AI Market

AMD’s new Instinct MI350 GPUs deliver up to 4× the AI compute performance of the previous generation and up to 35× faster inferencing. AMD’s Helios rack architecture and ROCm 7 software round out its push on AI infrastructure and energy efficiency.

Published on: Jun 14, 2025

AMD's Key Highlights from Advancing AI 2025

AMD made significant announcements this week at its Advancing AI 2025 event, stepping up its response to NVIDIA's dominant hold on the GPU and AI markets. The company offered a glimpse of its upcoming EPYC CPUs and Instinct GPUs, signaling strong moves to compete in AI infrastructure and software.

Introducing AMD Instinct MI350 Series GPUs

AMD unveiled the Instinct MI350 Series GPUs, which deliver up to 4 times the AI compute performance of the previous generation and up to 35 times faster inferencing. These GPUs come with 288 GB of HBM3E memory and bandwidth reaching 8 TB/s. They support both air-cooled and direct liquid-cooled setups, scaling to 64 GPUs in air-cooled racks and 128 GPUs in liquid-cooled racks.

The MI350 series can reach up to 2.6 exaFLOPS of FP4/FP6 performance within standard infrastructure, the largest generational leap AMD has made with Instinct GPUs. AMD is already developing the MI400, expected in 2026, which is designed as a rack-level solution to further raise system-level performance.
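As a rough sanity check, the per-rack figures above imply the per-GPU numbers and vice versa. This sketch uses only the quantities quoted in this article (128 liquid-cooled GPUs, 288 GB of HBM3E each, 2.6 exaFLOPS FP4/FP6 per rack); the per-GPU compute figure is derived, not an official spec:

```python
# Back-of-envelope totals for a liquid-cooled MI350 rack, using only
# the figures quoted above.
gpus_per_rack = 128          # liquid-cooled configuration
hbm_per_gpu_gb = 288         # HBM3E capacity per GPU, in GB
rack_fp4_exaflops = 2.6      # quoted peak FP4/FP6 rack performance

# Aggregate memory across the rack, and the implied per-GPU compute.
rack_memory_tb = gpus_per_rack * hbm_per_gpu_gb / 1000
per_gpu_petaflops = rack_fp4_exaflops * 1000 / gpus_per_rack

print(f"Rack HBM3E capacity: {rack_memory_tb:.1f} TB")         # ~36.9 TB
print(f"Implied FP4/FP6 per GPU: {per_gpu_petaflops:.1f} PF")  # ~20.3 PF
```

The ~20 PFLOPS of FP4/FP6 per GPU backed out here is consistent with the headline rack number, but treat it as an estimate rather than a datasheet value.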

Helios Rack Scale Architecture

AMD announced the Helios rack scale architecture, arriving next year. Helios integrates multiple next-gen AMD technologies:

  • Instinct MI400 Series GPUs: Expected to feature 432 GB HBM4 memory, 40 petaflops of FP4 compute, and 300 GB/s scale-out bandwidth. The GPUs connect using UALink, an open standard enabling up to 72 GPUs in a rack to communicate as a unified system.
  • 6th Gen EPYC “Venice” CPUs: Built on the Zen 6 architecture, these CPUs will offer up to 256 cores, 1.7 times the performance of previous generations, and 1.6 TB/s memory bandwidth.
  • AMD Pensando “Vulcano” AI NICs: Compliant with UEC 1.0 standards, supporting PCIe and UALink interfaces, 800G network throughput, and delivering 8 times the scale-out bandwidth per GPU compared to prior generations.

Software Advances with ROCm 7 and AMD Developer Cloud

One of NVIDIA's biggest advantages has been its CUDA platform, which dominates AI software development. AMD is addressing this by launching ROCm 7 and the AMD Developer Cloud, focusing on accessibility and scalability for developers.

ROCm 7 will be generally available in Q3 2025, delivering more than 3.5 times the inference performance and 3 times the training performance of ROCm 6. It supports lower-precision data types such as FP4 and FP6 and simplifies installation to a straightforward pip install. AMD is also emphasizing collaboration with open-source AI frameworks such as SGLang, vLLM, and llm-d to improve distributed inference.

Additionally, AMD introduced ROCm Enterprise AI, an MLOps platform tailored for enterprise AI workflows. It includes tools for model tuning with industry-specific data and integration with both structured and unstructured workflows, supported by partnerships developing reference applications like chatbots and document summarization.

Energy Efficiency at Rack Scale

AMD reported exceeding its "30×25" energy efficiency goal, achieving a 38-fold increase in node-level energy efficiency for AI training and HPC workloads. This translates to a 97% reduction in energy use for the same performance compared to systems from five years ago.

Looking ahead, AMD set a 2030 target to improve rack-scale energy efficiency 20-fold from 2024 levels. That could shrink the hardware needed to train an AI model that today requires more than 275 racks to less than a single rack, using 95% less electricity. Combined with software and algorithmic improvements, this may yield up to a 100-fold overall efficiency gain.
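The percentage claims follow directly from the efficiency multiples: an N-fold efficiency gain at the same performance means using 1/N of the energy. A quick check of the figures quoted above:

```python
# Relating the efficiency multiples above to the quoted energy savings:
# an N-fold efficiency gain at fixed performance uses 1/N the energy.
def energy_reduction(gain):
    return 1 - 1 / gain

print(f"38x node efficiency -> {energy_reduction(38):.0%} less energy")  # ~97%
print(f"20x rack efficiency -> {energy_reduction(20):.0%} less energy")  # 95%

# The ~100x overall figure implies roughly a further 100/20 = 5x coming
# from software and algorithmic improvements on top of the 20x hardware goal.
print(f"Implied software/algorithm factor: {100 / 20:.0f}x")
```

Both quoted savings check out: 1 − 1/38 ≈ 97% matches the node-level claim, and 1 − 1/20 = 95% matches the rack-scale target.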

Open Rack Scale AI Infrastructure

AMD's rack architecture integrates its 5th Gen EPYC CPUs, Instinct MI350 GPUs, and scale-out networking with AMD Pensando Pollara AI NICs. This setup aligns with industry standards such as the Open Compute Project and Ultra Ethernet Consortium, supporting both liquid and air-cooled configurations.

By combining these hardware components into a unified rack solution, AMD aims to deliver high-performance AI infrastructure that meets the demands of modern AI workloads.

For IT and development professionals looking to expand their skills in AI infrastructure and software, exploring courses and certifications focused on AI hardware and software platforms can be beneficial. Resources like Complete AI Training’s list of AI courses by leading companies offer practical options to stay current with industry trends.

