KAYTUS introduces KSManage Ultra platform for AI data centers

KAYTUS launched KSManage Ultra to unify AI data center compute, power, and cooling. The platform cuts rack deployment time from 50 minutes to under 3 minutes.

Published on: Jun 30, 2026

KAYTUS introduced KSManage Ultra, an AI infrastructure management platform for large-scale data centers, at ISC 2026 in Frankfurt. The platform brings compute, networking, power, and liquid cooling under a single management system. It is a response to the operational challenges that arise when AI racks pack hundreds of GPUs alongside complex power and thermal infrastructure.

In high-density AI environments, performance issues rarely trace back to one defective part. They emerge from interactions between hardware, software, networking, and cooling. Traditional server management tools, designed for simpler setups, miss these interdependencies. KSManage Ultra is built to correlate data across all these layers.

Unified visibility across hardware and cooling

The platform provides centralized management of GPUs, CPUs, memory, networking hardware, power distribution units, cooling distribution units, racks, and clusters. It combines in-band and out-of-band management in a single interface. By linking operating system and application data with hardware health, power consumption, temperature readings, and liquid cooling status, it lets operators spot problems before they affect AI workloads.

For liquid cooling, the system monitors for leaks at multiple levels. When a leak is detected, it can automatically shut down affected nodes, isolate the equipment, and send alerts. This automated response reduces the time between fault detection and containment in environments where a single leak can damage expensive GPU clusters.
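The detect, shut down, isolate, and alert sequence described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not KSManage Ultra's implementation or API: the Node and LeakEvent classes, the handle_leak function, and the "node"/"rack"/"cdu" level names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    powered_on: bool = True
    isolated: bool = False


@dataclass
class LeakEvent:
    level: str       # hypothetical detection level: "node", "rack", or "cdu"
    affected: list   # names of the nodes exposed to the leak


def handle_leak(event, nodes, alerts):
    """Shut down and isolate every known affected node, then record one alert."""
    handled = []
    for name in event.affected:
        node = nodes.get(name)
        if node is None:
            continue  # unknown node name: nothing to contain
        node.powered_on = False  # step 1: shut the node down
        node.isolated = True     # step 2: keep it out of service
        handled.append(name)
    # step 3: notify operators with a single summary message
    alerts.append(
        f"leak at {event.level} level; contained nodes: {', '.join(handled)}"
    )
    return handled


# Usage with made-up inventory: a rack-level leak touching two of three nodes.
nodes = {n: Node(n) for n in ("n1", "n2", "n3")}
alerts = []
handled = handle_leak(LeakEvent("rack", ["n1", "n2"]), nodes, alerts)
```

Containing the nodes first and alerting afterwards mirrors the article's point: the response does not wait on a human to read the alert.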

Faster deployment and automated health checks

KAYTUS says KSManage Ultra shortens rack deployment dramatically. Using batch scanning and automatic topology mapping, the onboarding process for a rack can shrink from around 50 minutes to under three minutes. Template-based workflows automate stress testing, driver installation, hardware configuration, and software deployment. This speed lets teams provision compute capacity faster as AI projects scale.

The platform continuously monitors compute resource health. It evaluates GPUs, memory, PCIe links, networking consistency, firmware versions, cooling, and power infrastructure. Nodes flagged as faulty or high-risk are isolated before they are assigned to training or inference workloads. The intended result is higher resource utilization and better system availability, both of which bear directly on operating cost in AI data centers.
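A pre-scheduling health gate of this kind might look roughly like the sketch below. The report fields, the firmware string, and the 45 °C coolant threshold are assumptions made for illustration; KAYTUS has not published the checks or limits the product applies.

```python
def is_schedulable(report, expected_firmware, max_coolant_c=45.0):
    """Return True only if every check in the (hypothetical) report passes."""
    checks = [
        report["gpu_ok"],
        report["memory_ok"],
        report["pcie_ok"],
        report["firmware"] == expected_firmware,   # firmware consistency
        report["coolant_temp_c"] <= max_coolant_c, # assumed thermal limit
    ]
    return all(checks)


def partition_nodes(reports, expected_firmware):
    """Split nodes into those ready for workloads and those to isolate."""
    ready, quarantined = [], []
    for name, report in reports.items():
        if is_schedulable(report, expected_firmware):
            ready.append(name)
        else:
            quarantined.append(name)
    return ready, quarantined


# Usage with made-up reports: one healthy node, one with stale firmware.
reports = {
    "node-a": {"gpu_ok": True, "memory_ok": True, "pcie_ok": True,
               "firmware": "1.2.0", "coolant_temp_c": 38.5},
    "node-b": {"gpu_ok": True, "memory_ok": True, "pcie_ok": True,
               "firmware": "1.1.9", "coolant_temp_c": 37.0},
}
ready, quarantined = partition_nodes(reports, expected_firmware="1.2.0")
```

Only the nodes in ready would be offered to a scheduler; the others stay out of the pool until they pass again.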

Open architecture for heterogeneous environments

KSManage Ultra features an open architecture with APIs that integrate with scheduling platforms, configuration management databases, servers, networking equipment, and cooling systems. KAYTUS designed the platform to provide unified management across mixed-vendor infrastructure, which the company calls AI Factories. The API-first approach means organizations can connect the platform to existing operational tools rather than replacing them.

Why this matters for management

Deploying a rack in under three minutes instead of around 50 shortens the time it takes to bring new capacity online. When a new GPU cluster is needed for a large training run, the difference between waiting nearly an hour and waiting a few minutes compounds across dozens of racks. Automated node isolation prevents faulty hardware from silently degrading model training jobs, which can waste days of compute time. And unified visibility into power, cooling, and network health gives managers the data they need to make capacity decisions without logging into several separate tools.
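To put rough numbers on how the saving compounds, the calculation below uses the figures KAYTUS cites (about 50 minutes before, 3 minutes as the upper bound after). The rack count of 48 is a made-up example, and the sketch assumes racks are onboarded one after another rather than in parallel.

```python
def onboarding_minutes(racks, minutes_per_rack):
    """Total onboarding time when racks are processed sequentially."""
    return racks * minutes_per_rack


racks = 48                                # hypothetical cluster size
before = onboarding_minutes(racks, 50)    # 2400 minutes
after = onboarding_minutes(racks, 3)      # 144 minutes
saved_hours = (before - after) / 60       # 37.6 hours
```

Under these assumptions the onboarding step drops from 40 hours to under two and a half. Parallel onboarding would shrink both totals, but the ratio between them would stay the same.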

For managers overseeing AI infrastructure, the platform's correlation of hardware telemetry with application performance means fewer escalations to specialist teams and fewer reactive firefights when something breaks. As AI infrastructure scales, these operational efficiencies become harder to achieve without a management system that spans the full stack.

