
AMD, OpenAI and Microsoft Introduce New AI Networking Protocol to Power Massive GPU Clusters

Rebecca PY

AMD, alongside OpenAI, Microsoft and other industry players, has contributed the Multipath Reliable Connection (MRC) protocol to the Open Compute Project to improve AI networking performance at scale. The new protocol is designed to enhance reliability, reduce congestion, and maximize GPU utilization for large-scale AI training environments.


MALAYSIA, 8 MAY 2026 – AMD has announced a major step forward in AI infrastructure networking with the introduction of Multipath Reliable Connection (MRC), a new open networking protocol developed in collaboration with OpenAI, Microsoft, Broadcom, Intel and other industry leaders.

Contributed to the Open Compute Project (OCP), MRC is designed to address one of the biggest bottlenecks in scaling advanced AI systems — the network infrastructure connecting hundreds of thousands of GPUs working simultaneously in large AI clusters.

As AI models become increasingly complex, synchronized communication between GPUs has become critical. Traditional single-path networking approaches often struggle under the demands of large-scale AI training, where even minor disruptions or latency spikes can impact overall system performance.

MRC introduces a multi-path networking approach that distributes data packets across several paths simultaneously instead of relying on a single route. This helps reduce congestion, minimize latency variation, and improve resilience by enabling near real-time rerouting when failures occur.
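To make the multi-path idea concrete, here is a minimal Python sketch of spraying packets across several network paths and rerouting around a failed one. This is an illustration of the general concept only, not the MRC specification; the Path and MultipathConnection classes, the round-robin policy, and the "spine" path names are invented for this example.

```python
# Illustrative sketch of multipath packet spraying with rerouting.
# NOT the MRC protocol -- class names, policy, and path names are hypothetical.
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Path:
    name: str
    healthy: bool = True
    sent: int = 0  # packets carried so far


@dataclass
class MultipathConnection:
    paths: list
    _rr: cycle = field(init=False)

    def __post_init__(self):
        self._rr = cycle(self.paths)

    def send(self, packet_id: int) -> str:
        """Spray each packet onto the next healthy path (round-robin)."""
        for _ in range(len(self.paths)):
            path = next(self._rr)
            if path.healthy:
                path.sent += 1
                return f"packet {packet_id} -> {path.name}"
        raise RuntimeError("no healthy paths available")

    def mark_failed(self, name: str) -> None:
        """Simulate near real-time rerouting: later packets skip this path."""
        for path in self.paths:
            if path.name == name:
                path.healthy = False


# Usage: traffic keeps flowing after "spine-2" fails, just over fewer paths.
conn = MultipathConnection([Path("spine-1"), Path("spine-2"), Path("spine-3")])
for i in range(3):
    print(conn.send(i))
conn.mark_failed("spine-2")
for i in range(3, 6):
    print(conn.send(i))
```

In this toy model, losing one path simply shifts traffic onto the remaining healthy ones, which is the behavior the protocol's "shock absorber" framing describes at data-center scale.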

According to AMD, the protocol effectively transforms the network into a “shock absorber” for AI infrastructure, allowing workloads to continue operating efficiently even during disruptions.

The company said performance in AI infrastructure is no longer measured solely by peak bandwidth, but by how consistently systems can maintain productive GPU utilization under real-world conditions.

AMD also revealed that it played a key role in shaping the MRC specification and contributed advanced congestion control technologies to improve network performance. The company has already implemented and tested MRC at scale with a leading cloud provider using its AMD Pensando Pollara 400 AI NIC technology.

Krishna Doddapaneni, Corporate Vice President of Engineering at AMD’s Networking Technology Solutions Group, said networking, rather than compute alone, has become the primary challenge in scaling AI infrastructure.

AMD added that the programmability of its networking solutions enabled early validation of MRC and positions the company to support future transitions to its upcoming AMD Pensando Vulcano 800G AI NIC platform.

By contributing MRC as an open standard, AMD and its partners aim to create a more programmable, resilient, and production-ready networking foundation for the next generation of AI infrastructure across cloud, enterprise, research, and sovereign AI deployments.
