At Supercomputing 2024 (SC24), Enfabrica Corporation unveiled a milestone in AI data center networking: the Accelerated Compute Fabric (ACF) SuperNIC chip. This 3.2 terabit-per-second (Tbps) network interface card (NIC) SoC targets large-scale AI and machine learning (ML) operations, supporting clusters of over 500,000 GPUs. Enfabrica also raised $115 million in funding and is expected to release its ACF SuperNIC chip in Q1 2025.
Addressing AI Networking Challenges
As AI models grow increasingly large and complex, data centers face mounting pressure to connect large numbers of specialized processing units, such as GPUs. These GPUs are crucial for high-speed computation in training and inference but are often left idle due to inefficient data movement across existing network architectures. The challenge lies in interconnecting thousands of GPUs so that data moves between them without bottlenecks or performance degradation.
Traditional networking approaches can link approximately 100,000 AI computing chips in a data center before inefficiencies and slowdowns become significant. According to Enfabrica's CEO, Rochan Sankar, the company's new technology supports up to 500,000 chips in a single AI/ML system, enabling larger and more reliable AI model computations. By overcoming the limitations of conventional NIC designs, Enfabrica's ACF SuperNIC aims to maximize GPU utilization and minimize downtime.
Key Innovations in the ACF SuperNIC
The ACF SuperNIC offers several industry-first features tailored to modern AI data center needs:
- High-Bandwidth, Multi-Port Connectivity: The ACF SuperNIC delivers multi-port 800-Gigabit Ethernet to GPU servers, quadrupling the bandwidth compared to other GPU-attached NICs. This setup increases throughput and improves multipath resiliency, supporting robust communication across AI clusters.
- Efficient Two-Tier Network Design: With a high-radix configuration of 32 network ports and up to 160 PCIe lanes, the ACF SuperNIC simplifies the overall architecture of AI data centers. Operators can build massive clusters using fewer network tiers, reducing latency and improving data transfer efficiency across GPUs.
- Scaling Up and Scaling Out: With its high radix, high bandwidth, and concurrent PCIe/Ethernet multipathing and data mover capabilities, the Enfabrica ACF SuperNIC can scale up and scale out four to eight latest-generation GPUs per server system. This increases the performance, scale, and resiliency of AI clusters while improving resource utilization and network efficiency.
- Integrated PCIe Interface: The chip supports 128 to 160 PCIe lanes, delivering speeds over 5 Tbps. This design allows multiple GPUs to connect to a single CPU while maintaining high-speed communication with data center spine switches. The result is a more efficient and flexible architecture for large-scale AI workloads.
- Resilient Message Multipathing (RMM): Enfabrica's proprietary RMM technology improves the reliability of AI clusters. By mitigating the impact of network link failures or flaps, RMM prevents job stalls and keeps AI training runs moving. Sankar notes the importance of this feature, especially in large setups where failures of links to switches become frequent.
- Software-Defined RDMA Networking: This feature gives data center operators full-stack programmability and debuggability, bringing the benefits of software-defined networking (SDN) to Remote Direct Memory Access (RDMA) deployments. It allows customization of the transport layer, which can optimize cloud-scale network topologies without sacrificing performance.
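The headline figures in the list above can be cross-checked with simple arithmetic. The Python sketch below is only a back-of-the-envelope check: it assumes PCIe 5.0 signaling (32 GT/s per lane, per direction), which the article does not state, and the function names are illustrative rather than part of any Enfabrica software.

```python
# Back-of-the-envelope check of the bandwidth figures quoted above.
# Assumption (not stated in the article): lanes run at PCIe 5.0 rates,
# i.e. 32 GT/s per lane per direction, before 128b/130b encoding overhead.
PCIE5_GT_PER_LANE = 32


def pcie_raw_tbps(lanes: int, gt_per_lane: int = PCIE5_GT_PER_LANE) -> float:
    """Raw one-direction PCIe signaling rate in Tbps for a given lane count."""
    return lanes * gt_per_lane / 1000


def ports_at_speed(total_gbps: int, port_gbps: int) -> int:
    """Number of Ethernet ports of a given speed that fit in the aggregate bandwidth."""
    return total_gbps // port_gbps


if __name__ == "__main__":
    for lanes in (128, 160):
        print(f"{lanes} lanes -> {pcie_raw_tbps(lanes):.3f} Tbps raw")
    print("800GbE ports in 3.2 Tbps:", ports_at_speed(3200, 800))
    print("100GbE ports in 3.2 Tbps:", ports_at_speed(3200, 100))
```

Under that assumption, only the 160-lane configuration clears 5 Tbps of raw signaling (5.12 Tbps, against roughly 4.1 Tbps for 128 lanes). The 3.2 Tbps of network bandwidth corresponds to four 800GbE ports or thirty-two 100GbE ports, and the latter would be consistent with the 32-port figure quoted above.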
Enhanced Resiliency and Efficiency
Traditional systems often require one-to-one connections between GPUs and various components, such as PCIe switches and RDMA NICs. However, as the number of GPUs in a system increases, so does the likelihood that a link to a switch fails, with potential disruptions occurring as often as every 23 minutes in setups with over 100,000 GPUs, according to Sankar.
The ACF SuperNIC addresses this issue by enabling multiple connections from GPUs to switches. This redundancy minimizes the impact of individual component failures, improving system uptime and reliability.
The SuperNIC also introduces the Collective Memory Zoning feature, which supports zero-copy data transfers and optimizes host memory management. By reducing latency and improving memory efficiency, this technology maximizes the floating-point operations per second (FLOPs) utilization of GPU server fleets.
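To see why failure frequency grows with cluster size, and why redundant paths help, here is a minimal Python sketch. It rests on simplifying assumptions that are not from the article: link failures are independent, occur at a constant rate, and each GPU has one link in the baseline case. The per-link unavailability used in the last function is a hypothetical input, not a measured value.

```python
# Simple failure-scaling model; all modeling assumptions are illustrative.
MINUTES_PER_YEAR = 60 * 24 * 365


def cluster_failure_interval(ref_interval_min: float, ref_gpus: int, gpus: int) -> float:
    """Mean minutes between link failures, scaled from a reference cluster.

    With independent, constant-rate failures, the cluster-wide failure rate
    grows linearly with link count, so the interval shrinks as 1/N.
    """
    return ref_interval_min * ref_gpus / gpus


def implied_link_mtbf_years(interval_min: float, gpus: int) -> float:
    """Per-link mean time between failures implied by a cluster-wide interval."""
    return interval_min * gpus / MINUTES_PER_YEAR


def all_paths_down_probability(link_down_prob: float, paths: int) -> float:
    """Chance that every one of `paths` independent links is down at once."""
    return link_down_prob ** paths


if __name__ == "__main__":
    print(f"{cluster_failure_interval(23, 100_000, 500_000):.1f} min between failures at 500,000 GPUs")
    print(f"{implied_link_mtbf_years(23, 100_000):.1f} years implied per-link MTBF")
    # Hypothetical 1% per-link unavailability, compared across path counts.
    for paths in (1, 2, 4):
        print(paths, "path(s):", all_paths_down_probability(0.01, paths))
```

Under this model, the quoted 23-minute interval at 100,000 GPUs implies a per-link mean time between failures of a little over four years, and the same links would produce a failure roughly every 4.6 minutes at 500,000 GPUs. Each additional independent path multiplies the chance of a GPU being fully cut off by the per-link unavailability.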
Scalability and Operational Advantages
The ACF SuperNIC's design is not only about scale but also about operational efficiency. It provides a software stack that integrates with standard communication libraries, existing interfaces, and RDMA networking operations. This compatibility supports efficient deployment across diverse AI compute environments composed of GPUs and accelerators (AI chips) from different vendors. Data center operators benefit from streamlined networking infrastructure, which reduces complexity and increases the flexibility of their AI data centers.
Availability and Future Prospects
Enfabrica's ACF SuperNIC will be available in limited quantities in Q1 2025, with both the chips and pilot systems now open for orders through Enfabrica and selected partners. As AI models demand higher performance and larger scale, Enfabrica's approach could play a pivotal role in shaping the next generation of AI data centers designed to support frontier AI models.