Powered by NVIDIA Tesla V100 GPUs and NVSwitch


We are at the dawn of a new age of intelligence, where AI and high performance computing (HPC) are transforming our world. From autonomous vehicles to global climate simulations, new challenges are emerging that demand enormous computing resources to solve. NVIDIA HGX-2 is designed for multi-precision computing—which combines the power of high-precision scientific computing with the speed of lower-precision AI computing—to provide a single flexible and powerful platform to solve these massive challenges.

Enables “the World’s Largest GPU”

Accelerated by 16 NVIDIA® Tesla® V100 GPUs and NVIDIA NVSwitch, HGX-2 has the unprecedented compute power, bandwidth, and memory topology to train today's largest AI models faster and more efficiently. The 16 Tesla V100 GPUs work as a single unified 2-petaFLOP accelerator with half a terabyte (TB) of total GPU memory, allowing it to handle the most computationally intensive workloads and act as “the world’s largest GPU.”
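The aggregate figures follow directly from the per-GPU Tesla V100 specs: each V100 delivers roughly 125 teraFLOPS of tensor-core throughput and carries 32GB of HBM2 (the 32GB SXM variant). A quick sanity check, assuming those per-GPU numbers:

```python
# Sanity-check the HGX-2 aggregate figures from per-GPU Tesla V100 specs.
# Assumes 125 tensor TFLOPS and 32 GB HBM2 per V100 (32 GB variant).
NUM_GPUS = 16
TENSOR_TFLOPS_PER_GPU = 125   # tensor-core mixed-precision throughput
HBM2_GB_PER_GPU = 32          # high-bandwidth memory per GPU

total_tensor_pflops = NUM_GPUS * TENSOR_TFLOPS_PER_GPU / 1000
total_memory_gb = NUM_GPUS * HBM2_GB_PER_GPU

print(total_tensor_pflops)  # 2.0 petaFLOPS
print(total_memory_gb)      # 512 GB, i.e. half a terabyte
```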

AI Training: HGX-2 Replaces 300 CPU-Only Server Nodes

Driving Next-Generation AI to Faster Performance

AI models are exploding in complexity and demand large memory, multiple GPUs, and extremely fast connections between those GPUs. With NVSwitch connecting all GPUs to a unified pool of memory, HGX-2 provides the power to handle these new models for faster training of advanced AI. A single HGX-2 replaces 300 CPU-powered servers, saving significant cost, space, and energy in the data center.
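Why interconnect speed dominates multi-GPU training can be seen in the standard cost model for gradient synchronization: a ring all-reduce makes each GPU send and receive 2(N−1)/N times the gradient size per step, so step time is bounded by per-GPU link bandwidth. A back-of-the-envelope sketch (the all-reduce volume formula is standard; the ~150GB/s per-direction figure assumes the V100's six 25GB/s NVLink connections):

```python
# Rough time to all-reduce gradients across N GPUs with a ring all-reduce.
# Standard result: each GPU sends and receives 2*(N-1)/N * S bytes.
# Assumes ~150 GB/s usable per-GPU bandwidth in each direction
# (6 NVLink connections x 25 GB/s on Tesla V100); real throughput varies.
def allreduce_seconds(model_bytes: float, num_gpus: int, gbps: float) -> float:
    volume = 2 * (num_gpus - 1) / num_gpus * model_bytes
    return volume / (gbps * 1e9)

# Example: 1 billion FP16 parameters = 2 GB of gradients
t = allreduce_seconds(2e9, num_gpus=16, gbps=150)
print(f"{t * 1000:.2f} ms per synchronization step")  # 25.00 ms
```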

HPC: HGX-2 Replaces 60 CPU-Only Server Nodes

The Highest-Performing HPC Supernode

HPC applications require powerful server nodes with the computing power to perform a massive number of calculations per second. Increasing the compute density of each node dramatically reduces the number of servers required, resulting in huge savings in cost, power, and space consumed in the data center. For HPC simulations, high-dimension matrix multiplication requires a processor to fetch data from many neighbors to facilitate computation, making GPUs connected by NVSwitch ideal. A single HGX-2 server replaces 60 CPU-only servers.
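The data-exchange pattern behind that claim is easy to see in a blocked matrix multiply: each output block needs a full row of blocks from A and a full column of blocks from B, so a GPU that owns one block must fetch blocks held by its neighbors, which is exactly the all-to-all traffic NVSwitch serves. A minimal pure-Python sketch of the blocked decomposition (illustrative only; real codes use cuBLAS-class libraries):

```python
# Blocked matrix multiply: C[i][j] = sum over k of A[i][k] @ B[k][j].
# If each (i, j) block lives on a different GPU, computing C[i][j]
# requires fetching A- and B-blocks owned by other GPUs.
def matmul(a, b):
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(len(x[0]))]
            for i in range(len(x))]

# A 2x2 grid of 2x2 blocks: C[i][j] = A[i][0] @ B[0][j] + A[i][1] @ B[1][j]
A = [[[[1, 0], [0, 1]], [[2, 0], [0, 2]]],
     [[[0, 1], [1, 0]], [[1, 1], [1, 1]]]]
B = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
     [[[1, 0], [0, 1]], [[1, 1], [0, 1]]]]
C = [[add(matmul(A[i][0], B[0][j]), matmul(A[i][1], B[1][j]))
      for j in range(2)] for i in range(2)]
print(C[0][0])  # each output block pulls data from two different owners
```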

NVSwitch for Full-Bandwidth Computing

NVSwitch enables every GPU to communicate with every other GPU at a full bisection bandwidth of 2.4TB/s to solve the largest AI and HPC problems. Every GPU has full access to the 0.5TB of aggregate HBM2 memory to handle the most massive datasets. By enabling a unified server node, NVSwitch dramatically accelerates complex AI and HPC applications.
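The 2.4TB/s figure follows from the per-GPU link budget: each Tesla V100 drives six NVLink connections at 50GB/s bidirectional each (300GB/s per GPU), and the worst-case bisection splits the 16 GPUs into two groups of eight. A quick check, assuming those NVLink 2.0 per-link numbers:

```python
# Derive the NVSwitch bisection bandwidth from per-link NVLink specs.
# Assumes NVLink 2.0: 6 links per V100, 50 GB/s bidirectional per link.
LINKS_PER_GPU = 6
GB_PER_LINK = 50                            # bidirectional, per link

per_gpu_gbps = LINKS_PER_GPU * GB_PER_LINK  # 300 GB/s per GPU
bisection_tbps = 8 * per_gpu_gbps / 1000    # 8 GPUs cross the bisection cut
print(per_gpu_gbps, "GB/s per GPU;", bisection_tbps, "TB/s bisection")
```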



                        HGX-1                               HGX-2
Performance             1 petaFLOP tensor operations        2 petaFLOPS tensor operations
                        125 teraFLOPS single-precision      250 teraFLOPS single-precision
                        62 teraFLOPS double-precision       125 teraFLOPS double-precision
GPUs                    8x NVIDIA Tesla V100                16x NVIDIA Tesla V100
GPU Memory              256GB total                         512GB total
NVIDIA CUDA® Cores      40,960                              81,920
NVIDIA Tensor Cores     5,120                               10,240
Communication Channel   Hybrid cube mesh powered by NVLink  NVSwitch powered by NVLink
                        300GB/s bisection bandwidth         2.4TB/s bisection bandwidth

HGX-1 Reference Architecture

Powered by NVIDIA Tesla GPUs and NVLink

NVIDIA HGX-1 is a reference architecture that standardized the design of data centers accelerating AI in the cloud. Based on eight NVIDIA Tesla V100 GPUs in the SXM2 form factor and a hybrid cube mesh topology for scalability, HGX-1 delivers 1 petaFLOP of compute power; its modular design works seamlessly in hyperscale data centers and offers a quick, simple path to AI.

Empowering the Data Center Ecosystem

NVIDIA partners with the world’s leading manufacturers to rapidly advance AI cloud computing. NVIDIA provides HGX-2 GPU baseboards, design guidelines, and early access to GPU computing technologies for partners to integrate into servers and deliver at scale to their data center ecosystems.

