NVIDIA Rubin R100 (H300) GPU
288 GB HBM4 · 22 TB/s · 50 PFLOPS FP4
The Rubin R100 is the generational successor to B300, built for trillion-parameter inference at FP4 precision and multi-node training runs where Blackwell memory bandwidth becomes the bottleneck.
No pricing published yet. Cloud providers start shipping R100 (also called H300) in H2 2026. Register your interest to be notified first when Spheron capacity opens.
Register Interest
Join the pipeline for R100 access. We'll reach out with pricing and availability as soon as capacity goes live.
The NVIDIA Rubin R100 (also branded H300 by cloud providers) is the generational successor to the B300 Blackwell Ultra. It ships with 288GB HBM4 at up to 22 TB/s bandwidth (2.75x B300), 50 PFLOPS FP4 compute (3.33x B300), NVLink 6 at 3.6 TB/s per GPU, and ConnectX-9 networking. First cloud availability is H2 2026 for AWS, Google Cloud, Azure, and specialist providers. Spheron is onboarding R100 capacity and will contact registered teams with pricing as soon as it is confirmed. For workloads running today, B300 and B200 are available now.
NVIDIA GPU generation roadmap
Where R100 sits in the NVIDIA GPU generation stack. All generations prior to Rubin are available on Spheron today.
R100 GPU specifications
R100 specs sourced from NVIDIA GTC 2025 roadmap and confirmed at CES 2026 and GTC 2026. Memory bandwidth (up to 22 TB/s), NVLink 6, and ConnectX-9 are officially confirmed. FP8 throughput (~16,000 TFLOPS) is derived from NVIDIA's NVL72 system-level spec. Final per-GPU specs will be available when cloud providers ship production systems in H2 2026.
R100 vs B300 vs B200 vs H100
| Spec | R100 | B300 | B200 | H100 |
|---|---|---|---|---|
| Architecture | Rubin | Blackwell Ultra | Blackwell | Hopper |
| VRAM | 288 GB HBM4 | 288 GB HBM3e | 192 GB HBM3e | 80 GB HBM3 |
| Memory Bandwidth | Up to 22 TB/s | 8 TB/s | 8 TB/s | 3.35 TB/s |
| FP4 Compute | 50 PFLOPS | 15 PFLOPS | 9 PFLOPS | N/A |
| FP8 Throughput | ~16,000 TFLOPS | 7,000 TFLOPS | 4,500 TFLOPS | ~2,000 TFLOPS |
| Interconnect | NVLink 6 (3.6 TB/s) | NVLink 5 (1.8 TB/s) | NVLink 5 (1.8 TB/s) | NVLink 4 (900 GB/s) |
| Transistors | 336 billion | 208 billion | 208 billion | 80 billion |
| Cloud Availability | H2 2026 (first cohort) | Available now | Available now | Available now |
R100 FP8 throughput (~16,000 TFLOPS) derived from NVIDIA's NVL72 system-level spec. All other R100 specs confirmed at CES 2026 and GTC 2026. Availability reflects first cohort; broader availability from additional providers follows.
Workloads built for R100
Trillion-Parameter FP4 Inference
At 50 PFLOPS FP4 and 288GB HBM4, a single R100 can hold and serve a 200B parameter model in FP4 (~100GB of weights at 0.5 bytes per parameter), leaving roughly 188GB of headroom for KV cache. Multi-GPU setups handle 400B+ models that currently require 4x B300 at FP8.
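The footprint math above can be sketched in a few lines. This is a weights-only estimate assuming 0.5 bytes per parameter for FP4; real deployments also spend memory on KV cache, activations, and runtime overhead:

```python
# Weights-only FP4 footprint on a single R100 (back-of-envelope sketch,
# not a deployment calculator). Assumes 4-bit weights = 0.5 bytes/param.

def fp4_weight_gb(params_billion: float) -> float:
    """Weights-only footprint in GB at FP4 precision."""
    return params_billion * 1e9 * 0.5 / 1e9  # simplifies to params_billion / 2

R100_HBM_GB = 288  # HBM4 capacity per R100

weights_gb = fp4_weight_gb(200)          # 100.0 GB for a 200B model
headroom_gb = R100_HBM_GB - weights_gb   # ~188 GB left for KV cache etc.
```

The same function shows why 400B+ models still need multiple GPUs: 200GB of FP4 weights leaves too little room on one card once KV cache for long contexts is added.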
Frontier Model Pre-Training
3.33x the FP4 compute of B300 and 2.75x the memory bandwidth. Training runs that require 8x B300 nodes may fit on fewer R100 nodes, reducing inter-node communication overhead and wall time.
Rack-Scale NVL72 Workloads
In the Vera Rubin NVL72 configuration, 72 R100 GPUs share 260 TB/s NVLink fabric and 20.7TB of aggregate HBM4. Models that require multi-node sharding on Blackwell may fit in a single NVL72 rack.
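The NVL72 aggregate figures quoted above follow directly from the per-GPU numbers. A quick sanity check, using the per-GPU HBM4 capacity and NVLink 6 bandwidth from the spec table:

```python
# Sanity-check of the Vera Rubin NVL72 aggregate figures:
# 72 GPUs x 288 GB HBM4 and 72 GPUs x 3.6 TB/s NVLink 6.

GPUS = 72
HBM_PER_GPU_GB = 288        # R100 HBM4 capacity
NVLINK_PER_GPU_TBS = 3.6    # NVLink 6 per-GPU bandwidth

aggregate_hbm_tb = GPUS * HBM_PER_GPU_GB / 1000   # 20.736 TB, quoted as 20.7 TB
aggregate_fabric_tbs = GPUS * NVLINK_PER_GPU_TBS  # 259.2 TB/s, quoted as 260 TB/s
```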
High-Bandwidth Inference Serving
22 TB/s memory bandwidth is 2.75x B300 on a single GPU. Decode-phase throughput scales nearly linearly with bandwidth for memory-bound LLM serving. At equivalent batch sizes, R100 serves 2.5–3x more tokens per second than B300.
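The near-linear scaling claim comes from the memory-bound roofline for decode: at small batch sizes, every generated token requires streaming the full weight set from HBM, so the throughput ceiling is roughly bandwidth divided by bytes read per token. A hedged sketch with illustrative numbers (not benchmarks), assuming a 200B FP4 model (~100GB of weights) and batch size 1:

```python
# Roofline-style decode ceiling for a memory-bound model at batch size 1:
# tokens/s ~= HBM bandwidth / bytes streamed per token (here, full weights).
# Illustrative upper bounds only; real serving adds KV-cache reads and overhead.

def decode_ceiling_tokens_per_s(bandwidth_tbs: float, weight_gb: float) -> float:
    return bandwidth_tbs * 1e12 / (weight_gb * 1e9)

MODEL_GB = 100  # 200B params at FP4

r100_ceiling = decode_ceiling_tokens_per_s(22, MODEL_GB)  # 220 tokens/s
b300_ceiling = decode_ceiling_tokens_per_s(8, MODEL_GB)   # 80 tokens/s
speedup = r100_ceiling / b300_ceiling                     # 2.75x, the bandwidth ratio
```

The ceiling ratio equals the bandwidth ratio (2.75x); the 2.5-3x range quoted above reflects that real serving lands somewhat below or, with larger batches, occasionally above this simple bound.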
When to pick the R100
Pick R100 if
You're training or serving 400B+ parameter models and B300's 8 TB/s memory bandwidth is the bottleneck. R100's 22 TB/s HBM4 is 2.75x faster, and 50 PFLOPS FP4 compute is 3.33x B300. If your workload is memory-bandwidth-bound, R100 is the first GPU where bandwidth stops being the ceiling.
Pick B300 instead if
Your timeline is 2025 or early 2026. B300 ships now with 288GB HBM3e and 15 PFLOPS FP4. For most 200B–400B workloads, B300 is the practical choice today. R100 is worth waiting for if you have flexible timelines and need the bandwidth or compute ceiling.
Pick B200 instead if
Your model fits in 192GB and you want the most widely available Blackwell option at lower cost. B200 handles most 70B–200B workloads, has better spot pricing, and is available on Spheron today.
Pick R100 for NVL72 if
You're running trillion-parameter workloads that need rack-scale memory. 72 R100 GPUs in NVL72 configuration share 20.7TB of unified HBM4. No other architecture today keeps a 10T+ parameter model inside a single NVLink domain without multi-node sharding.
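The 10T+ claim can be checked the same way as the single-GPU case: at FP4, a 10T-parameter model is ~5TB of weights, well inside the rack's 20.7TB. This sketch is weights-only; KV cache and activations consume part of the remaining capacity:

```python
# Does a 10T-parameter FP4 model fit in one NVL72 rack? Weights-only
# sketch (0.5 bytes/param); real workloads also need KV cache headroom.

NVL72_HBM_TB = 20.7  # aggregate HBM4 across 72 R100 GPUs

params_trillion = 10
weights_tb = params_trillion * 1e12 * 0.5 / 1e12  # 5.0 TB of FP4 weights
fits_in_rack = weights_tb < NVL72_HBM_TB          # True, with ~15 TB to spare
```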
Available now on Spheron
R100 ships H2 2026. For workloads that need to run now, Blackwell and Hopper GPUs are available on Spheron with per-minute billing and no commitments.
Related resources
NVIDIA Rubin R100 GPU: Specs, Architecture, and What to Expect
Full breakdown of Rubin architecture, HBM4 specs, NVLink 6, and how R100 compares to Blackwell on paper.
Vera Rubin NVL72: The Rack-Scale AI System Explained
Deep-dive into NVL72 topology, 20.7TB unified memory, 260 TB/s NVLink fabric, and which workloads justify a full rack.
NVIDIA Rubin vs Blackwell vs Hopper: GPU Generation Comparison
How Rubin, Blackwell, and Hopper stack up across VRAM, bandwidth, and compute, plus guidance on which generation to run today.