Spheron GPU Catalog
Coming H2 2026
Pipeline: Pre-Order Open

NVIDIA Rubin R100 (H300) GPU

288 GB HBM4 · 22 TB/s · 50 PFLOPS FP4

The Rubin R100 is the generational successor to B300, built for trillion-parameter inference at FP4 precision and multi-node training runs where Blackwell memory bandwidth becomes the bottleneck.

No pricing published yet. Cloud providers start shipping R100 (also called H300) in H2 2026. Register your interest to be notified first when Spheron capacity opens.

VRAM: 288 GB HBM4
Bandwidth: 22 TB/s
FP4 Compute: 50 PFLOPS

Register Interest

Join the pipeline for R100 access. We'll reach out with pricing and availability as soon as capacity goes live.

At a glance

The NVIDIA Rubin R100 (also branded H300 by cloud providers) is the generational successor to the B300 Blackwell Ultra. It ships with 288 GB HBM4 at up to 22 TB/s bandwidth (2.75x B300), 50 PFLOPS FP4 compute (3.33x B300), NVLink 6 at 3.6 TB/s per GPU, and ConnectX-9 networking. First cloud availability is H2 2026 across AWS, Google Cloud, Azure, and specialist providers. Spheron is onboarding R100 capacity and will contact registered teams with pricing as soon as it is confirmed. For workloads running today, B300 and B200 are available now.

NVIDIA GPU generation roadmap

Where R100 sits in the NVIDIA GPU generation stack. All generations prior to Rubin are available on Spheron today.

R100 GPU specifications

Architecture: NVIDIA Rubin
VRAM: 288 GB HBM4
Memory Bandwidth: Up to 22 TB/s
FP4 Throughput: 50 PFLOPS
FP8 Throughput: ~16,000 TFLOPS (est.)
Transistors: 336 billion
Interconnect: NVLink 6 @ 3.6 TB/s
Networking: ConnectX-9 (1.6T)
TDP: ~2,300 W
Memory Type: HBM4

R100 specs sourced from NVIDIA GTC 2025 roadmap and confirmed at CES 2026 and GTC 2026. Memory bandwidth (up to 22 TB/s), NVLink 6, and ConnectX-9 are officially confirmed. FP8 throughput (~16,000 TFLOPS) is derived from NVIDIA's NVL72 system-level spec. Final per-GPU specs will be available when cloud providers ship production systems in H2 2026.

R100 vs B300 vs B200 vs H100

Spec | R100 | B300 | B200 | H100
Architecture | Rubin | Blackwell Ultra | Blackwell | Hopper
VRAM | 288 GB HBM4 | 288 GB HBM3e | 192 GB HBM3e | 80 GB HBM3
Memory Bandwidth | Up to 22 TB/s | 8 TB/s | 8 TB/s | 3.35 TB/s
FP4 Compute | 50 PFLOPS | 15 PFLOPS | 9 PFLOPS | N/A
FP8 Throughput | ~16,000 TFLOPS | 7,000 TFLOPS | 4,500 TFLOPS | ~2,000 TFLOPS
Interconnect | NVLink 6 (3.6 TB/s) | NVLink 5 (1.8 TB/s) | NVLink 5 (1.8 TB/s) | NVLink 4 (900 GB/s)
Transistors | 336 billion | 208 billion | 208 billion | 80 billion
Cloud Availability | H2 2026 (first cohort) | Available now | Available now | Available now

R100 FP8 throughput (~16,000 TFLOPS) derived from NVIDIA's NVL72 system-level spec. All other R100 specs confirmed at CES 2026 and GTC 2026. Availability reflects first cohort; broader availability from additional providers follows.

Workloads built for R100

Use case / 01

Trillion-Parameter FP4 Inference

At 50 PFLOPS FP4 and 288 GB HBM4, a single R100 can hold and serve a 200B-parameter model in FP4 with roughly 188 GB of headroom for KV cache. Multi-GPU setups handle 400B+ models that currently require 4x B300 at FP8.

- Frontier MoE models (400B+ at FP4) on fewer GPUs
- Long-context RAG at 1M+ tokens with HBM4 bandwidth
- Disaggregated prefill/decode with NVLink 6
- Real-time agentic inference for 70B–400B models
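A back-of-the-envelope sizing sketch, assuming FP4 weights occupy 0.5 bytes per parameter and ignoring activation and runtime overhead, shows where that headroom comes from:

```python
# Rough single-GPU memory sizing for FP4 inference on a 288 GB R100.
# Assumption: 4-bit weights = 0.5 bytes per parameter; KV cache, activations,
# and framework overhead all have to fit in whatever is left over.

def fp4_headroom_gb(params_billion: float, hbm_gb: float = 288.0) -> float:
    """Return the HBM left over after loading FP4 weights."""
    weight_gb = params_billion * 0.5  # 0.5 GB per billion parameters at FP4
    return hbm_gb - weight_gb

print(fp4_headroom_gb(200))  # -> 188.0 GB free for KV cache on one R100
print(fp4_headroom_gb(400))  # -> 88.0 GB; long contexts or big batches push this multi-GPU
```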
Use case / 02

Frontier Model Pre-Training

R100 offers 3.33x the FP4 compute of B300 and 2.75x the memory bandwidth. Training runs that require 8 B300 nodes may fit on fewer R100 nodes, reducing inter-node communication overhead and wall-clock time.

- Trillion-parameter dense and MoE pre-training
- Multi-modal foundation models (text, image, video, audio)
- Reinforcement learning from human feedback at scale
- Long-sequence transformer training (1M+ token context)
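As a rough illustration of the node-count math, the sketch below assumes 8-GPU nodes, perfect scaling, and a hypothetical sustained FP4 compute target; real runs are also gated by interconnect, parallelism strategy, and data pipeline efficiency:

```python
import math

# Spec-table FP4 throughput per GPU (PFLOPS); the node size is an assumption.
B300_FP4_PFLOPS = 15
R100_FP4_PFLOPS = 50
GPUS_PER_NODE = 8

def nodes_for(target_pflops: float, per_gpu_pflops: float) -> int:
    """Nodes needed to hit a sustained FP4 target, assuming perfect scaling."""
    return math.ceil(target_pflops / (per_gpu_pflops * GPUS_PER_NODE))

budget = 960  # hypothetical sustained PFLOPS target for a training run
print(nodes_for(budget, B300_FP4_PFLOPS))  # 8 B300 nodes
print(nodes_for(budget, R100_FP4_PFLOPS))  # 3 R100 nodes
```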
Use case / 03

Rack-Scale NVL72 Workloads

In the Vera Rubin NVL72 configuration, 72 R100 GPUs share a 260 TB/s NVLink fabric and 20.7 TB of aggregate HBM4. Models that require multi-node sharding on Blackwell may fit in a single NVL72 rack.

- 10T+ parameter models in a single NVLink domain
- Mixture-of-Experts with billions of total parameters
- Scientific simulation: molecular dynamics, climate, physics
- Autonomous AI agent orchestration at scale
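The rack-level numbers follow directly from the per-GPU specs. A minimal sketch, assuming FP4 weights at 0.5 bytes per parameter and ignoring the KV-cache and framework overhead a real deployment must budget for:

```python
# Vera Rubin NVL72 rack capacity derived from per-GPU figures.
GPUS_PER_RACK = 72
HBM_PER_GPU_GB = 288

aggregate_hbm_tb = GPUS_PER_RACK * HBM_PER_GPU_GB / 1000  # ~20.7 TB of HBM4
weights_10t_fp4_tb = 10_000e9 * 0.5 / 1e12                # 10T params at FP4 = 5 TB

print(f"aggregate HBM4: {aggregate_hbm_tb:.1f} TB")
print(f"10T-param FP4 weights: {weights_10t_fp4_tb:.0f} TB "
      f"({weights_10t_fp4_tb / aggregate_hbm_tb:.0%} of rack memory)")
```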
Use case / 04

High-Bandwidth Inference Serving

22 TB/s memory bandwidth is 2.75x B300 on a single GPU. Decode-phase throughput scales nearly linearly with bandwidth for memory-bound LLM serving, so at equivalent batch sizes R100 can serve roughly 2.5–3x as many tokens per second as B300.

- High-concurrency API serving for frontier-scale LLMs
- Real-time video and image generation at 4K/8K
- Code generation agents with 200K+ context windows
- Batch inference pipelines for large-scale data processing
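A first-order estimate of that scaling, assuming a memory-bound decode in which each generated token streams all FP4 weights from HBM once (KV-cache traffic, kernel overhead, and batching effects ignored), for a hypothetical 400B-parameter model:

```python
# Bandwidth-bound decode ceiling: tokens/s ~= HBM bandwidth / bytes read per token.

def decode_tokens_per_s(bandwidth_tb_s: float, params_billion: float) -> float:
    """Upper bound on decode rate when every token reads all FP4 weights once."""
    bytes_per_token = params_billion * 1e9 * 0.5  # FP4 weights only
    return bandwidth_tb_s * 1e12 / bytes_per_token

model_b = 400  # hypothetical 400B-parameter FP4 model
print(decode_tokens_per_s(22, model_b))  # R100: ~110 tokens/s per sequence
print(decode_tokens_per_s(8, model_b))   # B300: ~40 tokens/s -- the same 2.75x gap
```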

When to pick the R100

Scenario 01

Pick R100 if

You're training or serving 400B+ parameter models and B300's 8 TB/s memory bandwidth is the bottleneck. R100's 22 TB/s HBM4 delivers 2.75x the bandwidth, and its 50 PFLOPS of FP4 compute is 3.33x B300. If your workload is memory-bandwidth-bound, R100 is the first GPU where bandwidth stops being the ceiling.
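One way to sanity-check whether a workload sits in that memory-bandwidth-bound regime is a quick roofline calculation: peak FLOPS divided by peak bandwidth gives the arithmetic intensity (FLOPs per byte of HBM traffic) below which bandwidth, not compute, sets throughput. The sketch below uses the spec-table figures and is a heuristic, not a performance model:

```python
# Roofline ridge point: kernels with lower FLOPs/byte than this are bandwidth-bound.

def ridge_point(peak_pflops: float, bandwidth_tb_s: float) -> float:
    return (peak_pflops * 1e15) / (bandwidth_tb_s * 1e12)

print(ridge_point(15, 8))   # B300 FP4: ~1875 FLOPs per byte
print(ridge_point(50, 22))  # R100 FP4: ~2273 FLOPs per byte
```

Decode-phase GEMV work typically lands far below either ridge point, which is why the bandwidth jump tends to matter more for serving than the raw FLOPS jump.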

Recommended fit
Scenario 02

Pick B300 instead if

Your timeline is 2025 or early 2026. B300 ships now with 288 GB HBM3e and 15 PFLOPS FP4. For most 200B–400B workloads, B300 is the practical choice today. R100 is worth waiting for if you have flexible timelines and need the bandwidth or compute ceiling.

Recommended fit
Scenario 03

Pick B200 instead if

Your model fits in 192 GB and you want the most widely available Blackwell option at lower cost. B200 handles most 70B–200B workloads, has better spot pricing, and is available on Spheron today.

Recommended fit
Scenario 04

Pick R100 for NVL72 if

You're running trillion-parameter workloads that need rack-scale memory. 72 R100 GPUs in an NVL72 configuration share 20.7 TB of unified HBM4. No other architecture today keeps a 10T+ parameter model inside a single NVLink domain without multi-node sharding.

Recommended fit

Available now on Spheron

R100 ships H2 2026. For workloads that need to run now, Blackwell and Hopper GPUs are available on Spheron with per-minute billing and no commitments.

Related resources

FAQ / 05

R100 Pre-Order FAQ

Also consider