Hyperscalers charge 3-6x more than neo-cloud alternatives for the same GPU hardware. AWS H100 on-demand runs ~$6.88/hr. Azure charges ~$12.29/hr per GPU on their ND H100 v5 instances. On Spheron, the same H100 SXM5 is $2.01/hr on-demand and $0.99/hr on spot. That gap is not a temporary anomaly. It reflects structural differences in overhead, margin, and business model.
This post covers 7 GPU models across 15+ providers, with on-demand, spot, and reserved pricing for each. You can check Spheron's current GPU pricing for live rates. For throughput data behind these prices, see our GPU cloud benchmarks.
The GPU Models Covered
| GPU Model | VRAM | Primary Use Case | Tier |
|---|---|---|---|
| RTX 4090 | 24 GB GDDR6X | Hobbyist inference, fine-tuning | Consumer |
| A100 80GB | 80 GB HBM2e | Training, inference | Data center |
| L40S | 48 GB GDDR6 | Inference, rendering | Data center |
| H100 SXM5 | 80 GB HBM3 | Production training | Data center |
| H200 SXM | 141 GB HBM3e | Large model inference | Data center |
| B200 | 192 GB HBM3e | Frontier inference | Blackwell |
| RTX 5090 | 32 GB GDDR7 | Consumer inference | Consumer |
GPU Cloud Pricing by Model (March 2026)
All prices as of March 19, 2026, based on publicly available on-demand rates. Prices fluctuate based on GPU availability and provider policies. Check current Spheron GPU pricing for live rates.
H100 SXM5 Pricing
| Provider | On-Demand $/hr | Spot $/hr | Notes |
|---|---|---|---|
| Spheron | $2.01 | $0.99 | Lowest spot rate |
| Lambda Labs | $2.49–$3.44 | N/A | H100 SXM; on-demand only (8x–1x configs) |
| RunPod | $2.69 | Available | PCIe Community Cloud |
| Vast.ai | ~$1.53–$2.27 | Available | Marketplace rates |
| CoreWeave | ~$6.16 | N/A | H100 HGX SXM; normalized per GPU |
| Nebius | $2.95 | N/A | On-demand |
| FluidStack | $2.10 | N/A | |
| Paperspace | $5.95 | N/A | |
| AWS (p5) | ~$6.88 | ~$3.83 | Spot ~44% off OD |
| GCP (A3) | ~$10.98 | ~$3.69 | Estimated; varies by region |
| Azure (ND H100 v5) | ~$12.29 | N/A | Per GPU on ND96isr H100 v5 ($98.32/hr, 8 GPUs) |
H200 SXM Pricing
| Provider | On-Demand $/hr | Spot $/hr | Notes |
|---|---|---|---|
| Spheron | $4.54 | $1.78 | |
| GMI Cloud | $2.60 | N/A | On-demand; from $2.60/hr |
| Nebius | $3.50 | N/A | On-demand |
| RunPod | $3.59 | N/A | Secure Cloud |
| Jarvislabs | $3.80 | N/A | On-demand |
| AWS (p5e) | ~$4.98 | N/A | Estimated; spot not widely available |
| GCP | TBA | Spot only | Limited on-demand availability |
| Azure | ~$13.78 | N/A | Estimated |
B200 Pricing
| Provider | On-Demand $/hr | Spot $/hr | Notes |
|---|---|---|---|
| Spheron | $6.03 | $2.18 | Lowest spot rate |
| RunPod | $4.99 | N/A | Secure Cloud |
| Nebius | $5.50 | N/A | On-demand |
| Lambda Labs | $4.99–$5.29 | N/A | On-demand; varies by configuration (8x–1x configs) |
| AWS (p6-b200) | ~$14.24 | ~$3.24 | Estimated; $113.93/hr for 8-GPU node |
| Azure | TBA | N/A | Not yet in standard catalog |
A100 80GB Pricing
| Provider | On-Demand $/hr | Spot $/hr | Notes |
|---|---|---|---|
| Spheron | $1.07 | $0.61 | |
| Thunder Compute | $0.78 | N/A | |
| Market range | $0.78–$2.06 | Varies | Neo-cloud range |
| AWS (p4de) | ~$3.43 | ~$3.07 | 80GB; Estimated |
| GCP (A2) | ~$5.78 | ~$2.51 | 80GB (a2-ultragpu); us-central1; Estimated |
| Azure (NC A100 v4) | ~$3.67 | ~$0.74 | NC24ads A100 v4; spot available |
L40S Pricing
| Provider | On-Demand $/hr | Spot $/hr | Notes |
|---|---|---|---|
| Spheron | $0.91 | $0.41 | |
| RunPod | $0.79 | Available | |
| AWS reserved | ~$1.17 | N/A | 1-year reserved (g6e.xlarge) |
| Marketplace | ~$0.40 | Available | Varies |
RTX 4090 Pricing
| Provider | On-Demand $/hr | Spot $/hr | Notes |
|---|---|---|---|
| Spheron | $0.58 | N/A | |
| RunPod | $0.34 | Available | Community |
| Vast.ai | $0.35–$0.55 | Available | Marketplace, varies |
| Local marketplace | ~$0.20 | N/A | Variable reliability |
RTX 5090 Pricing
| Provider | On-Demand $/hr | Notes |
|---|---|---|
| Spheron | $0.68 | Limited inventory |
| RunPod | $0.69 | Community Cloud; limited inventory |
| Vast.ai | $0.51–$0.89 | Marketplace rates; limited availability |
RTX 5090 cloud availability is limited to a small number of providers as of March 2026. Inventory is constrained and prices can shift quickly.
On-Demand vs Spot vs Reserved: Which Pricing Tier to Choose
On-Demand Pricing
On-demand gives you full flexibility with no commitment. You pay the listed hourly rate, start when you want, and stop when you're done. It is the most expensive tier but the right choice for:
- Short experiments and one-off jobs where total cost is low anyway
- Workloads with unpredictable runtimes or sharp deadlines
- Debugging and development where interruption is intolerable
- Production inference APIs where availability guarantees matter
Most neo-cloud providers (Spheron, Lambda, RunPod) do not require contracts for on-demand instances, and several bill per-minute or per-second.
Spot / Preemptible Pricing
Spot instances use idle capacity that providers offer at steep discounts. They can be reclaimed with short notice, typically 30 seconds to 2 minutes. Savings over on-demand range from 40-65%.
| GPU | On-Demand | Spot | Savings % |
|---|---|---|---|
| H100 SXM5 (Spheron) | $2.01/hr | $0.99/hr | ~51% |
| B200 (Spheron) | $6.03/hr | $2.18/hr | ~64% |
| A100 80GB (Spheron) | $1.07/hr | $0.61/hr | ~43% |
| H100 SXM5 (AWS) | ~$6.88/hr | ~$3.83/hr | ~44% |
Spot pricing is the right call for: batch training jobs with checkpoint/resume, offline inference pipelines, hyperparameter sweeps, and data preprocessing. It is the wrong call for: production serving, real-time inference APIs, or any job that cannot tolerate interruption.
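The savings column above is just the ratio of the two rates. A minimal sketch, using the Spheron and AWS rates from the table:

```python
def spot_savings(on_demand: float, spot: float) -> float:
    """Percent saved by running on spot instead of on-demand."""
    return (1 - spot / on_demand) * 100

rates = {  # ($/hr on-demand, $/hr spot) pairs from the table above
    "H100 SXM5 (Spheron)": (2.01, 0.99),
    "B200 (Spheron)":      (6.03, 2.18),
    "A100 80GB (Spheron)": (1.07, 0.61),
    "H100 SXM5 (AWS)":     (6.88, 3.83),
}

for gpu, (od, spot) in rates.items():
    print(f"{gpu}: {spot_savings(od, spot):.0f}% off on-demand")
```

Run against the table's rates, this reproduces the ~51%, ~64%, ~43%, and ~44% figures above.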
Reserved / Committed Pricing
Reserved pricing requires a commitment, typically 1 to 12 months, in exchange for 20-40% discounts vs on-demand. AWS EC2 reserved instances, Azure reserved VMs, and GCP committed-use contracts all follow this model.
Neo-cloud providers like Lambda Labs and CoreWeave offer reserved clusters at negotiated rates. Spheron offers volume pricing via direct contact for teams with predictable long-term compute needs.
Reserved pricing is right for: production inference running 24/7, large-scale training programs with predictable GPU-hour requirements, and teams that have validated their workload and want to lock in cost predictability.
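Whether a commitment pays off depends on how much of the committed time you actually use. A simplified break-even sketch (it assumes a flat discount and ignores spot as a fallback, both simplifications of ours):

```python
def breakeven_utilization(discount_pct: float) -> float:
    """Utilization above which a reserved commitment beats paying
    on-demand only for the hours actually used.

    With a d% discount you pay (1 - d) x the on-demand rate for every
    committed hour, used or idle, so reserved wins once utilization
    exceeds (1 - d).
    """
    return (1 - discount_pct / 100) * 100

# The 20-40% discounts cited above imply break-even at 60-80% utilization
for d in (20, 30, 40):
    print(f"{d}% discount -> reserved wins above "
          f"{breakeven_utilization(d):.0f}% utilization")
```

This is why reserved pricing suits 24/7 production inference: utilization near 100% clears any break-even threshold.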
Hidden Costs: What the Hourly Rate Doesn't Include
Egress and Bandwidth Fees
Hyperscalers charge $0.08-$0.12/GB for outbound data transfers. Most neo-clouds (Spheron, RunPod, Lambda) include bandwidth in the instance rate or charge flat rates well below hyperscaler egress fees.
In practice: transferring a 100 GB model checkpoint out of AWS costs $8-12 in egress fees on top of whatever you paid for the GPU hour. At scale, if you are syncing checkpoints to external storage or serving model weights across regions, egress can easily match or exceed your GPU compute bill.
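The checkpoint example above is straightforward to model. A sketch, assuming the $0.08-$0.12/GB hyperscaler range quoted above and a hypothetical sync cadence:

```python
def egress_cost(gb: float, rate_per_gb: float) -> float:
    """Outbound transfer cost in dollars at a flat per-GB rate."""
    return gb * rate_per_gb

# One 100 GB checkpoint out of a hyperscaler
low, high = egress_cost(100, 0.08), egress_cost(100, 0.12)
print(f"${low:.0f}-${high:.0f} per checkpoint")

# Hypothetical cadence: syncing 4 checkpoints/day over a 30-day month,
# priced at an assumed mid-range $0.09/GB
monthly = egress_cost(100 * 4 * 30, 0.09)
print(f"~${monthly:,.0f}/month in egress alone")
```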
Storage Costs
Persistent volume storage typically runs $0.08-$0.15/GB/month. Ephemeral storage is included with GPU instances but does not persist across restarts. If your workflow needs data to survive between sessions, factor persistent storage into the comparison.
For large models, even modest storage needs add up. A 70B parameter model in FP16 requires around 140 GB of storage. At $0.10/GB/month, that is $14/month in storage alone before any compute.
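The arithmetic behind the 70B example generalizes to any model size. A sketch (2 bytes per FP16 parameter; the $0.10/GB/month rate is the midpoint quoted above):

```python
def fp16_weights_gb(params_billion: float) -> float:
    """Approximate weight size: 2 bytes per FP16 parameter,
    i.e. 2 GB per billion parameters."""
    return params_billion * 2

def monthly_storage_cost(gb: float, rate_per_gb_month: float = 0.10) -> float:
    """Persistent volume cost per month at a flat per-GB rate."""
    return gb * rate_per_gb_month

size = fp16_weights_gb(70)                         # 140 GB for a 70B model
print(f"${monthly_storage_cost(size):.0f}/month")  # $14 at $0.10/GB/month
```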
Networking and IP Fees
Static IP addresses, load balancers, and VPC peering add cost on hyperscalers. Most neo-clouds include a public IP in the instance rate. If your application requires custom networking topology, AWS and GCP give you more tools but charge for the privilege.
Minimum Commitments
Some providers require a 1-hour minimum billing period (Paperspace). Others bill per-minute (Spheron) or per-second (RunPod). For short experimental runs, the minimum dominates the bill: a 10-minute run billed at a 1-hour minimum costs 6x the metered price. Check this before choosing a provider for iterative development work.
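The effect of a billing minimum on short runs can be modeled directly. A sketch, assuming billing rounds up to the provider's minimum increment (real billing policies vary) and using the Spheron H100 rate from the tables above:

```python
import math

def billed_cost(runtime_min: float, rate_per_hr: float,
                increment_min: float) -> float:
    """Cost of one run when billing rounds up to a minimum increment."""
    increments = math.ceil(max(runtime_min, increment_min) / increment_min)
    return increments * increment_min / 60 * rate_per_hr

h100 = 2.01  # Spheron H100 on-demand, $/hr

per_minute = billed_cost(10, h100, increment_min=1)   # ~$0.34 metered
hourly_min = billed_cost(10, h100, increment_min=60)  # $2.01, 6x the metered price
print(per_minute, hourly_min)
```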
Price-Performance: Cost Per Token and Cost Per TFLOP
The cheapest hourly rate rarely delivers the best cost-per-token. A B200 at $6.03/hr on Spheron costs less per output token than an H100 at $2.01/hr, because the B200 delivers roughly 3-4x the inference throughput.
LLM Inference: Cost Per Million Tokens (Llama 3 70B)
The RTX 4090 is excluded from this table. Llama 3 70B in FP16 requires approximately 140 GB of VRAM, far beyond the 24 GB on a single RTX 4090. If you need 70B inference on consumer GPUs, use INT4 quantization (e.g., GGUF Q4_K_M), which reduces the memory requirement to roughly 40 GB; that still exceeds a single RTX 4090, so split the model across two or more cards or run it on a single data center GPU.
| GPU | Provider | $/hr | Est. tokens/sec | $/M tokens |
|---|---|---|---|---|
| A100 80GB | Spheron | $1.07 | ~520 | $0.57 |
| L40S | Spheron | $0.91 | ~450 | $0.56 |
| H100 SXM5 | Spheron | $2.01 | ~1,200 | $0.47 |
| H100 SXM5 | AWS | ~$6.88 | ~1,200 | $1.59 |
| H200 SXM | Spheron | $4.54 | ~1,800 | $0.70 |
| B200 | Spheron | $6.03 | ~4,000 | $0.42 |
| B200 | AWS | ~$14.24 | ~4,000 | $0.99 |
Throughput figures are per-GPU estimates for comparison purposes. GPUs with less than 141 GB of VRAM require multi-GPU tensor parallelism or quantization for Llama 3 70B inference. See our GPU cloud benchmarks for full multi-GPU data.
The B200 on Spheron ($0.42/M tokens) leads on cost-per-token efficiency across the lineup, combining high throughput with competitive pricing. The H100 ($0.47/M tokens) ranks second, followed by the L40S ($0.56/M tokens) and A100 ($0.57/M tokens). The H200 ($0.70/M tokens) is better suited for very large model batches than for single-instance 70B inference. For detailed total cost of ownership analysis, see the GPU cost optimization playbook.
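Every $/M-token figure above comes from one formula: the hourly rate divided by tokens generated per hour. A minimal sketch using the table's Spheron rates and throughput estimates:

```python
def cost_per_million_tokens(rate_per_hr: float, tokens_per_sec: float) -> float:
    """Dollars per million output tokens at sustained throughput."""
    return rate_per_hr / (tokens_per_sec * 3600) * 1_000_000

# ($/hr, est. tokens/sec) from the table above
for gpu, rate, tps in [
    ("H100 SXM5", 2.01, 1200),
    ("H200 SXM",  4.54, 1800),
    ("B200",      6.03, 4000),
]:
    print(f"{gpu}: ${cost_per_million_tokens(rate, tps):.2f}/M tokens")
```

Plugging in the AWS rates instead ($6.88 for the H100, $14.24 for the B200) reproduces the $1.59 and $0.99 rows.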
Spheron vs Every Major Competitor
Spheron vs AWS/GCP/Azure: The cost gap is 40-85% across major GPU models when comparing on-demand rates. Beyond the hourly rate, Spheron does not charge egress fees, does not require minimum commitments, and bills per-minute. AWS, GCP, and Azure add egress, storage, networking, and reserved capacity overhead that compound the gap substantially on real workloads.
Spheron vs RunPod: Spheron undercuts RunPod on H100 pricing ($2.01 vs $2.69 on-demand). RunPod offers a lower H200 on-demand rate ($3.59 vs $4.54), but Spheron's B200 spot pricing ($2.18/hr) has no RunPod equivalent, giving Spheron a significant edge for spot-eligible B200 workloads. Both platforms offer per-minute billing, spot instances, and multi-GPU configurations. RunPod has a larger community marketplace; Spheron aggregates from enterprise-grade data center partners with SLA guarantees.
Spheron vs Lambda Labs: Lambda is on-demand only for most GPU models. If your workload benefits from spot pricing, Spheron delivers 40-60% cost reductions that Lambda cannot match. Lambda's GPU inventory is strong for H100 and A100; Spheron adds B200 spot availability.
Spheron vs Vast.ai: Vast.ai's marketplace model can produce lower prices on commodity GPUs (A100, RTX 4090) because individual providers compete, but reliability and SLA coverage are variable. Spheron offers guaranteed SLA-backed capacity with consistent performance. For cost-first commodity workloads where reliability tolerance is high, Vast.ai is worth evaluating.
Spheron vs CoreWeave: CoreWeave is enterprise-focused with contract pricing and strong multi-node cluster support. For startups and teams that need on-demand access without a sales cycle, Spheron is more accessible. CoreWeave makes sense for large organizations with predictable multi-month compute requirements and existing enterprise procurement workflows.
For head-to-head comparisons, see Spheron vs RunPod, Spheron vs Vast.ai, Spheron vs CoreWeave, RunPod alternatives, and Lambda Labs alternatives.
How to Choose the Right GPU and Provider
| Workload | Recommended GPU | Recommended Provider Tier | Why |
|---|---|---|---|
| Hobbyist inference (7B-13B) | RTX 4090 | Vast.ai / Spheron spot | Lowest cost, sufficient VRAM |
| Fine-tuning 7B-70B | A100 80GB | Spheron / Lambda on-demand | Mature stack, good price |
| Production inference (70B) | H100 / H200 | Spheron spot or on-demand | Balance of cost and throughput |
| Large model training | H200 / B200 | Spheron / CoreWeave | VRAM headroom |
| Frontier inference (100B+) | B200 / B300 | Spheron | Best cost-per-token at scale |
For hyperscaler integration requirements (IAM, VPC, compliance certifications), AWS/GCP/Azure may be justified despite significantly higher GPU costs. If your workload is tightly integrated with S3, BigQuery, or Azure Active Directory, the switching cost of migrating to a neo-cloud can outweigh the per-GPU savings in the short term.
Final Verdict
For most AI teams, neo-cloud providers deliver 40-85% lower GPU compute costs than hyperscalers with comparable or better GPU availability in 2026. The pricing gap has widened, not narrowed, as hyperscaler overhead and margin have increased faster than neo-cloud cost reductions.
The cheapest hourly rate is not always the best value. Calculate cost per token or cost per training step before committing to a platform. The H100 on Spheron at $2.01/hr delivers better cost-per-token than either the A100 at $1.07/hr or the L40S at $0.91/hr for most inference workloads. The B200 at $6.03/hr beats all of them at scale, delivering the lowest cost per output token among Spheron's GPU lineup.
Spot pricing is worth using for batch workloads. The 40-65% savings over on-demand are real and reproducible for any workload that implements checkpoint/resume. On-demand is right for production serving and latency-sensitive workloads where interruption is unacceptable.
All pricing in this post is based on publicly available on-demand rates as of March 19, 2026. GPU cloud prices fluctuate over time based on availability, provider changes, and market conditions. Check Spheron's GPU pricing page for the most current rates.
Compare current rates on Spheron's GPU pricing page and rent a GPU now to start running your workloads at lower cost.
Spheron gives you on-demand access to H100, H200, B200, A100, L40S, and RTX 4090 GPUs with per-minute billing, no egress fees, and spot pricing that cuts costs by up to 64% compared to on-demand. No contracts, no minimums, no hidden fees.
