GPU Rental: On-Demand NVIDIA H100, A100, B200, RTX 5090 from $0.58/hr
Rent NVIDIA GPUs by the minute with live pricing, bare-metal access, SSH root, and a dedicated IP. Deploy H100, H200, B200, B300, GH200, A100, L40S, RTX PRO 6000, RTX 5090, or RTX 4090 in under 2 minutes. No contracts, no warm-up charges, no hidden fees.
GPU rental pricing
Live per-GPU hourly pricing across the full Spheron catalog. Every rate is per-minute billed with no minimum commit and no warm-up charges. Reserved multi-GPU clusters get deeper discounts; talk to sales for quotes.
| GPU | VRAM | Architecture | Best for | Starting at |
|---|---|---|---|---|
| RTX 4090 | 24 GB | Ada Lovelace | Dev, experimentation, small-model inference | $0.58/hr |
| RTX 5090 | 32 GB | Blackwell | Budget inference, prototyping, single-GPU dev | $0.68/hr |
| L40S | 48 GB | Ada Lovelace | Inference serving, video/vision, rendering | $0.69/hr |
| A100 | 80 GB | Ampere | Fine-tuning, mid-scale training, stable inference | $0.72/hr |
| RTX PRO 6000 | 96 GB | Blackwell | Production inference, rendering, visual workloads | $1.07/hr |
| H100 | 80 GB | Hopper | LLM training, HPC, large-scale inference | $1.33/hr |
| H200 | 141 GB | Hopper | Long-context LLM inference, 70B+ model serving | $1.56/hr |
| GH200 | 96 GB | Grace Hopper | CPU-GPU coherent workloads, graph AI, vector search | $1.88/hr |
| B200 | 192 GB | Blackwell | Large-model training, FP4/FP8 inference | $2.25/hr |
| B300 | 288 GB | Blackwell Ultra | Frontier training, trillion-parameter models | $3.50/hr |
Pricing is updated from live inventory. Spot availability varies by region and time of day. See each GPU page for per-minute rates, multi-GPU node pricing, and InfiniBand cluster options.
How GPU rental works on Spheron
Vetted data center capacity, exposed as VMs or bare-metal instances. You pick, you deploy, you pay for minutes used. No approval queue, no warm-up billing, no hypervisor tax.
Dedicated vs spot
Both are on-demand. Both are bare-metal. Both are billed per minute.

Dedicated
Runs until you stop it. The provider cannot reclaim the node, so you pay a fixed hourly rate and keep the instance as long as you need it. Best for:
- Production inference endpoints
- Interactive development
- Long-running training

Spot
Runs on spare capacity at a deep discount, and is interruptible when the provider reclaims that capacity. Each GPU page posts both rates so you can compare before you launch. Best for:
- Checkpointable training runs
- Batch inference
- Hyperparameter sweeps
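What makes a training run "checkpointable" for spot? A minimal sketch: persist progress to durable storage at regular intervals and resume from the last checkpoint on restart, so a reclaimed node costs you at most one interval of work. The file name, step count, and JSON format here are illustrative, not part of any Spheron API.

```python
import json
import os

CKPT = "checkpoint.json"  # illustrative path; use durable storage in practice

def load_checkpoint():
    """Resume from the last saved step, or start fresh."""
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            return json.load(f)
    return {"step": 0, "loss": None}

def save_checkpoint(state):
    """Write to a temp file, then rename atomically, so an
    interruption mid-write can't corrupt the checkpoint."""
    tmp = CKPT + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CKPT)

def train(total_steps=100, save_every=10):
    state = load_checkpoint()
    while state["step"] < total_steps:
        state["step"] += 1
        state["loss"] = 1.0 / state["step"]  # stand-in for a real training step
        if state["step"] % save_every == 0:
            save_checkpoint(state)
    save_checkpoint(state)
    return state
```

If the spot instance is reclaimed partway through, relaunching and calling `train()` again picks up from the last saved step instead of step zero.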
Per-minute billing
Billing starts when your instance reports ready and stops the moment you terminate. No minimum run time, no rounding up to the hour, no charge for boot time.
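The arithmetic behind per-minute billing: the quoted rate is per hour, but you are charged for minutes actually used. A quick illustrative calculation (not billing code) using the H100 rate from the table above:

```python
def cost(minutes_used: int, hourly_rate: float) -> float:
    """Per-minute billing: hourly rate prorated to the minute,
    rounded to cents. No rounding up to the hour."""
    return round(minutes_used * hourly_rate / 60, 2)

# 2 h 37 min on an H100 at $1.33/hr:
# cost(157, 1.33) charges for 157 minutes, vs 3 * $1.33 = $3.99
# if the same run were rounded up to three whole hours.
```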
Multi-GPU & interconnect
Single to 8x nodes with NVLink on H100, H200, B200, B300, and A100. Beyond 8 GPUs, InfiniBand clusters with RDMA and NCCL-tuned topology.
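The sizing rule above can be sketched as a small helper: up to 8 GPUs fit on one NVLink node, and anything beyond spans an InfiniBand cluster. The function name and return format are illustrative, assuming the catalog's 8-GPU-per-node layout.

```python
def interconnect(gpu_count: int, gpus_per_node: int = 8):
    """Map a requested GPU count to node count and interconnect tier:
    NVLink within a single node (up to 8 GPUs), InfiniBand with
    RDMA across nodes beyond that."""
    if gpu_count < 1:
        raise ValueError("need at least one GPU")
    nodes = -(-gpu_count // gpus_per_node)  # ceiling division
    tier = "NVLink (single node)" if nodes == 1 else "InfiniBand cluster (multi-node)"
    return nodes, tier
```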
Bare-metal access
- SSH root, dedicated public IP
- Ubuntu 22.04 + CUDA preinstalled
- Docker + NVIDIA Container Toolkit
- No hypervisor overhead, no noisy neighbors
Ready to Deploy?
Deploy enterprise-grade GPU instances in minutes with instant provisioning and bare-metal performance. No contracts, no commitments, no hidden fees. Pay only for what you use.