Question 1

What is the cheapest GPU to rent on Spheron?

Accepted Answer

The RTX 4090 is the cheapest GPU on Spheron, starting at $0.53/hr. It carries 24 GB of GDDR6X VRAM and is a good fit for development, prototyping, and small-model inference (roughly 7B parameters in FP16; a 13B model needs about 26 GB in FP16, so it only fits with 8-bit or lower quantization). For serious inference on 70B models you need far more VRAM, which means A100 80GB cards at roughly $1.48/hr each or above. Billing is per minute with no minimum commitment.

Question 2

How do I pick the right GPU for my workload?

Accepted Answer

Start from VRAM. For LLM inference, budget roughly 2 GB of VRAM per billion parameters for FP16 weights (a 70B model needs ~140 GB for weights alone, so H200 or multi-GPU H100, with extra headroom for the KV cache). For training, memory pressure at least doubles because gradients and optimizer state sit alongside the weights; with Adam in mixed precision the total is closer to 16 bytes per parameter before activations. Then pick on compute: H100 and B200 lead on training throughput, H200 leads on long-context inference (141 GB HBM3e), A100 is the cost-effective middle ground, and RTX 5090 and 4090 handle dev and small-model serving. For HPC and FP64 workloads, A100 and H100 are the right targets.
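The VRAM rule of thumb above can be turned into a quick sizing check. The following is a minimal sketch, not an official Spheron tool: the function names and the `headroom` multiplier are assumptions made for illustration, the VRAM figures are copied from the comparison table on this page, and the estimate covers weights only (no KV cache or activations unless you raise `headroom`).

```python
import math

# Bytes per parameter for common weight formats (standard sizes).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "fp8": 1.0, "int8": 1.0, "fp4": 0.5}

# Per-GPU VRAM in GB, copied from the comparison table on this page.
GPU_VRAM_GB = {
    "RTX 4090": 24, "RTX 5090": 32, "L40S": 48, "A100": 80, "H100": 80,
    "RTX PRO 6000": 96, "GH200": 96, "H200": 141, "B200": 192,
}

def weight_memory_gb(params_billion, precision="fp16"):
    """GB needed for the weights alone (1 GB taken as 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]

def gpus_needed(params_billion, gpu, precision="fp16", headroom=1.0):
    """Smallest GPU count whose combined VRAM holds the weights.

    headroom is a hypothetical multiplier (e.g. 1.2) to leave space
    for KV cache and activations; 1.0 means weights only.
    """
    required = weight_memory_gb(params_billion, precision) * headroom
    return math.ceil(required / GPU_VRAM_GB[gpu])

if __name__ == "__main__":
    for gpu in ("RTX 4090", "A100", "H100", "H200"):
        print(gpu, gpus_needed(70, gpu))
```

With `headroom=1.0` a 70B FP16 model fits on a single H200 only on paper (140 GB of 141 GB), so a value above 1.0 is the more realistic setting for serving.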

Question 3

What's the difference between spot and dedicated GPU rental on Spheron?

Accepted Answer

On Spheron, on-demand rental comes in two tiers: dedicated and spot. Dedicated instances come with a 99.99% SLA and cannot be reclaimed by the provider; you pay a fixed hourly rate and the instance runs until you stop it. Spot instances run on spare capacity at a lower rate (typically 30-60% cheaper) but are interruptible when the capacity is reclaimed. Use spot for checkpoint-friendly training, batch inference, and fault-tolerant jobs. Use dedicated for production serving and any workload where interruption is expensive. Both tiers are bare-metal with per-minute billing; each GPU page posts both rates side by side.
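Whether spot actually saves money depends on how much work is lost to interruptions. The sketch below is a rough cost model under stated assumptions: `discount` is the spot discount as a fraction (the 30-60% range above), and `rework_fraction` is a hypothetical parameter for the extra compute spent redoing work since the last checkpoint. Neither function reflects a Spheron API.

```python
def dedicated_cost(hourly_rate, job_hours):
    """Cost of running the whole job on a dedicated instance."""
    return hourly_rate * job_hours

def spot_cost(dedicated_hourly_rate, job_hours, discount, rework_fraction):
    """Cost on spot, inflating runtime by the assumed rework overhead."""
    spot_rate = dedicated_hourly_rate * (1 - discount)
    return spot_rate * job_hours * (1 + rework_fraction)

def break_even_rework(discount):
    """Rework fraction at which spot and dedicated cost the same.

    Follows from (1 - d) * (1 + r) = 1, so r = d / (1 - d).
    """
    return discount / (1 - discount)

if __name__ == "__main__":
    # Example: H100 at the $2.01/hr starting rate, 100-hour job,
    # assumed 40% spot discount and 10% rework overhead.
    print(dedicated_cost(2.01, 100))
    print(spot_cost(2.01, 100, discount=0.40, rework_fraction=0.10))
    print(break_even_rework(0.30))
```

Under this model a 30% discount tolerates up to about 43% rework before dedicated becomes cheaper, which is why frequent checkpointing matters most at the low end of the discount range.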

Question 4

Is there a minimum commitment to rent a GPU?

Accepted Answer

No. Spheron bills per minute with no minimum term and no long-term contracts. You can stop an instance at any time and billing stops immediately. For large reserved clusters (8+ GPUs for weeks or months), Spheron offers discounted reserved rates, but none of the standard rental tiers require a commit.
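To see what per-minute billing means for short sessions, here is a small arithmetic sketch. It assumes usage is metered in whole minutes and prorated from the hourly rate; the hourly-rounded function is a hypothetical baseline for comparison only, not a description of any specific provider.

```python
import math

def per_minute_cost(hourly_rate, minutes_used):
    """Prorated cost when billing is per minute."""
    return hourly_rate * minutes_used / 60

def hourly_rounded_cost(hourly_rate, minutes_used):
    """Hypothetical baseline: usage rounded up to whole hours."""
    return hourly_rate * math.ceil(minutes_used / 60)

if __name__ == "__main__":
    # Example: a 45-minute session on an RTX 4090 at $0.53/hr.
    print(round(per_minute_cost(0.53, 45), 4))
    print(round(hourly_rounded_cost(0.53, 45), 4))
```

The gap is largest for jobs that run just past an hour boundary, such as a 61-minute run that would otherwise be charged as two hours.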

Question 5

Which GPUs are available for rent on Spheron?

Accepted Answer

Spheron offers NVIDIA B300, H100, B200, H200, GH200, A100 (80GB), L40S, RTX PRO 6000 Blackwell, RTX 5090, and RTX 4090. Every GPU is enterprise-grade with dedicated VRAM, bare-metal VM access, SSH root, and a dedicated IP. Multi-GPU configurations up to 8x per node are available for most SKUs with NVLink and InfiniBand interconnects.

Question 6

How quickly can I deploy a GPU instance?

Accepted Answer

Less than 2 minutes from click to SSH, with a dedicated IP. No approval queue, no support tickets, no warm-up charges.

Question 7

How does Spheron pricing compare to hyperscalers?

Accepted Answer

Spheron is typically 50-70% cheaper than AWS, Azure, and GCP on-demand rates for the same GPU SKU. For example, an H200 on AWS (p5e) is roughly $4.98/hr and on GCP (a3-ultragpu) is $10.87/hr, against Spheron's on-demand rate of about $4.54/hr with per-minute billing. The savings come from sourcing capacity directly from vetted data center partners rather than through the hyperscaler retail margin.

Question 8

Does Spheron provide bare-metal GPU access?

Accepted Answer

Yes. Spheron provides full VM and bare-metal GPU access with root SSH access and dedicated IPs. You get 100% of the hardware performance with no hypervisor overhead and no noisy neighbors, which matters for low-latency inference, tight-loop training, and profiling work.

GPU	VRAM	Architecture	Best for	Starting at
R100	288 GB	Rubin	Trillion-parameter FP4 inference, NVL72 rack-scale training	Ships H2 2026
GB300	288 GB	Blackwell Ultra	NVL72 rack-scale training, trillion-parameter inference	Custom quote
GB200	192 GB	Blackwell	NVL72 inference clusters, multi-trillion-parameter serving	Custom quote
RTX 4090	24 GB	Ada Lovelace	Dev, experimentation, small-model inference	$0.53/hr
RTX 5090	32 GB	Blackwell	Budget inference, prototyping, single-GPU dev	$0.86/hr
L40S	48 GB	Ada Lovelace	Inference serving, video/vision, rendering	$0.96/hr
A100	80 GB	Ampere	Fine-tuning, mid-scale training, stable inference	$1.48/hr
H100	80 GB	Hopper	LLM training, HPC, large-scale inference	$2.01/hr
RTX PRO 6000	96 GB	Blackwell	Production inference, rendering, visual workloads	$2.29/hr
GH200	96 GB	Grace Hopper	CPU-GPU coherent workloads, graph AI, vector search	$3.02/hr
H200	141 GB	Hopper	Long-context LLM inference, 70B+ model serving	$4.22/hr
B200	192 GB	Blackwell	Large-model training, FP4/FP8 inference	$9.36/hr
B300	262 GB	Blackwell Ultra	Frontier training, trillion-parameter models	$10.21/hr

GPU Rental 2026: Rent NVIDIA H100, H200, B200, B300, A100 & RTX 5090 GPUs from $0.53/hr
