AWS H100 Pricing 2026: P5 Instance Cost vs Spheron and Neoclouds

AWS cut P5 instance prices by 44% in June 2025, which made headlines. But $6.88/hr per H100 SXM5 is still the ceiling, not the floor. That's the on-demand rate for a p5.48xlarge divided across its 8 GPUs ($55.04/hr total). Neoclouds offer the same NVIDIA hardware starting under $2/hr per GPU. For the full cross-provider picture, see our GPU cloud pricing comparison. For the hidden billing surprises beyond EC2 line items, see our AWS hidden cost breakdown. For a broader view of H100 pricing trends across providers, see the H100 news and pricing hub.

This post breaks down every P5 variant, every billing mode, and does the math on when, if ever, AWS commitment discounts close the gap. If you are still pre-revenue, several providers will front you the compute: the free GPU cloud credits guide ranks every 2026 program by the real H100 hours it buys.

AWS P5 Instance Family Overview

AWS offers three P5 variants. They share the H100 architecture but differ on memory capacity and networking bandwidth:

Instance	GPU Model	GPU Count	GPU VRAM	Host RAM	Networking	Regions
p5.48xlarge	H100 SXM5	8	80 GB HBM3	2 TB	400 Gbps EFA	us-east-1, us-west-2, select others
p5e.48xlarge	H100e	8	192 GB HBM3e	2 TB	400 Gbps EFA	us-east-1 (limited)
p5en.48xlarge	H100e	8	192 GB HBM3e	2 TB	800 Gbps+ EFA	us-east-1 (limited)

All three require a service quota increase before you can launch them. New AWS accounts have a default P instance quota of 0 vCPUs. Each p5.48xlarge consumes 192 vCPUs, so the quota request has to cover at least that before anything runs. p5e and p5en are available in fewer regions and are harder to get approved. Budget 3-7 business days for quota approval with a written business justification.
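The quota arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming the 192 vCPU figure quoted in the text; the function names are hypothetical and not part of any AWS SDK.

```python
# vCPUs consumed by one p5.48xlarge, as stated in the text above.
VCPUS_PER_P5_48XLARGE = 192


def required_p_quota(instance_count: int,
                     vcpus_per_instance: int = VCPUS_PER_P5_48XLARGE) -> int:
    """Return the vCPU quota needed to run `instance_count` instances at once."""
    if instance_count < 0:
        raise ValueError("instance_count must be non-negative")
    return instance_count * vcpus_per_instance


def max_instances(quota_vcpus: int,
                  vcpus_per_instance: int = VCPUS_PER_P5_48XLARGE) -> int:
    """Return how many whole instances fit inside an approved vCPU quota."""
    return quota_vcpus // vcpus_per_instance


if __name__ == "__main__":
    print(required_p_quota(1))  # 192
    print(required_p_quota(4))  # 768
    print(max_instances(500))   # 2
```

A quota of 500 vCPUs still only launches two instances, so it is worth requesting a multiple of 192.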

The H100e in p5e and p5en variants has 192 GB HBM3e per GPU, more than twice the VRAM of the standard H100 SXM5. This matters for models whose weights exceed 80 GB per GPU (roughly 40B+ parameters at BF16, which stores 2 bytes per parameter). If your model fits comfortably in 80 GB, p5e offers no compute benefit and costs more.
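A rough way to check whether a model's weights fit on one GPU is to multiply parameter count by bytes per parameter. The sketch below assumes weights only (no KV cache, activations, or optimizer state) and no sharding across GPUs; the helper names and the optional `headroom` parameter are illustrative assumptions.

```python
# Bytes stored per parameter for common numeric formats.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def weights_gb(num_params: float, dtype: str = "bf16") -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_on_gpu(num_params: float, dtype: str = "bf16",
                vram_gb: float = 80.0, headroom: float = 0.0) -> bool:
    """True if the weights alone fit in `vram_gb`, leaving `headroom` free.

    `headroom` is a fraction of VRAM reserved for everything that is not
    weights; the right value depends on the workload and is an assumption.
    """
    return weights_gb(num_params, dtype) <= vram_gb * (1 - headroom)


if __name__ == "__main__":
    print(weights_gb(40e9))                  # 80.0
    print(fits_on_gpu(30e9, headroom=0.2))   # True
    print(fits_on_gpu(40e9, headroom=0.2))   # False
```

Real deployments usually shard large models across the 8 GPUs in an instance, so treat this as a per-GPU sanity check rather than a sizing tool.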

AWS H100 On-Demand Pricing Per Hour

AWS publishes p5.48xlarge list pricing. P5e and p5en are available but AWS does not always surface their rates directly on standard EC2 pricing pages; contact AWS sales or check the EC2 console directly for current p5e/p5en figures.

Instance	On-Demand $/hr	1-Year Savings Plan $/hr	3-Year Reserved $/hr	Per-GPU (on-demand)	Capacity Block
p5.48xlarge	$55.04	~$38.53	~$24.77	~$6.88	Fixed block price (not hourly)
p5e.48xlarge	Not listed publicly	Not listed publicly	Not listed publicly	Contact AWS	Contact AWS
p5en.48xlarge	Not listed publicly	Not listed publicly	Not listed publicly	Contact AWS	Contact AWS

These are list prices for us-east-1. Other regions carry surcharges, typically 5-15% above the base rate.
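The per-GPU figure is simply the instance rate divided by 8, with any regional surcharge applied first. A minimal sketch, assuming the us-east-1 list price above; the 10% surcharge in the example is a hypothetical value inside the 5-15% range mentioned, not a quoted AWS rate.

```python
def per_gpu_rate(instance_rate: float, gpus: int = 8,
                 regional_surcharge: float = 0.0) -> float:
    """Hourly cost per GPU for an instance billed at `instance_rate` USD/hr.

    `regional_surcharge` is a fraction (0.10 means 10% above the base rate).
    """
    if gpus <= 0:
        raise ValueError("gpus must be positive")
    return instance_rate * (1 + regional_surcharge) / gpus


if __name__ == "__main__":
    # us-east-1 list price from the table above.
    print(f"{per_gpu_rate(55.04):.2f}")
    # Same instance with a hypothetical 10% regional surcharge.
    print(f"{per_gpu_rate(55.04, regional_surcharge=0.10):.2f}")
```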

AWS list prices change without notice. The prices above reflect rates as of 21 May 2026 and may have changed. Check the AWS EC2 pricing page for current rates.

Capacity Blocks are sold as a fixed total cost for a reserved time window (minimum 1 day), not a synthetic per-hour rate. They can sell out weeks in advance and are non-interruptible once purchased. For budgeting, treat them as fixed commitments, not flexible compute.
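Because a Capacity Block is a fixed total, comparing it with hourly offers means converting it to an effective per-GPU-hour figure yourself. The sketch below shows that conversion; the $1,000 block price in the example is purely hypothetical and is not an AWS quote.

```python
def effective_gpu_hour_rate(block_total_usd: float, instances: int,
                            hours: float, gpus_per_instance: int = 8) -> float:
    """Convert a fixed block price into an effective USD per GPU-hour.

    Assumes every GPU is used for the whole window; idle time raises the
    real cost per useful hour.
    """
    gpu_hours = instances * hours * gpus_per_instance
    if gpu_hours <= 0:
        raise ValueError("block must cover a positive number of GPU-hours")
    return block_total_usd / gpu_hours


if __name__ == "__main__":
    # Hypothetical: one 8-GPU instance reserved for a 24-hour window.
    print(f"{effective_gpu_hour_rate(1000.0, instances=1, hours=24):.2f}")
```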

Spot Pricing Reality for P5 Instances

AWS P5 spot capacity exists on paper but is structurally scarce. AWS prioritizes on-demand and reserved purchasers for P5 allocation; spot pools fill only with what remains. In practice, P5 spot shows up infrequently and comes with high interruption rates due to persistent demand.

AWS spot, when it does appear, discounts roughly 44% from on-demand rates. That brings a p5.48xlarge to around $30.64/hr (~$3.83 per GPU). The problem is availability. A long training run that is interrupted mid-job without checkpointing loses all its work. For anything over a few hours, P5 spot is not a reliable cost reduction strategy.
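One way to reason about interruptions is to price spot per useful hour rather than per billed hour. The model below is a deliberately simple assumption: a fixed fraction of billed time is lost to interrupted, un-checkpointed work. The function names are illustrative.

```python
def cost_per_useful_hour(hourly_rate: float, lost_fraction: float) -> float:
    """Effective cost per hour of retained work when `lost_fraction` of
    billed time is thrown away by interruptions."""
    if not 0 <= lost_fraction < 1:
        raise ValueError("lost_fraction must be in [0, 1)")
    return hourly_rate / (1 - lost_fraction)


def break_even_lost_fraction(spot_rate: float, on_demand_rate: float) -> float:
    """Fraction of lost work at which spot costs the same as on-demand."""
    return 1 - spot_rate / on_demand_rate


if __name__ == "__main__":
    spot, on_demand = 30.64, 55.04  # p5.48xlarge figures quoted above
    print(f"{cost_per_useful_hour(spot, 0.25):.2f}")
    print(f"{break_even_lost_fraction(spot, on_demand):.3f}")
```

Under this model the break-even loss fraction equals the spot discount itself, so frequent restarts without checkpoints can erase the saving entirely.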

Neocloud spot is different in practice. On Spheron, H100 SXM5 spot pricing starts at $1.66/hr per GPU, and it functions as a real billing tier with consistent availability rather than a lottery. That's a meaningful distinction for teams that need cost predictability. If you want to skip the AWS quota dance entirely, you can rent GPUs on Spheron with the same H100 hardware and no service ticket.

Hidden Costs That Inflate the AWS P5 Bill

The $55.04/hr p5.48xlarge line item is not your actual cost. The same billing patterns apply to AWS's mid-tier Blackwell inference instances; the EC2 G7 pricing breakdown models the same EBS, egress, and support costs for inference workloads on G7 with RTX PRO 4500. On P5, four categories of add-ons reliably inflate the bill:

EBS storage

P5 instances include NVMe-backed instance storage for scratch space, but your OS volume runs on EBS. A typical setup with a 500 GB root volume and a checkpoint volume adds $40-80/month in gp3 storage at $0.08/GB/month. If you keep snapshots or maintain multiple AMIs for different job types, that stacks.

Data egress

AWS charges $0.09/GB for data leaving to the internet (first 10 TB/month). A 100 GB model checkpoint downloaded 10 times costs $90 in egress alone. For teams pulling large datasets from S3 out to other regions or external systems, egress is often a bigger line item than they expect until the first bill arrives.
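The checkpoint example works out as size times downloads times rate. A minimal sketch, assuming the $0.09/GB rate quoted above; it does not model the cheaper tiers beyond the first 10 TB/month, so it overstates cost for very large transfers.

```python
# Internet egress rate quoted in the text for the first 10 TB/month.
EGRESS_USD_PER_GB = 0.09


def egress_cost_usd(size_gb: float, downloads: int,
                    rate_per_gb: float = EGRESS_USD_PER_GB) -> float:
    """Cost of downloading an object of `size_gb` a total of `downloads` times."""
    if size_gb < 0 or downloads < 0:
        raise ValueError("size_gb and downloads must be non-negative")
    return size_gb * downloads * rate_per_gb


if __name__ == "__main__":
    # The 100 GB checkpoint pulled 10 times from the example above.
    print(f"{egress_cost_usd(100, 10):.2f}")
```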

EFA and cross-AZ networking

EFA itself carries no extra charge when instances are in the same placement group and the same AZ. The cost shows up when you move data across AZs. Cross-AZ data transfer is billed at $0.01/GB in each direction, so every gigabyte exchanged between AZs effectively costs $0.02. For multi-node training runs that sync large gradients or checkpoints across AZs on every step, this adds up quickly.

Support and quota friction

AWS Developer support is $29/month, often the minimum needed to get timely responses on quota increase requests. Business support starts at $100/month or 3% of monthly usage, whichever is higher. Enterprise support starts at $15,000/month. These are not direct compute costs, but teams that need fast quota escalations or help resolving P5 issues pay for them. Factor them in as operational overhead.

For a broader view of hyperscaler hidden costs across all instance types, see hyperscaler hidden costs compared.

Side-by-Side Cost Table: AWS P5 vs Spheron H100 vs Other Neoclouds

The same NVIDIA H100 SXM5 hardware is available across multiple providers at very different price points:

Provider	GPU Model	On-Demand $/hr (per GPU)	Spot $/hr (per GPU)	Egress Fee	Min Commitment
AWS p5.48xlarge	H100 SXM5 80GB	~$6.88	~$3.83 (rare)	$0.09/GB	None for OD; 1-day for Capacity Blocks
Spheron H100 SXM5	H100 SXM5 80GB	$2.64	$1.66	None	Per-minute billing
Lambda Labs H100 SXM	H100 SXM5 80GB	~$2.49	N/A	None	On-demand
RunPod H100 SXM	H100 SXM5 80GB	~$3.29	Available	None	On-demand

For live H100 SXM5 availability on Spheron, see H100 on Spheron.

Pricing fluctuates based on GPU availability. The prices above reflect rates as of 21 May 2026 and may have changed. Check current GPU pricing for live rates.

AWS spot pricing for P5 is listed as an approximate figure because availability is inconsistent. Per-GPU AWS prices assume the p5.48xlarge rate divided across its 8 GPUs. Lambda Labs and RunPod figures are published rates as of 21 May 2026 and may have changed. Oracle Cloud charges a flat $10/hr per H100 GPU on its BM.GPU.H100.8 bare metal shape; see the OCI H100 pricing breakdown for the full analysis including preemptible rates and Universal Credits math.
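To read the comparison as relative savings rather than raw rates, the on-demand column can be normalized against AWS. This sketch copies the per-GPU figures from the table above; the helper names are illustrative.

```python
# Per-GPU on-demand rates (USD/hr) copied from the comparison table above.
ON_DEMAND_PER_GPU = {
    "AWS p5.48xlarge": 6.88,
    "Spheron H100 SXM5": 2.64,
    "Lambda Labs H100 SXM": 2.49,
    "RunPod H100 SXM": 3.29,
}


def savings_vs_baseline(rates: dict, baseline: str) -> dict:
    """Fractional saving of each provider relative to `baseline`."""
    base = rates[baseline]
    return {name: 1 - rate / base
            for name, rate in rates.items() if name != baseline}


def cheapest_first(rates: dict) -> list:
    """Provider names ordered from lowest to highest hourly rate."""
    return sorted(rates, key=rates.get)


if __name__ == "__main__":
    for name, saving in savings_vs_baseline(ON_DEMAND_PER_GPU,
                                            "AWS p5.48xlarge").items():
        print(f"{name}: {saving:.1%} below AWS on-demand")
```

This compares list rates only; egress, storage, and commitment terms from the other columns are not factored in.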

Break-Even Math: When Does an AWS Savings Plan Win?

Running an 8x H100 training job at 720 hours per month:

Billing Mode	$/hr for 8 GPUs	Monthly Cost (720 hrs)
AWS on-demand	$55.04	$39,629
AWS 1-year Savings Plan (~30% off)	~$38.53	~$27,742
AWS 3-year Reserved (~55% off)	~$24.77	~$17,834
Spheron on-demand ($2.64 x 8)	$21.12	$15,206
Spheron spot ($1.66 x 8)	$13.28	$9,562

Spheron on-demand at $2.64/hr per GPU comes to $15,206/month for an 8-GPU cluster running full time. AWS 3-year reserved comes to $17,834/month. Spheron is cheaper than AWS at its deepest commitment tier with no lock-in, and Spheron spot at $1.66/hr per GPU brings the same cluster down to $9,562/month.
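The monthly figures come from multiplying each 8-GPU hourly rate by 720 hours. A short sketch for rerunning the comparison with fresh prices, using the rates from the table as of 21 May 2026; the names are illustrative.

```python
HOURS_PER_MONTH = 720  # full-time usage assumed in the table above

# Hourly rates for 8 GPUs (USD), taken from the table above.
HOURLY_RATE_8_GPUS = {
    "AWS on-demand": 55.04,
    "AWS 1-year Savings Plan": 38.53,
    "AWS 3-year Reserved": 24.77,
    "Spheron on-demand": 2.64 * 8,
    "Spheron spot": 1.66 * 8,
}


def monthly_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running at `hourly_rate` for `hours` in a month."""
    return hourly_rate * hours


def annual_gap(rate_a: float, rate_b: float,
               hours: float = HOURS_PER_MONTH) -> float:
    """Yearly cost difference between two hourly rates (a minus b)."""
    return (monthly_cost(rate_a, hours) - monthly_cost(rate_b, hours)) * 12


if __name__ == "__main__":
    for mode, rate in HOURLY_RATE_8_GPUS.items():
        print(f"{mode}: ${monthly_cost(rate):,.0f}/month")
```

Swapping in a lower `hours` value shows how the gap shrinks for part-time workloads, where a multi-year commitment is billed whether or not the GPUs are busy.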

The 3-year AWS reserved path only makes sense if you need AWS-specific services (SageMaker, Bedrock, EKS with EFA-optimized AMIs) and cannot replicate those workflows on standard infrastructure. For pure GPU compute at the rates above, the 3-year AWS commitment never undercuts Spheron on-demand, no matter how long the workload runs.

These figures use prices from 21 May 2026. Both Spheron and AWS rates fluctuate, so run the math fresh before making any long-term commitment decision.

Migration Checklist: Moving an AWS H100 Workload to Spheron

P5 workloads use EFA and NCCL for multi-GPU communication. Most of that transfers with a few configuration changes:

Replace ECR image pulls with Docker Hub or a self-hosted registry (no ECR dependency needed on Spheron)
Update NCCL environment variables: use NCCL_SOCKET_IFNAME=eth0 instead of the EFA-specific ens interface naming from AWS
Remove aws-ofi-nccl plugin requirement; Spheron InfiniBand clusters use standard NCCL over TCP/RDMA
Switch S3 data transfer to rclone or direct HTTP downloads; no AWS SDK authentication required for public datasets
Update checkpoint paths from s3:// URIs to local NVMe or your own object store endpoint
Verify CUDA driver version parity (Spheron runs CUDA 12.x; confirm your container tag matches)
Test per-GPU memory allocation before assuming portability: p5e and p5en have 192 GB per GPU, standard H100 SXM5 has 80 GB; if you're migrating from p5e, verify your model and KV cache fit in 80 GB
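Two of the checklist items, the NCCL interface change and the checkpoint path update, lend themselves to a small helper. This is a sketch under stated assumptions: `FI_PROVIDER` and `FI_EFA_USE_DEVICE_RDMA` are libfabric settings often exported on EFA setups and are used here as an illustrative, non-exhaustive list; `eth0` and `/mnt/nvme` are placeholders to confirm on the target host; the function names are hypothetical.

```python
import posixpath
from urllib.parse import urlparse

# EFA/libfabric variables to drop when leaving AWS (assumed, not exhaustive).
EFA_ENV_VARS = ("FI_PROVIDER", "FI_EFA_USE_DEVICE_RDMA")


def migrated_env(env: dict, iface: str = "eth0") -> dict:
    """Return a copy of `env` without EFA settings and with
    NCCL_SOCKET_IFNAME pointing at `iface`."""
    new_env = {k: v for k, v in env.items() if k not in EFA_ENV_VARS}
    new_env["NCCL_SOCKET_IFNAME"] = iface
    return new_env


def localize_checkpoint(uri: str, local_root: str = "/mnt/nvme") -> str:
    """Map an s3://bucket/key URI to a path under `local_root`.

    Non-S3 paths are returned unchanged. This only rewrites the string;
    copying the data is a separate step.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        return uri
    return posixpath.join(local_root, parsed.netloc, parsed.path.lstrip("/"))


if __name__ == "__main__":
    aws_env = {"FI_PROVIDER": "efa", "NCCL_SOCKET_IFNAME": "ens32"}
    print(migrated_env(aws_env))
    print(localize_checkpoint("s3://my-bucket/run1/step_1000.pt"))
```

Passing the result of `migrated_env(dict(os.environ))` to the training launcher keeps the change scoped to one process instead of editing the image.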

Most containerized training jobs that work on p5.48xlarge migrate in hours, not weeks. The steps above cover the most common blockers.

For teams evaluating AWS Trainium 3 (EC2 Trn3) as a cheaper AWS alternative to P5 H100 instances, the migration is a different story entirely: Trainium requires the Neuron SDK and is incompatible with vLLM, TensorRT-LLM, and custom CUDA kernels. See the Trainium 3 vs H200 cost-per-token breakdown for the full cost math, including the Neuron SDK migration tax.

FAQ

What is the difference between AWS P5, P5e, and P5en instances?

p5.48xlarge ships 8x H100 SXM5 80GB GPUs with 400 Gbps EFA networking. p5e.48xlarge upgrades to H100e with 192GB HBM3e memory per GPU for larger model weights. p5en.48xlarge adds enhanced EFA bandwidth (800 Gbps+) for tightly coupled multi-node training. All three variants are available only in specific AWS regions and require service quota increases before launching.

What are AWS P5 Capacity Block minimums?

AWS Capacity Blocks for ML on P5 instances require a minimum reservation of 1 day and are sold in fixed-size blocks (typically 1, 2, 4, or 8 instances for specific durations). They must be purchased in advance and are not interruptible. Pricing is a fixed total for the block duration, not an hourly rate. Available blocks are listed in the EC2 console and can sell out weeks in advance.

How do I request AWS P5 service quota?

Open the Service Quotas console in your target region, search for "Running On-Demand P instances", and submit an increase request. AWS requires a business justification and approval typically takes 3-7 business days. New accounts have a default quota of 0 vCPUs for P instances. P5 instances use 192 vCPUs each, so a single p5.48xlarge requires a quota of at least 192.

Is AWS spot available for P5 instances?

Spot capacity for P5 instances is rarely available in practice. AWS allocates most P5 capacity to on-demand and reserved purchasers before releasing any remainder to spot pools. When spot does appear, interruption rates are high given persistent demand. Unlike neocloud providers, where spot pricing is a reliable billing tier, AWS P5 spot should not be factored into workload cost planning.

How much cheaper is Spheron H100 compared to AWS P5?

AWS p5.48xlarge lists at approximately $6.88/hr per H100 SXM5 on-demand (after the June 2025 44% price cut). Spheron H100 SXM5 starts at $2.64/hr on-demand ($1.66/hr spot). For sustained 8-GPU training workloads, the annual cost difference runs into six figures when AWS hidden costs (EBS, egress, EFA data transfer) are included.

Running training jobs on AWS P5 and watching the cost compound? Spheron's H100 SXM5 instances run on the same NVIDIA hardware at a fraction of the P5 list price, with no savings plan lock-in and no egress fees.
