AWS H100 Pricing 2026: P5 Instance Cost vs Spheron and Neoclouds

Written by Mitrasish, Co-founder | May 21, 2026

AWS cut P5 instance prices by 44% in June 2025, which made headlines. But $6.88/hr per H100 SXM5 is still the ceiling, not the floor. That's the on-demand rate for a p5.48xlarge divided across its 8 GPUs ($55.04/hr total). Neoclouds offer the same NVIDIA hardware starting under $2/hr per GPU. For the full cross-provider picture, see our GPU cloud pricing comparison. For the hidden billing surprises beyond EC2 line items, see our AWS hidden cost breakdown.

This post breaks down every P5 variant, every billing mode, and does the math on when, if ever, AWS commitment discounts close the gap.

AWS P5 Instance Family Overview

AWS offers three P5 variants. They share the H100 architecture but differ on memory capacity and networking bandwidth:

| Instance | GPU Model | GPU Count | GPU VRAM | Host RAM | Networking | Regions |
|---|---|---|---|---|---|---|
| p5.48xlarge | H100 SXM5 | 8 | 80 GB HBM3 | 2 TB | 400 Gbps EFA | us-east-1, us-west-2, select others |
| p5e.48xlarge | H100e | 8 | 192 GB HBM3e | 2 TB | 400 Gbps EFA | us-east-1 (limited) |
| p5en.48xlarge | H100e | 8 | 192 GB HBM3e | 2 TB | 800 Gbps+ EFA | us-east-1 (limited) |

All three require a service quota increase before you can launch them. New AWS accounts have a default P instance quota of 0 vCPUs. Each p5.48xlarge consumes 192 vCPUs, so the quota request has to cover at least that before anything runs. p5e and p5en are available in fewer regions and are harder to get approved. Budget 3-7 business days for quota approval with a written business justification.
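
The quota arithmetic above can be sketched in a few lines. This is a minimal illustration, not an AWS API call: the function names are hypothetical, and the only input taken from the text is the 192 vCPUs per p5.48xlarge.

```python
VCPUS_PER_P5_48XLARGE = 192  # per-instance vCPU count quoted in the text


def required_quota_vcpus(instances: int,
                         vcpus_per_instance: int = VCPUS_PER_P5_48XLARGE) -> int:
    """Minimum P-instance vCPU quota needed to run `instances` at the same time."""
    if instances < 0:
        raise ValueError("instances must be non-negative")
    return instances * vcpus_per_instance


def launchable_instances(quota_vcpus: int,
                         vcpus_per_instance: int = VCPUS_PER_P5_48XLARGE) -> int:
    """Whole instances that an approved quota can run concurrently."""
    return quota_vcpus // vcpus_per_instance


if __name__ == "__main__":
    print(required_quota_vcpus(4))   # 768 vCPUs for a 4-node cluster
    print(launchable_instances(0))   # 0: a new account cannot launch any P5
```

Requesting the quota for the full cluster size up front avoids a second approval cycle when you scale from one node to several.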

The H100e in the p5e and p5en variants has 192 GB of HBM3e per GPU, more than twice the VRAM of the standard H100 SXM5. This matters when a model's weights exceed 80 GB, which at BF16 (2 bytes per parameter) is roughly 40B+ parameters. If your model fits comfortably in 80 GB, p5e offers no compute benefit and costs more.
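
To check whether a model's weights fit in a given amount of VRAM, a rough weights-only estimate is parameters times bytes per parameter. The sketch below assumes 1 GB = 10^9 bytes and ignores activations, KV cache, and optimizer state, all of which add to the real footprint. The function names and the dtype table are illustrative.

```python
# Bytes per parameter for common dtypes.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "int8": 1}


def weight_gb(params_billion: float, dtype: str = "bf16") -> float:
    """Weights-only footprint in GB (billions of params x bytes per param)."""
    return params_billion * BYTES_PER_PARAM[dtype]


def weights_fit(params_billion: float, vram_gb: float, dtype: str = "bf16") -> bool:
    """True if the weights alone fit in `vram_gb` (no headroom for anything else)."""
    return weight_gb(params_billion, dtype) <= vram_gb


if __name__ == "__main__":
    print(weight_gb(70))          # 140.0 GB at BF16
    print(weights_fit(70, 80))    # False: does not fit in one 80 GB GPU
    print(weights_fit(70, 192))   # True: fits in one 192 GB GPU
```

Sharding across the 8 GPUs in an instance changes the per-GPU number, so treat this as a first filter rather than a sizing tool.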

AWS H100 On-Demand Pricing Per Hour

AWS publishes p5.48xlarge list pricing. P5e and p5en are available but AWS does not always surface their rates directly on standard EC2 pricing pages; contact AWS sales or check the EC2 console directly for current p5e/p5en figures.

| Instance | On-Demand $/hr | 1-Year Savings Plan $/hr | 3-Year Reserved $/hr | Per-GPU (on-demand) | Capacity Block |
|---|---|---|---|---|---|
| p5.48xlarge | $55.04 | ~$38.53 | ~$24.77 | ~$6.88 | Fixed block price (not hourly) |
| p5e.48xlarge | Not listed publicly | Not listed publicly | Not listed publicly | Contact AWS | Contact AWS |
| p5en.48xlarge | Not listed publicly | Not listed publicly | Not listed publicly | Contact AWS | Contact AWS |

These are list prices for us-east-1. Other regions carry surcharges, typically 5-15% above the base rate.
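
The per-GPU figure and the regional surcharge range are simple arithmetic on the list price. A minimal sketch, using the $55.04/hr rate and the 5-15% surcharge band quoted above (the helper names are hypothetical):

```python
P5_48XLARGE_HOURLY = 55.04  # us-east-1 on-demand list price quoted in the text
GPUS_PER_INSTANCE = 8


def per_gpu_rate(instance_hourly: float, gpus: int = GPUS_PER_INSTANCE) -> float:
    """Instance price divided evenly across its GPUs."""
    return instance_hourly / gpus


def regional_range(base_hourly: float, low_pct: float = 5.0,
                   high_pct: float = 15.0) -> tuple[float, float]:
    """Low and high hourly price after applying a percentage surcharge band."""
    return (base_hourly * (1 + low_pct / 100),
            base_hourly * (1 + high_pct / 100))


if __name__ == "__main__":
    print(f"per GPU: ${per_gpu_rate(P5_48XLARGE_HOURLY):.2f}/hr")
    low, high = regional_range(P5_48XLARGE_HOURLY)
    print(f"other regions: ${low:.2f} to ${high:.2f}/hr per instance")
```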

AWS list prices change without notice. The prices above are based on 21 May 2026 and may have changed. Check the AWS EC2 pricing page for current rates.

Capacity Blocks are sold as a fixed total cost for a reserved time window (minimum 1 day), not a synthetic per-hour rate. They can sell out weeks in advance and are non-interruptible once purchased. For budgeting, treat them as fixed commitments, not flexible compute.

Spot Pricing Reality for P5 Instances

AWS P5 spot capacity exists on paper but is structurally scarce. AWS prioritizes on-demand and reserved purchasers for P5 allocation; spot pools fill only with what remains. In practice, P5 spot shows up infrequently and comes with high interruption rates due to persistent demand.

When AWS spot does appear, it discounts roughly 44% from on-demand rates, which brings a p5.48xlarge to around $30.64/hr (~$3.83 per GPU). The problem is availability and interruption. A long training run that is interrupted mid-job without checkpointing loses all of its progress. For anything over a few hours, P5 spot is not a reliable cost reduction strategy.
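
To put a number on the interruption risk, the sketch below computes the discount implied by the two quoted rates and the spend that is thrown away when a job is interrupted some number of hours after its last checkpoint. It is a simplified model: it assumes all work since the last checkpoint is lost and nothing else, and the function names are illustrative.

```python
ON_DEMAND_HOURLY = 55.04  # p5.48xlarge on-demand, from the text
SPOT_HOURLY = 30.64       # approximate spot rate, from the text


def implied_discount(on_demand: float, spot: float) -> float:
    """Fractional discount of the spot rate relative to on-demand."""
    return 1 - spot / on_demand


def wasted_spend(hourly_rate: float, hours_since_checkpoint: float) -> float:
    """Money spent on work that is lost when an interruption hits."""
    return hourly_rate * hours_since_checkpoint


if __name__ == "__main__":
    print(f"discount: {implied_discount(ON_DEMAND_HOURLY, SPOT_HOURLY):.1%}")
    # No checkpointing, interrupted 10 hours in: the whole run is lost.
    print(f"lost: ${wasted_spend(SPOT_HOURLY, 10):.2f}")
    # Checkpoint every 30 minutes: at most half an hour is lost.
    print(f"lost: ${wasted_spend(SPOT_HOURLY, 0.5):.2f}")
```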

Neocloud spot is different in practice. On Spheron, H100 SXM5 spot pricing starts at $1.66/hr per GPU, and it functions as a real billing tier with consistent availability rather than a lottery. That's a meaningful distinction for teams that need cost predictability.

Hidden Costs That Inflate the AWS P5 Bill

The $55.04/hr p5.48xlarge line item is not your actual cost. Four categories of add-ons reliably inflate AWS P5 bills:

EBS storage

P5 instances include NVMe-backed instance storage for scratch space, but your OS volume runs on EBS. A typical setup with a 500 GB root volume and a checkpoint volume adds $40-80/month in gp3 storage at $0.08/GB/month. If you keep snapshots or maintain multiple AMIs for different job types, that stacks.
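
The $40-80/month range follows directly from volume size times the gp3 rate. A minimal sketch, where the 500 GB checkpoint volume is an assumed size for illustration and snapshot and AMI storage are not modeled:

```python
GP3_PER_GB_MONTH = 0.08  # gp3 storage rate quoted in the text


def ebs_monthly_cost(volume_sizes_gb: list[float],
                     rate_per_gb_month: float = GP3_PER_GB_MONTH) -> float:
    """Monthly gp3 storage cost for a set of volumes (storage only)."""
    return sum(volume_sizes_gb) * rate_per_gb_month


if __name__ == "__main__":
    print(ebs_monthly_cost([500]))        # root volume only
    print(ebs_monthly_cost([500, 500]))   # root plus an assumed 500 GB checkpoint volume
```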

Data egress

AWS charges $0.09/GB for data leaving to the internet (first 10 TB/month). A 100 GB model checkpoint downloaded 10 times costs $90 in egress alone. For teams pulling large datasets from S3 out to other regions or external systems, egress is often a bigger line item than they expect until the first bill arrives.
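
The egress example can be reproduced with a one-line calculation. The sketch below applies the flat $0.09/GB rate quoted above; it does not model the cheaper tiers beyond the first 10 TB/month, so it overstates cost for very large transfers.

```python
EGRESS_PER_GB = 0.09  # internet egress rate for the first 10 TB/month, from the text


def egress_cost(object_gb: float, downloads: int,
                rate_per_gb: float = EGRESS_PER_GB) -> float:
    """Cost of downloading one object of `object_gb` GB `downloads` times."""
    return object_gb * downloads * rate_per_gb


if __name__ == "__main__":
    # 100 GB checkpoint pulled 10 times, as in the example above.
    print(f"${egress_cost(100, 10):.2f}")
```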

EFA and cross-AZ networking

EFA itself carries no extra charge when instances are in the same placement group and the same AZ. The cost shows up when you move data across AZs. Cross-AZ data transfer is billed at $0.01/GB in each direction, or $0.02 for every gigabyte that crosses an AZ boundary. For multi-node training runs that exchange large gradient or checkpoint payloads across AZs, or that communicate frequently with a parameter server in another AZ, this adds up over thousands of synchronization steps.
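
A rough way to size this line item is payload per synchronization times number of synchronizations times the per-GB rate. The sketch below takes the rate as an argument rather than hard-coding it, since it should come from the current AWS data transfer pricing. The payload size and step count in the usage example are hypothetical.

```python
def cross_az_cost(gb_per_sync: float, syncs: int, rate_per_gb: float) -> float:
    """Cost of moving `gb_per_sync` GB across an AZ boundary `syncs` times.

    `rate_per_gb` is the all-in charge for one GB crossing the boundary.
    """
    if gb_per_sync < 0 or syncs < 0 or rate_per_gb < 0:
        raise ValueError("inputs must be non-negative")
    return gb_per_sync * syncs * rate_per_gb


if __name__ == "__main__":
    # Hypothetical job: 50 GB exchanged per sync, 1,000 syncs,
    # at an illustrative rate of $0.02 per GB moved.
    print(f"${cross_az_cost(50, 1000, 0.02):,.2f}")
```

Keeping all nodes in one placement group and one AZ drives this term to zero, which is usually the simpler fix.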

Support and quota friction

AWS Developer support is $29/month, often the minimum needed to get timely responses on quota increase requests. Business support starts at $100/month or 3% of monthly usage, whichever is higher. Enterprise support starts at $15,000/month. These are not direct compute costs, but teams that need fast quota escalations or help resolving P5 issues pay for them. Factor them in as operational overhead.

For a broader view of hyperscaler hidden costs across all instance types, see hyperscaler hidden costs compared.

Side-by-Side Cost Table: AWS P5 vs Spheron H100 vs Other Neoclouds

The same NVIDIA H100 SXM5 hardware is available across multiple providers at very different price points:

| Provider | GPU Model | On-Demand $/hr (per GPU) | Spot $/hr (per GPU) | Egress Fee | Min Commitment |
|---|---|---|---|---|---|
| AWS p5.48xlarge | H100 SXM5 80GB | ~$6.88 | ~$3.83 (rare) | $0.09/GB | None for OD; 1-day for Capacity Blocks |
| Spheron H100 SXM5 | H100 SXM5 80GB | $2.64 | $1.66 | None | Per-minute billing |
| Lambda Labs H100 SXM | H100 SXM5 80GB | ~$2.49 | N/A | None | On-demand |
| RunPod H100 SXM | H100 SXM5 80GB | ~$2.49 | Available | None | On-demand |

For live H100 SXM5 availability on Spheron, see H100 on Spheron.

Pricing fluctuates based on GPU availability. The prices above are based on 21 May 2026 and may have changed. Check current GPU pricing for live rates.

AWS spot pricing for P5 is listed as an approximate figure and marked rare because availability is inconsistent. The per-GPU price assumes division by 8 across the p5.48xlarge instance. Lambda Labs and RunPod rates are published rates as of 21 May 2026 and may have changed.

Break-Even Math: When Does an AWS Savings Plan Win?

Running an 8x H100 training job at 720 hours per month:

| Billing Mode | $/hr for 8 GPUs | Monthly Cost (720 hrs) |
|---|---|---|
| AWS on-demand | $55.04 | $39,629 |
| AWS 1-year Savings Plan (~30% off) | ~$38.53 | ~$27,742 |
| AWS 3-year Reserved (~55% off) | ~$24.77 | ~$17,834 |
| Spheron on-demand ($2.64 x 8) | $21.12 | $15,206 |
| Spheron spot ($1.66 x 8) | $13.28 | $9,562 |

Spheron on-demand at $2.64/hr per GPU comes to $15,206/month for an 8-GPU cluster running full time. AWS 3-year reserved comes to $17,834/month. Spheron is cheaper than AWS at its deepest commitment tier with no lock-in, and Spheron spot at $1.66/hr per GPU brings the same cluster down to $9,562/month.
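
The comparison above is easy to rerun when rates change. The sketch below recomputes monthly cost from the hourly rates quoted in this post and also derives the discount AWS would need to offer off on-demand to match a given alternative. The rates are the 21 May 2026 figures from the text; the function names are illustrative.

```python
HOURS_PER_MONTH = 720

# Hourly rates for 8 GPUs, taken from the figures in this post.
HOURLY_RATES = {
    "AWS on-demand": 55.04,
    "AWS 1-year Savings Plan": 38.53,
    "AWS 3-year Reserved": 24.77,
    "Spheron on-demand": 2.64 * 8,
    "Spheron spot": 1.66 * 8,
}


def monthly_cost(hourly: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running continuously for `hours` at `hourly`."""
    return hourly * hours


def discount_to_match(aws_on_demand: float, alternative: float) -> float:
    """Fractional discount off AWS on-demand needed to equal `alternative`."""
    return 1 - alternative / aws_on_demand


if __name__ == "__main__":
    for name, rate in sorted(HOURLY_RATES.items(), key=lambda kv: kv[1]):
        print(f"{name:26s} ${rate:6.2f}/hr  ${monthly_cost(rate):>10,.0f}/month")
    needed = discount_to_match(HOURLY_RATES["AWS on-demand"],
                               HOURLY_RATES["Spheron on-demand"])
    print(f"AWS discount needed to match Spheron on-demand: {needed:.1%}")
```

At these rates the required discount is larger than the roughly 55% that the 3-year reserved tier provides, which is why no AWS commitment level closes the gap.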

The 3-year AWS reserved path only makes sense if you need AWS-specific services (SageMaker, Bedrock, EKS with EFA-optimized AMIs) and cannot replicate those workflows on standard infrastructure. For pure GPU compute at these rates, a 3-year AWS commitment never beats Spheron on-demand, regardless of time horizon.

These figures use prices from 21 May 2026. Both Spheron and AWS rates fluctuate, so run the math fresh before making any long-term commitment decision.

Migration Checklist: Moving an AWS H100 Workload to Spheron

P5 workloads use EFA and NCCL for multi-GPU communication. Most of that transfers with a few configuration changes:

  • Replace ECR image pulls with Docker Hub or a self-hosted registry (no ECR dependency needed on Spheron)
  • Update NCCL environment variables: use NCCL_SOCKET_IFNAME=eth0 instead of the EFA-specific ens interface naming from AWS
  • Remove aws-ofi-nccl plugin requirement; Spheron InfiniBand clusters use standard NCCL over TCP/RDMA
  • Switch S3 data transfer to rclone or direct HTTP downloads; no AWS SDK authentication required for public datasets
  • Update checkpoint paths from s3:// URIs to local NVMe or your own object store endpoint
  • Verify CUDA driver version parity (Spheron runs CUDA 12.x; confirm your container tag matches)
  • Test per-GPU memory allocation before assuming portability: p5e and p5en have 192 GB per GPU, standard H100 SXM5 has 80 GB; if you're migrating from p5e, verify your model and KV cache fit in 80 GB

Most containerized training jobs that work on p5.48xlarge migrate in hours, not weeks. The steps above cover the most common blockers.
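
Two of the checklist items, the NCCL environment change and the checkpoint path change, can be sketched as small helpers. This is an illustration under stated assumptions, not Spheron tooling: it assumes every `FI_*` (libfabric) variable in your environment exists only for EFA, that `eth0` is the right interface on the target host, and that `/mnt/nvme/checkpoints` is where local NVMe is mounted. Verify all three on your own instances.

```python
import posixpath
from urllib.parse import urlparse


def migrate_nccl_env(env: dict[str, str], iface: str = "eth0") -> dict[str, str]:
    """Return a copy of `env` without libfabric/EFA settings, with NCCL on `iface`."""
    # Assumption: FI_* variables (for example FI_PROVIDER=efa) are EFA-only.
    cleaned = {k: v for k, v in env.items() if not k.startswith("FI_")}
    cleaned["NCCL_SOCKET_IFNAME"] = iface
    return cleaned


def rewrite_checkpoint_path(uri: str,
                            local_root: str = "/mnt/nvme/checkpoints") -> str:
    """Map s3://bucket/key to <local_root>/bucket/key; leave other paths alone."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        return uri
    return posixpath.join(local_root, parsed.netloc, parsed.path.lstrip("/"))


if __name__ == "__main__":
    aws_env = {"FI_PROVIDER": "efa", "NCCL_SOCKET_IFNAME": "ens32",
               "NCCL_DEBUG": "INFO"}
    print(migrate_nccl_env(aws_env))
    print(rewrite_checkpoint_path("s3://my-bucket/runs/step_100.pt"))
```

Running the job once with `NCCL_DEBUG=INFO` after the change is a quick way to confirm which interface and transport NCCL actually selected.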

FAQ

What is the difference between AWS P5, P5e, and P5en instances?

p5.48xlarge ships 8x H100 SXM5 80GB GPUs with 400 Gbps EFA networking. p5e.48xlarge upgrades to H100e with 192GB HBM3e memory per GPU for larger model weights. p5en.48xlarge adds enhanced EFA bandwidth (800 Gbps+) for tightly coupled multi-node training. All three variants are available only in specific AWS regions and require service quota increases before launching.

What are AWS P5 Capacity Block minimums?

AWS Capacity Blocks for ML on P5 instances require a minimum reservation of 1 day and are sold in fixed-size blocks (typically 1, 2, 4, or 8 instances for specific durations). They must be purchased in advance and are not interruptible. Pricing is a fixed total for the block duration, not an hourly rate. Available blocks are listed in the EC2 console and can sell out weeks in advance.

How do I request AWS P5 service quota?

Open the Service Quotas console in your target region, search for "Running On-Demand P instances", and submit an increase request. AWS requires a business justification and approval typically takes 3-7 business days. New accounts have a default quota of 0 vCPUs for P instances. P5 instances use 192 vCPUs each, so a single p5.48xlarge requires a quota of at least 192.

Is AWS spot available for P5 instances?

Spot capacity for P5 instances is rarely available in practice. AWS allocates most P5 capacity to on-demand and reserved purchasers before releasing any remainder to spot pools. When spot does appear, interruption rates are high given persistent demand. Unlike neocloud providers, where spot pricing is a reliable billing tier, AWS P5 spot should not be factored into workload cost planning.

How much cheaper is Spheron H100 compared to AWS P5?

AWS p5.48xlarge lists at approximately $6.88/hr per H100 SXM5 on-demand (after the June 2025 44% price cut). Spheron H100 SXM5 starts at $2.64/hr on-demand ($1.66/hr spot). For sustained 8-GPU training workloads, the annual cost difference runs into six figures when AWS hidden costs (EBS, egress, EFA data transfer) are included.
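
The six-figure claim can be checked from compute rates alone, before any hidden costs are added. A minimal sketch, assuming the same 720 hours per month used in the break-even table and the 8-GPU hourly rates quoted in this post:

```python
HOURS_PER_YEAR = 720 * 12  # matches the 720 hrs/month used in the break-even table


def annual_difference(aws_hourly: float, alt_hourly: float,
                      hours: float = HOURS_PER_YEAR) -> float:
    """Yearly compute-only savings from running on `alt_hourly` instead of AWS."""
    return (aws_hourly - alt_hourly) * hours


if __name__ == "__main__":
    aws_on_demand = 55.04           # p5.48xlarge, 8 GPUs
    spheron_on_demand = 2.64 * 8    # 8 GPUs
    print(f"${annual_difference(aws_on_demand, spheron_on_demand):,.0f} per year")
```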


Running training jobs on AWS P5 and watching the cost compound? Spheron's H100 SXM5 instances run on the same NVIDIA hardware at a fraction of the P5 list price, with no savings plan lock-in and no egress fees.
