Google Cloud A3 H100 Pricing 2026: GCP Per-Hour Cost vs Spheron Breakdown

TL;DR: GCP A3 H100 vs Spheron H100 (22 May 2026)

Metric	GCP A3 High (H100)	Spheron H100 SXM5
On-demand $/hr per GPU	~$10.98	$2.64
Spot $/hr per GPU	~$3.69	$1.66
1yr CUD $/hr per GPU	~$8.78	N/A
3yr CUD $/hr per GPU	~$5.93	N/A
Monthly TCO (8xH100, 720 hr)	~$63,245	~$15,206 (on-demand) / ~$9,562 (spot)
Egress fees	$0.08-$0.12/GB	None
Persistent disk	Separate billing	Included
Minimum commitment	None (on-demand)	None

Pricing fluctuates based on GPU availability. The prices above reflect rates as of 22 May 2026 and may have changed. Check current GPU pricing for live rates. GCP rates are from the public GCP pricing page as of the same date; verify with the GCP pricing calculator before budgeting.

GCP's A3 instances give you H100 SXM5 GPUs with the full Google Cloud stack attached. If your team is already deep in Vertex AI or BigQuery ML, that stack has real value. If not, you're paying roughly four times Spheron's on-demand rate (over six times its spot rate) for the same CUDA hardware. This post covers every pricing tier for GCP A3, including the costs that don't appear in the headline hourly rate, and compares them against Spheron's current H100 rates.

For a broader comparison across 5+ providers, see our GPU cloud pricing comparison 2026.

GCP A3 Instance Family Overview

The A3 series is Google Cloud's H100-based GPU family. Google's newer Blackwell line, the A4 series (8x NVIDIA B200), is a separate machine family with its own pricing and reservation model; see our Google Cloud A4 B200 pricing breakdown if you're deciding between the two generations. Three A3 configurations exist:

Instance	GPUs	vCPUs	RAM	Local SSD	Network	Primary Use
a3-highgpu-8g	8x H100 SXM5 80GB	208	1.872 TB	6 TB	200 Gbps	General large model training
a3-megagpu-8g	8x H100 SXM5 80GB	208+	Higher	6 TB	200 Gbps	Memory-intensive workloads
a3-edgegpu-8g	8x H100 SXM5 80GB	208	1.872 TB	6 TB	200 Gbps	Low-latency edge inference

A3 High has the broadest regional footprint, available in us-central1, us-east4, europe-west4, asia-southeast1, and additional regions. A3 Mega and A3 Edge have more limited availability. Pricing for A3 Mega is not published on the public GCP pricing page and requires a quota request or direct sales contact. A3 Edge pricing similarly varies by region and availability tier.

For a complete H100 SXM5 spec breakdown including memory bandwidth, Tensor Core throughput, and MIG profiles, see our NVIDIA H100 specs guide.

A3 H100 Per-Hour Pricing

The table below shows a3-highgpu-8g pricing per GPU (full node price divided by 8) in us-central1. All prices in USD.

Pricing Model	$/hr per GPU (us-central1)
On-demand	~$10.98
Spot (preemptible)	~$3.69
1yr CUD (approximate)	~$8.78
3yr CUD (approximate)	~$5.93

The CUD figures above apply standard GCP discount tiers (approximately 20% for 1yr, 46% for 3yr). Actual committed-use pricing varies by negotiation and region. Verify with the GCP pricing calculator before signing any commitment.
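The CUD rows can be reproduced from the on-demand rate with a few lines of arithmetic. The sketch below assumes the approximate discount fractions quoted above (20% and 46%); the names `ON_DEMAND_PER_GPU`, `CUD_DISCOUNTS`, and `cud_rate` are illustrative, and negotiated contracts will not follow this calculation exactly.

```python
# Derive approximate committed-use rates from the on-demand list price.
# The discount fractions are the approximate tiers quoted in the text,
# not contract terms.
ON_DEMAND_PER_GPU = 10.98  # USD/hr, a3-highgpu-8g in us-central1
CUD_DISCOUNTS = {"1yr": 0.20, "3yr": 0.46}


def cud_rate(on_demand: float, discount: float) -> float:
    """Return the discounted hourly rate, rounded to cents."""
    return round(on_demand * (1.0 - discount), 2)


cud_rates = {term: cud_rate(ON_DEMAND_PER_GPU, d)
             for term, d in CUD_DISCOUNTS.items()}

for term, rate in cud_rates.items():
    print(f"{term} CUD: ~${rate:.2f}/hr per GPU")
```

Swapping in a different on-demand rate or a negotiated discount fraction shows how far a quote sits from list price.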

Regional pricing for europe-west4 and asia-southeast1 typically runs 7-15% above us-central1 rates. us-east4 is close to parity with us-central1.

A3 Mega pricing is not publicly listed. A3 Edge pricing varies by region and availability tier and should be confirmed through a quota request or GCP sales.

What GCP A3 Pricing Includes

The hourly rate covers the compute instance only. Here is what each a3-highgpu-8g provides:

Component	Specification
GPUs	8x NVIDIA H100 SXM5 80GB
GPU interconnect	NVLink 900 GB/s bidirectional
vCPUs	208
Host RAM	1.872 TB
Local SSD	6 TB (ephemeral)
Network	200 Gbps

What the hourly rate does NOT cover:

Persistent disk: Standard HDD is approximately $0.04/GB/month, SSD is approximately $0.17/GB/month. The included 6 TB local SSD is ephemeral: data is deleted when the instance stops or is preempted.
Data egress: $0.08-$0.12/GB for data leaving the GCP VPC, and $0.01-$0.08/GB for cross-region transfers within GCP.
Managed services: Vertex AI, Managed Notebooks, and other Google-managed ML services bill separately on top of the compute rate.
Cloud NAT and IP addresses: Outbound NAT gateway usage and reserved static IP addresses carry additional fees.

Hidden Costs in GCP A3 Billing

The sticker price understates the real cost of typical AI workloads. Here is where the gap grows:

Data egress fees

GCP charges $0.08-$0.12/GB for data leaving the VPC. Training jobs whose large model checkpoints move between nodes, regions, or external storage accumulate egress charges fast. A one-time 2TB (2,048 GB) dataset transfer out of GCP costs $164-$246 depending on destination.
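The dollar figure for a 2TB transfer shifts slightly depending on whether a terabyte is counted as 1,000 GB or 1,024 GB. The sketch below works through both conventions at the quoted per-GB range; `EGRESS_RATES` and `egress_cost` are illustrative names, and which convention GCP's billing applies is something to confirm in the pricing calculator rather than take from this example.

```python
# Egress cost for a one-time transfer at the quoted $0.08-$0.12/GB range.
EGRESS_RATES = (0.08, 0.12)  # USD per GB, low and high end


def egress_cost(gigabytes: float, rate_per_gb: float) -> float:
    """Return the transfer cost in USD, rounded to cents."""
    return round(gigabytes * rate_per_gb, 2)


for label, gb in (("2 TB as 2,000 GB", 2_000), ("2 TB as 2,048 GB", 2_048)):
    low, high = (egress_cost(gb, rate) for rate in EGRESS_RATES)
    print(f"{label}: ${low:,.2f} to ${high:,.2f}")
```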

Persistent disk

Local SSD is ephemeral. Stop the instance and the data is gone. Any persistent checkpoint storage requires attaching a Persistent Disk. Standard PD at $0.04/GB/month means 10TB of checkpoint storage costs $400/month on top of the GPU rate. SSD PD at $0.17/GB/month for 10TB is $1,700/month extra.

Sustained-use vs committed-use discounts

GCP applies sustained-use discounts (SUDs) automatically to eligible on-demand instances that run more than 25% of the month, rising to 30% at full monthly runtime. SUDs and CUDs are mutually exclusive: switching to a committed-use contract removes the automatic SUD you were already receiving on on-demand usage. Teams often miscalculate the CUD breakeven because they forget the SUD they give up when committing. SUD eligibility varies by machine family, so confirm that it applies to your A3 instances before including it in a breakeven estimate.
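To see why the breakeven is easy to get wrong, it helps to compare monthly bills rather than hourly rates, since a commitment is billed for every hour whether the GPU is used or not. The sketch below is a rough model under two loud assumptions: that SUD applies to the instance at all, and that it follows the incremental schedule GCP documents for SUD-eligible machine types (each successive quarter of the month billed at 100%, 80%, 60%, and 40% of base). All function and constant names are hypothetical.

```python
# ASSUMPTION: SUD applies and uses incremental quarterly tiers of
# 100%, 80%, 60%, 40% of the base rate. Verify eligibility for A3.
SUD_TIER_MULTIPLIERS = (1.0, 0.8, 0.6, 0.4)
HOURS_PER_MONTH = 720


def sud_billed_fraction(usage_fraction: float) -> float:
    """Fraction of a full month's on-demand price billed after SUD."""
    if not 0.0 <= usage_fraction <= 1.0:
        raise ValueError("usage_fraction must be between 0 and 1")
    billed = 0.0
    for tier, multiplier in enumerate(SUD_TIER_MULTIPLIERS):
        # Portion of the month's usage that falls inside this quarter.
        used_in_tier = min(max(usage_fraction - 0.25 * tier, 0.0), 0.25)
        billed += used_in_tier * multiplier
    return billed


def on_demand_monthly(rate: float, usage_fraction: float) -> float:
    """Monthly on-demand cost per GPU after SUD."""
    return rate * HOURS_PER_MONTH * sud_billed_fraction(usage_fraction)


def cud_monthly(rate: float) -> float:
    """Monthly committed cost per GPU: billed for every hour, used or not."""
    return rate * HOURS_PER_MONTH


ON_DEMAND, CUD_1YR, CUD_3YR = 10.98, 8.78, 5.93  # USD/hr per GPU

for usage in (0.25, 0.50, 0.75, 1.00):
    od = on_demand_monthly(ON_DEMAND, usage)
    print(f"{usage:.0%} utilization: on-demand+SUD ${od:,.0f} | "
          f"1yr CUD ${cud_monthly(CUD_1YR):,.0f} | "
          f"3yr CUD ${cud_monthly(CUD_3YR):,.0f}")
```

Under these assumptions the on-demand column stays below the 1yr CUD at every utilization level, and the 3yr CUD only becomes cheaper somewhere between 50% and 75% utilization. If SUD does not apply, set every multiplier to 1.0 and the comparison shifts toward the commitment.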

A worked example: 500hr/month with 2TB egress and 10TB SSD storage

Cost item	GCP A3 High	Spheron On-Demand	Spheron Spot
Compute (8x H100, 500hr)	$43,920	$10,560	$6,640
Egress (2TB at $0.08/GB)	$164	$0	$0
Persistent SSD (10TB)	$1,700	$0	$0
Total	$45,784	$10,560	$6,640

Spheron's billing model: per-minute compute charges with no egress fees, and storage included during the compute session.
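The worked example can be rebuilt from its inputs, which makes it easy to substitute your own hours, egress volume, or storage size. This is a minimal sketch: the `Provider` class and `monthly_bill` function are illustrative, and the default volumes (2,048 GB of egress, 10,000 GB of SSD) are the values that reproduce the table's line items, not a statement about how either provider meters usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    rate_per_gpu_hr: float   # USD per GPU-hour
    egress_per_gb: float     # USD per GB leaving the provider
    ssd_per_gb_month: float  # USD per GB-month of persistent SSD


def monthly_bill(p: Provider, hours: int = 500, gpus: int = 8,
                 egress_gb: int = 2_048, ssd_gb: int = 10_000) -> dict:
    """Return whole-dollar line items and their sum for one month."""
    items = {
        "compute": round(p.rate_per_gpu_hr * gpus * hours),
        "egress": round(egress_gb * p.egress_per_gb),
        "storage": round(ssd_gb * p.ssd_per_gb_month),
    }
    items["total"] = sum(items.values())
    return items


PROVIDERS = [
    Provider("GCP A3 High", 10.98, 0.08, 0.17),
    Provider("Spheron on-demand", 2.64, 0.0, 0.0),
    Provider("Spheron spot", 1.66, 0.0, 0.0),
]

for p in PROVIDERS:
    print(p.name, monthly_bill(p))
```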

Region-by-Region Price Differences

A3 High availability and approximate pricing by region:

Region	Relative rate	Notes
us-central1	Base (~$10.98/hr per GPU)	Broadest availability
us-east4	~0-3% higher	Limited zones
europe-west4	~7-12% higher	Netherlands; GDPR residency option
asia-southeast1	~10-15% higher	Singapore
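Turning the relative rates into dollar ranges makes the regional gap easier to budget for. The sketch below applies the approximate uplift percentages from the table to the us-central1 base rate; `REGION_UPLIFT` and `regional_range` are illustrative names, and the outputs are estimates, not published regional prices.

```python
# Approximate per-GPU hourly ranges by region, derived from the
# us-central1 base rate and the uplift ranges in the table.
BASE_RATE = 10.98  # USD/hr per GPU, us-central1

REGION_UPLIFT = {
    "us-central1": (0.00, 0.00),
    "us-east4": (0.00, 0.03),
    "europe-west4": (0.07, 0.12),
    "asia-southeast1": (0.10, 0.15),
}


def regional_range(base: float, uplift: tuple) -> tuple:
    """Return (low, high) hourly rates for an uplift range, in USD."""
    low, high = uplift
    return round(base * (1 + low), 2), round(base * (1 + high), 2)


for region, uplift in REGION_UPLIFT.items():
    low, high = regional_range(BASE_RATE, uplift)
    print(f"{region}: ~${low:.2f} to ~${high:.2f} per GPU-hour")
```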

A3 Mega and A3 Edge have sparser coverage. As of 22 May 2026, A3 Mega is limited to select US and European regions, and A3 Edge is primarily in early-access zones. Verify zone availability before building deployment plans around specific configurations.

GCP A3 vs Spheron H100: Full TCO Comparison

Per-GPU hourly cost

Config	GCP A3 On-Demand	GCP A3 Spot	Spheron On-Demand	Spheron Spot
H100 SXM5	~$10.98	~$3.69	$2.64	$1.66

Spheron on-demand runs ~76% below GCP A3 on-demand list price; Spheron spot runs ~85% below.
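Both percentages follow directly from the per-GPU rates in the table. A short sketch of the calculation, with `percent_below` and `price_multiple` as illustrative helper names:

```python
GCP_ON_DEMAND = 10.98   # USD/hr per GPU
SPHERON = {"on-demand": 2.64, "spot": 1.66}


def percent_below(reference: float, price: float) -> float:
    """How far `price` sits below `reference`, as a percentage."""
    return (1 - price / reference) * 100


def price_multiple(reference: float, price: float) -> float:
    """How many times more expensive `reference` is than `price`."""
    return reference / price


for tier, rate in SPHERON.items():
    print(f"Spheron {tier}: {percent_below(GCP_ON_DEMAND, rate):.0f}% below, "
          f"GCP is {price_multiple(GCP_ON_DEMAND, rate):.1f}x the rate")
```

Expressing the gap as a multiple as well as a percentage avoids mixing the two when quoting a "premium".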

Teams that need H100 SXM5 cloud instances but not a GCP-native stack can avoid the premium entirely.

Cost per million output tokens (LLaMA 3.3 70B, vLLM, FP8)

Based on 1,850 tokens/sec throughput from our vLLM vs TensorRT-LLM vs SGLang benchmarks on H100 SXM5 FP8:

Provider	$/hr per GPU	Cost per 1M output tokens
Spheron spot	$1.66	$0.25
Spheron on-demand	$2.64	$0.40
GCP A3 3yr CUD	~$5.93	~$0.89
GCP A3 1yr CUD	~$8.78	~$1.32
GCP A3 on-demand	~$10.98	~$1.65

Formula: cost per 1M tokens = 1,000,000 tokens / (1,850 tok/s × 3,600 s/hr) × $/hr ≈ 0.150 × $/hr
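The same conversion can be written as a small function, which is handy for plugging in your own measured throughput. The sketch assumes, as the table does, that the 1,850 tok/s figure is sustained on the GPU being priced; `cost_per_million_tokens` and `RATES` are illustrative names.

```python
SECONDS_PER_HOUR = 3_600


def cost_per_million_tokens(rate_per_hr: float,
                            tokens_per_sec: float = 1_850) -> float:
    """USD per 1M output tokens at a given hourly rate and throughput."""
    tokens_per_hour = tokens_per_sec * SECONDS_PER_HOUR
    return rate_per_hr * 1_000_000 / tokens_per_hour


RATES = {
    "Spheron spot": 1.66,
    "Spheron on-demand": 2.64,
    "GCP A3 3yr CUD": 5.93,
    "GCP A3 1yr CUD": 8.78,
    "GCP A3 on-demand": 10.98,
}

for provider, rate in RATES.items():
    print(f"{provider}: ${cost_per_million_tokens(rate):.2f} per 1M tokens")
```

Because cost scales inversely with throughput, halving tokens per second doubles every figure in the table; real utilization below the benchmark number raises the effective cost accordingly.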

Monthly TCO for 8x H100 running 720 hours

Config	Compute only	With typical overhead
GCP A3 on-demand	~$63,245	~$65,109
GCP A3 1yr CUD	~$50,573	~$52,437
GCP A3 3yr CUD	~$34,157	~$36,021
Spheron on-demand	~$15,206	~$15,206
Spheron spot	~$9,562	~$9,562

GCP overhead uses 2TB egress at $0.08/GB ($164) plus 10TB SSD persistent disk at $0.17/GB/month ($1,700). Spheron has no egress fees and includes storage during compute sessions.

When GCP A3 Makes Sense

GCP A3 is worth the premium in these specific cases:

Deep Vertex AI integration: If Vertex Pipelines, AutoML, or Vertex AI Workbench is core infrastructure rather than incidental tooling, the tight A3 integration reduces operational overhead and narrows the effective cost gap.
BigQuery ML pipelines: Training loops that read large datasets directly from BigQuery benefit from internal transfer rates that are cheaper than cross-cloud egress. If your data is already in BigQuery and relocating it would cost more than the per-GPU premium, staying on GCP makes sense.
Compliance or data residency requirements: Workloads bound to a specific GCP-certified region with no equivalent elsewhere may have no practical alternative.
Negotiated enterprise CUD rates: Large accounts sometimes negotiate CUD pricing below the public list. At the right discount, the effective per-GPU rate can approach what independent GPU cloud providers charge on-demand.

When to Use Spheron Instead

Framework-portable workloads: Any stack built on vLLM, SGLang, TensorRT-LLM, PyTorch, or standard CUDA containers runs identically on Spheron's H100 SXM5 instances without modification.
No GCP service dependencies: If your data lives in S3-compatible storage or can be moved without significant egress cost, there is no structural reason to pay the GCP premium.
Checkpoint-heavy training: Large models generate large checkpoints. GCP's per-GB egress adds up when checkpoints cross instance, region, or cloud boundaries. Spheron's zero-egress model eliminates this variable.
On-demand without a multi-year commitment: Spheron charges per-minute with no minimum. This fits teams whose GPU utilization varies or whose model roadmap evolves faster than a 3-year commitment horizon.

Migration Considerations

Moving from GCP A3 to Spheron is primarily a container portability story. Any Docker image running on A3 runs on Spheron without modification, because both platforms provide the same H100 SXM5 hardware with the same CUDA environment.

What changes:

Storage: Replace Cloud Storage buckets with S3-compatible storage or direct instance storage. Use rclone or gsutil rsync for the one-time migration. Expect GCP egress charges of $0.08-$0.12/GB on the initial data transfer. After migration, ongoing checkpoint I/O has no egress charges.
Access model: Spheron provides SSH root access. There is no IAM role system, service account model, or VPC firewall rule set to configure. Access is key-based.
Networking: No VPC peering setup required. Instances are accessible directly via SSH with a public IP.

What stays the same:

CUDA drivers and versions
Container runtime (Docker, NVIDIA Container Toolkit)
Framework code (PyTorch, JAX, vLLM, etc.)
Data format and checkpoint structure

For a step-by-step migration guide, see Migrating from AWS, GCP, or Azure to Spheron. For comparison, Oracle Cloud's A3-equivalent BM.GPU.H100.8 shape also charges around $10/GPU/hr on-demand, close to GCP A3 High list price; see the OCI GPU pricing analysis for how OCI's Universal Credits and preemptible tiers compare.

Teams running H100 inference without GCP-native service dependencies can cut compute bills significantly. Spheron provides bare-metal H100 SXM5 access through data center partners globally, with no egress fees and per-minute billing.

Frequently Asked Questions

How much does a GCP A3 H100 cost per hour?

GCP A3 High (a3-highgpu-8g) runs approximately $10.98/hr per GPU on-demand in us-central1 as of 22 May 2026. The full 8-GPU node is approximately $87.84/hr. Spot pricing is approximately $3.69/hr per GPU. A 1yr committed-use discount brings it to around $8.78/hr per GPU. Verify the current rate with the GCP pricing calculator before budgeting. Spheron H100 SXM5 starts at $2.64/hr per GPU on-demand or $1.66/hr spot. Check current rates at /pricing/.

What does a GCP A3 instance include?

Each a3-highgpu-8g instance includes 8x NVIDIA H100 SXM5 80GB GPUs, 208 vCPUs, 1.872 TB RAM, 6 TB local SSD, and 200 Gbps network. GPUs are connected via NVLink. Pricing does not include persistent disk, data egress, or managed services such as Vertex AI.

What is the difference between A3 High, A3 Mega, and A3 Edge?

A3 High (a3-highgpu-8g) is the standard 8xH100 SXM5 configuration. A3 Mega (a3-megagpu-8g) adds more host RAM and is positioned for memory-intensive multi-tenant workloads. A3 Edge (a3-edgegpu-8g) targets lower-latency edge deployments. All three use the same H100 SXM5 80GB GPU but differ in host configuration and regional availability.

Does GCP offer spot pricing for A3 instances?

GCP offers Spot VMs for A3 instances in select regions, at approximately $3.69/hr per GPU in us-central1 as of 22 May 2026. GCP can reclaim Spot VMs at any time with only a short preemption notice, which makes them unsuitable for training runs without checkpoint/resume logic. Spheron spot pricing for H100 SXM5 is $1.66/hr per GPU with per-minute billing. See current rates at /pricing/.

When does GCP A3 make sense over Spheron?

GCP A3 makes sense when your stack is tightly integrated with BigQuery ML, Vertex AI, or Google Cloud Storage and the migration cost of decoupling exceeds the GPU price difference. For teams running standalone training or inference without GCP-native service dependencies, Spheron H100 on-demand rates run approximately 76% lower than GCP A3 list price, and spot rates run approximately 85% lower, with no committed-use requirement.