Free Tool

AI & LLM Training Cost Calculator

Enter a model size, a dataset size, and the GPU you want. Get the total compute, the wall-clock time, and the GPU training cost on Spheron, with live pricing benchmarked against AWS, GCP, and Azure. Works for pretraining a model from scratch or fine-tuning an LLM with LoRA.

Live Spheron pricing · Per-GPU per-hour rates · No signup needed

Inputs

70B parameters
1.0T tokens

Modeling assumptions

Compute: 6 × params × tokens
Throughput (FP16): 700 TFLOP/s
Utilization (MFU): 45%
Egress fees: $0
Per-minute billing, no commitment.

Cost on Spheron

$743,769
Wall-clock: 1,929 days
Total compute: 420.00 ZFLOPs
$/B tokens: $744

This run takes over a year. Consider more GPUs or a smaller model.

Cost breakdown

Spheron on-demand: $743,769
Spheron spot: $552,296
Spheron reserved (35% off): $483,450
AWS on-demand: $1.44M
Google Cloud on-demand: $1.11M
Azure on-demand: $2.59M

You save $700,676 versus AWS (49%).


Estimates use 8× H100 at $2.01/hr on-demand ($1.49/hr spot) per GPU. AWS, GCP, and Azure figures are public list prices and exclude reserved discounts, egress, and storage. The math is explained under "How LLM training cost is calculated".

Training cost examples: Llama 3 70B, 405B, and a 7B LoRA

Click any example to load it into the calculator. Numbers are based on published training runs and live Spheron pricing.

How much does it cost to train an LLM?

Training an LLM costs anywhere from under $100 to tens of millions of dollars. Three inputs set the number: the size of the model, the number of tokens you train on, and the per-hour rate of the GPU you run on. The calculator above turns those into a dollar figure for any model. To put the range in context, here is what four common jobs cost on live Spheron H100 pricing.

7B LoRA fine-tune (1B tokens): ~25 H100-hours, around $50
70B supervised fine-tune: a couple thousand GPU-hours, $11,200 in a real run
70B pretrain from scratch (15T tokens): a few million H100-hours, seven figures
405B pretrain (15.6T tokens): tens of millions of H100-hours, eight figures

The jump from fine-tuning to pretraining is why almost no one trains from scratch. Fine-tuning an open model on on-demand H100s gets you a custom model for what amounts to a rounding error next to a from-scratch run. For a full worked example, see how a small team trained a 70B model for $11,200 on spot GPUs, then price your own job in the calculator above.

How LLM training cost is calculated

Training a model takes a known amount of compute. The standard rule of thumb is 6 × parameters × tokens FLOPs for full training. LoRA fine-tunes use about two-thirds of that: the base weights are frozen, so you skip computing their gradients, but the full forward pass and the activation-gradient backward pass still run.
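The rule of thumb above can be sketched in a few lines of Python. This is an illustrative sketch, not the calculator's actual code: the function name `training_flops` and the `lora` flag are hypothetical, and the two-thirds LoRA factor is simply the approximation stated in the paragraph above.

```python
def training_flops(params: float, tokens: float, lora: bool = False) -> float:
    """Approximate total training compute in FLOPs.

    Full training uses the 6 * params * tokens rule of thumb.
    LoRA is modeled as two-thirds of that (assumption taken from the text:
    the weight-gradient pass for the frozen base weights is skipped).
    """
    full = 6.0 * params * tokens
    return full * 2.0 / 3.0 if lora else full


# 70B parameters on 1T tokens, full training (the calculator's default inputs)
full_70b = training_flops(70e9, 1e12)
print(f"{full_70b / 1e21:.2f} ZFLOPs")  # 420.00 ZFLOPs

# 7B LoRA fine-tune on 1B tokens
lora_7b = training_flops(7e9, 1e9, lora=True)
print(f"{lora_7b / 1e18:.2f} EFLOPs")  # 28.00 EFLOPs
```

The first result matches the 420.00 ZFLOPs shown in the calculator panel for the default 70B / 1.0T inputs.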

We take that FLOPs total and divide by your GPU's sustained throughput at the chosen precision. H100 delivers roughly 700 TFLOP/s at FP16 and 1,400 TFLOP/s at FP8 in real training jobs. B200 roughly doubles those numbers.

Then we apply model FLOPs utilization (MFU), the fraction of peak throughput you actually get. Well-tuned dense transformer training on H100 lands around 0.45. The calculator defaults to that and lets you tune it in the advanced panel.
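As a rough sketch of the two steps above, wall-clock time is the FLOPs total divided by per-GPU throughput × MFU × GPU count. The function name and argument names below are hypothetical, and the sketch assumes throughput scales linearly with GPU count, which real multi-node jobs only approximate.

```python
def wall_clock_seconds(total_flops: float,
                       per_gpu_flops_per_s: float,
                       mfu: float,
                       gpu_count: int) -> float:
    """Estimate training time in seconds.

    Assumes perfectly linear scaling across GPUs (a simplification).
    """
    if not 0.0 < mfu <= 1.0:
        raise ValueError("mfu must be in (0, 1]")
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    effective_flops_per_s = per_gpu_flops_per_s * mfu * gpu_count
    return total_flops / effective_flops_per_s


# Default scenario: 420 ZFLOPs on 8x H100, 700 TFLOP/s at FP16, 45% MFU
seconds = wall_clock_seconds(4.2e23, 700e12, 0.45, 8)
days = seconds / 86_400
print(f"{days:.0f} days")  # 1929 days
```

With the default inputs this reproduces the 1,929-day wall-clock figure shown in the panel above.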

Cost is then wall-clock hours × GPU count × per-GPU rate. Spheron rates come live from the marketplace API. AWS, GCP, and Azure rates are public list prices for the same GPU and exclude reserved discounts, egress, and storage.
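The cost formula can be sketched the same way. The names below are hypothetical, the `discount` parameter is an illustrative way to model the 35% reserved discount, and the $2.01 rate is the rounded figure quoted on this page rather than a live API value, so the output lands a few hundred dollars away from the $743,769 shown above (presumably because the live rate is not exactly $2.01).

```python
def training_cost(wall_clock_hours: float,
                  gpu_count: int,
                  rate_per_gpu_hour: float,
                  discount: float = 0.0) -> float:
    """Cost = wall-clock hours x GPU count x per-GPU hourly rate.

    `discount` is a fraction in [0, 1), e.g. 0.35 for 35% off.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    gpu_hours = wall_clock_hours * gpu_count
    return gpu_hours * rate_per_gpu_hour * (1.0 - discount)


# Default scenario: ~1929 days of wall-clock time on 8 GPUs
hours = 4.2e23 / (700e12 * 0.45 * 8) / 3600  # about 46,296 hours
on_demand = training_cost(hours, 8, 2.01)
reserved = training_cost(hours, 8, 2.01, discount=0.35)
print(f"on-demand: ${on_demand:,.0f}")
print(f"reserved:  ${reserved:,.0f}")
```

Swapping in another provider's per-GPU list price for `rate_per_gpu_hour` gives the comparison rows in the cost breakdown.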

GPU Infrastructure

Not sure which GPU to pick?

Browse the full GPU catalog with live per-hour pricing across H100, H200, B200, B300, A100, L40S, RTX PRO 6000, and more. Per-minute billing, no commitment.

Deploy time: < 2 min
Uptime SLA: 99.9%
GPU models: 10+
Starting at: $0.58/hr

Training cost calculator FAQ