How to Run a Pearl Research Node on GPU Cloud: H100 and H200 Setup Guide (2026)

Pearl Research launched mainnet on April 27, 2026. The node software requires a Hopper-class GPU (SM compute capability 9.0) because the MatMul proof kernel uses Hopper-specific tensor core instructions. That means H100 80GB minimum, H200 141GB recommended. RTX 4090, RTX 5090, and A100 are all excluded by the official miner binary. This guide covers provisioning an H100 or H200 on Spheron, building the pearl-research-labs/pearl daemon and vLLM miner from source, launching pearld, generating a Taproot wallet, and monitoring your participation in Pearl's proof-of-useful-work (PoUW) network.

TL;DR

Hardware minimum: H100 80GB SXM5, Ubuntu 22.04, CUDA 12.x, 64 GB RAM, 200 GB NVMe
Software stack: Docker, NVIDIA Container Toolkit, go-task, pearld, prlctl, pearl-miner (vLLM-based)
Time to first sync: approximately 30-90 minutes on a fresh instance
Cost on Spheron (as of 17 May 2026): H100 SXM5 from $3.90/hr on-demand (~$2,847/mo) or ~$1.63/hr spot; H200 SXM5 from $4.62/hr on-demand (~$3,373/mo) or ~$1.92/hr spot

What is Pearl Research?

According to the Pearl Research whitepaper, the goal is to "replace proof-of-work hash computation with verifiable, high-utility matrix multiplication that directly serves AI inference workloads." Instead of mining blocks by hashing nonces, Pearl node operators contribute GPU MatMul compute that gets verified with Plonky2 zkSNARK proofs. The resulting computation is both the consensus mechanism and the productive output.

The PRL token is the chain-native participation reward. Pearl mainnet launched April 27, 2026.

To be clear: this guide covers infrastructure setup and costs only. How much PRL you earn depends on network participation rates, chain state, and your hardware's contribution relative to other nodes. Do not treat any figure in this guide as an income projection.

Why Operators Run a Pearl Node

The primary draw is the PoUW model: your GPU is contributing verified MatMul compute to the network rather than discarding compute on hash puzzles. The chain treats this compute as genuinely useful because it can be consumed by AI inference workloads.

Pearl's documentation also describes a dual-utility path: the same H100 or H200 runs PoUW participation via the miner and serves real AI inference via vLLM simultaneously. The Pearl-certified inference path uses Llama 3.3 70B. If you are already running AI inference infrastructure, the marginal cost of adding a Pearl node is the 2-3 GB of VRAM overhead for the miner process.

For context on how similar PoUW operator infrastructure works on Spheron, see the Gonka partner page.

Hardware Requirements

Component	Minimum	Recommended
GPU	H100 80GB (Hopper SM 9.0)	H200 141GB
System RAM	64 GB	128 GB
NVMe storage	200 GB	500 GB
Network	~100 Mbps	1 Gbps
OS	Ubuntu 22.04	Ubuntu 22.04
CUDA	12.x	12.x

The SM 9.0 requirement is a hard dependency, not a recommendation. RTX 4090 (SM 8.9, Ada Lovelace), A100 (SM 8.0, Ampere), and RTX 5090 (SM 12.0, consumer Blackwell) are not supported by the official miner as of May 2026. The binary is compiled with Hopper-specific MatMul kernel instructions that do not exist on other architectures, so a higher compute capability number does not help. Blackwell support may come in a future Pearl release.

H100 vs H200 for Pearl: Which to Pick

The short version: H100 80GB works but requires FP8 quantization for the dual-utility path. H200 141GB also requires FP8 for full-context dual-utility; BF16 on H200 is only viable with a reduced context window because the 70B weights alone consume roughly 140 GB of the 141 GB available.

H100 SXM5 80GB:

$3.90/hr on-demand, ~$1.63/hr spot (as of 17 May 2026)
Monthly on-demand cost: ~$2,847 at 100% uptime
FP8 is required to serve Llama 3.3 70B alongside the miner on a single card
Spot pricing available, but preemption interrupts sync state

H200 SXM5 141GB:

$4.62/hr on-demand, ~$1.92/hr spot (as of 17 May 2026)
Monthly on-demand cost: ~$3,373 at 100% uptime
FP8 recommended for full-context dual-utility; BF16 viable only with reduced --max-model-len
Better long-context throughput for inference workloads

Pricing fluctuates with GPU availability. The prices above reflect rates on 17 May 2026 and may have changed. Check current GPU pricing for live rates.

For the PoUW-only path (no vLLM inference), H100 80GB is the more cost-efficient choice. For the dual-utility path where you also want to serve AI workloads, H200 141GB removes the VRAM juggling.

Explore bare-metal H100 SXM5 on Spheron for specs and availability. For the larger option, see H200 141GB instances on Spheron.

For a detailed throughput comparison across GPU generations, see the best GPU for AI inference in 2026 guide.

Step 1: Provision Your Spheron Instance

Log into app.spheron.ai, select H100 SXM5 80GB or H200 SXM5 141GB, pick Ubuntu 22.04 as the base OS, and attach at least 200 GB NVMe storage. If you plan to run the vLLM dual-utility path with Llama 3.3 70B weights, provision at least 300 GB to hold both the model checkpoint and the Pearl binaries.

Add your SSH public key to the instance, connect over SSH, and verify the GPU:

bash

nvidia-smi --query-gpu=name,compute_cap,memory.total --format=csv

You should see compute_cap = 9.0 for H100 or H200. If you see anything other than 9.0, stop here: the pearl-miner binary will not compile or run on that hardware.
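The check above can be scripted so that provisioning fails fast on the wrong hardware. This is a minimal sketch, assuming the CSV layout that the nvidia-smi query prints (a header row, then one row per GPU); the function name is_hopper_csv is illustrative and not part of Pearl's tooling.

```bash
# Reads the output of
#   nvidia-smi --query-gpu=name,compute_cap,memory.total --format=csv
# on stdin. Succeeds only if at least one GPU is listed and every listed
# GPU reports compute capability 9.0.
is_hopper_csv() {
  awk -F', *' '
    NR == 1 { next }                      # skip the CSV header row
    NF < 2  { next }                      # ignore blank or malformed lines
    { seen++; if ($2 != "9.0") bad++ }    # count GPUs and non-Hopper GPUs
    END { exit (seen > 0 && bad == 0) ? 0 : 1 }
  '
}

# Usage on the instance:
#   nvidia-smi --query-gpu=name,compute_cap,memory.total --format=csv \
#     | is_hopper_csv || echo "No Hopper (SM 9.0) GPU found; stop here."
```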

Step 2: Install Docker and NVIDIA Container Toolkit

Most Spheron GPU instances come with the NVIDIA Container Toolkit pre-installed. If yours does not, install Docker and the toolkit. The commands below assume a root shell:

bash

# Install Docker
curl -fsSL https://get.docker.com | sh

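# Add the NVIDIA Container Toolkit repo with a signed-by keyring
# (apt-key and the old nvidia-docker repo are deprecated)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | tee /etc/apt/sources.list.d/nvidia-container-toolkit.list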

# Install toolkit
apt-get update && apt-get install -y nvidia-container-toolkit
nvidia-ctk runtime configure --runtime=docker
systemctl restart docker

Validate GPU passthrough:

bash

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

You should see your H100 or H200 listed with full VRAM. If the GPU does not appear, verify that nvidia-ctk runtime configure --runtime=docker ran without errors and that the Docker daemon restarted.

Step 3: Clone the Pearl Repo and Install Task

bash

git clone https://github.com/pearl-research-labs/pearl
cd pearl

Pearl's build system uses go-task. Install it:

bash

sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin
task --version

Note: Verify the binary names and task target names (task build:blockchain, task build:miner) against the actual Taskfile.yml in the repo before running them. The Pearl repo is actively developed and target names may change between releases. If the targets differ from what is listed here, consult the Taskfile.yml directly.
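One way to do that verification is to list the targets Task can see and check for the two names used in this guide. This is a rough sketch, assuming `task --list-all` prints one line per target in the form `* name:   description`; the helper name has_task_target is hypothetical.

```bash
# has_task_target <target>
# stdin: output of `task --list-all`. Succeeds if the exact target is listed.
has_task_target() {
  awk -v t="$1:" '$1 == "*" && $2 == t { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage from the repo root:
#   for target in build:blockchain build:miner; do
#     task --list-all | has_task_target "$target" \
#       || echo "Target $target not found; check Taskfile.yml"
#   done
```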

Step 4: Build the Blockchain and Miner Binaries

The build requires Go 1.21+ and a CUDA 12.x toolchain. Verify both before building:

bash

go version        # should be 1.21 or higher
nvcc --version    # should show CUDA 12.x

Then build:

bash

task build:blockchain
task build:miner

The miner binary compiles with SM 9.0 targeting. If you attempt to build on a non-Hopper machine, the link step will fail with a CUDA capability error. This is expected: run the build on the same H100 or H200 instance you plan to run it on.

If the build fails with a Go version error, install the required Go version from go.dev/dl and re-run.
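The Go version requirement can be checked mechanically rather than by eye. A small sketch, assuming GNU `sort -V` is available (it is on Ubuntu 22.04) and that `go version` prints its usual `go version go1.x.y os/arch` line; both helper names are illustrative.

```bash
# version_ge <have> <need>: true when <have> is at least <need>.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Extracts "1.22.3" from a line such as "go version go1.22.3 linux/amd64".
go_version_from() {
  awk '{ sub(/^go/, "", $3); print $3 }'
}

# Usage:
#   version_ge "$(go version | go_version_from)" 1.21 \
#     || echo "Go 1.21 or newer is required"
```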

Step 5: Launch pearld and Sync

Start the Pearl daemon:

bash

./pearld --network mainnet --datadir ~/.pearl

How Long Does Sync Take?

Initial sync from genesis takes 30-90 minutes depending on chain height at the time of setup. As the chain grows, this window will increase. Monitor progress with:

bash

./prlctl status

Or check your node's block height against the Pearl explorer at explorer.pearlresearch.ai.

Do not start the miner until the daemon is fully synced. Starting the miner before sync completes will result in rejected proof submissions.
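To avoid starting the miner too early, the wait can be wrapped in a polling loop. The sketch below is deliberately generic: it retries any command until it succeeds. The output format of `./prlctl status` is not documented in this guide, so the actual sync check (called node_is_synced in the usage comment) is a hypothetical function you would write yourself after inspecting that output.

```bash
# wait_until <interval_seconds> <max_attempts> <command...>
# Runs the command repeatedly until it succeeds or attempts run out.
wait_until() {
  local interval=$1 max=$2 attempt=1
  shift 2
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$max" ]; then
      sleep "$interval"
    fi
  done
  return 1
}

# Usage (node_is_synced is a placeholder you define yourself):
#   wait_until 60 120 node_is_synced && echo "Synced; safe to start the miner"
```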

Step 6: Generate a Taproot Wallet

bash

./prlctl wallet new --taproot

This generates a Taproot address compatible with the PRL network. The output includes a mnemonic seed phrase. Back this up immediately to offline storage before proceeding. Loss of the seed phrase means loss of any accumulated PRL.

Pearl's documentation mentions an optional oyster hardware wallet integration for custody separation. If you want to use hardware custody rather than a software-generated key, consult the Pearl documentation for the oyster path. The exact oyster commands depend on Pearl's current release; do not run commands from memory for hardware wallet operations.

Step 7: Start the vLLM Miner

H100 80GB Path (FP8 Required)

On H100 80GB, Llama 3.3 70B weights in FP8 occupy roughly 70 GB, leaving about 10 GB on the card: 2-3 GB for the miner's overhead and at most 7-8 GB for KV cache. FP8 is not optional here.

bash

./pearl-miner \
  --wallet <your-taproot-address> \
  --model meta-llama/Llama-3.3-70B-Instruct \
  --dtype fp8 \
  --gpu-memory-utilization 0.90

H200 141GB Path (FP8 Recommended; BF16 Requires Reduced Context)

The 70B BF16 weights occupy roughly 140 GB, leaving about 1 GB before the miner's 2-3 GB overhead and any KV cache. Running BF16 at full context will OOM. For the full dual-utility path on H200, use FP8:

bash

./pearl-miner \
  --wallet <your-taproot-address> \
  --model meta-llama/Llama-3.3-70B-Instruct \
  --dtype fp8 \
  --gpu-memory-utilization 0.90

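If you specifically need BF16 for inference quality, you must reduce the context window to shrink KV cache allocation. Treat this path as marginal: if --gpu-memory-utilization behaves as it does in vLLM (a fraction of total GPU memory), 0.90 of 141 GB is about 127 GB, which is less than the roughly 140 GB of BF16 weights, so the model may fail to load at this cap, and raising the cap takes memory away from the miner: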

bash

./pearl-miner \
  --wallet <your-taproot-address> \
  --model meta-llama/Llama-3.3-70B-Instruct \
  --dtype bfloat16 \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.90

The --gpu-memory-utilization 0.90 cap is important: the miner process needs approximately 2-3 GB of overhead for the MatMul proof workload. Letting vLLM claim the full 100% of VRAM will cause the miner to OOM.
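The reasoning behind the cap can be worked through numerically. This is a minimal sketch, assuming pearl-miner's --gpu-memory-utilization behaves like vLLM's flag of the same name (a fraction of total GPU memory that vLLM may claim); the function name vram_budget is illustrative, and the weight sizes are the rough figures quoted in this guide.

```bash
# vram_budget <total_gb> <utilization> <weights_gb>
# Prints vLLM's budget, the memory left outside that budget (available to
# the miner), and what remains of vLLM's budget after loading the weights.
vram_budget() {
  awk -v total="$1" -v util="$2" -v weights="$3" 'BEGIN {
    budget = total * util
    printf "vllm_budget=%.1f outside_vllm=%.1f after_weights=%.1f\n",
           budget, total - budget, budget - weights
  }'
}

vram_budget 80 0.90 70     # H100 80GB, FP8 weights
vram_budget 141 0.90 70    # H200 141GB, FP8 weights
vram_budget 141 0.90 140   # H200 141GB, BF16 weights
```

Under these assumptions the H100 case leaves 8 GB outside vLLM for the miner but only about 2 GB of vLLM's own budget for KV cache, which is why context length becomes the tuning knob. A negative after_weights value, as in the BF16 case, means the weights alone exceed vLLM's budget at that cap.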

For multi-GPU tensor parallelism setups, see the vLLM production deployment guide.

Verification and Monitoring

Once the miner is running, watch GPU VRAM usage to confirm the miner and inference workload are coexisting:

bash

nvidia-smi dmon -s pum -d 10

Expected miner VRAM overhead: 2-3 GB on top of whatever the vLLM model checkpoint occupies. If VRAM usage is higher than expected, check for multiple miner processes or a KV cache leak.

Key metrics to watch:

Block height (via ./prlctl status): should increase steadily
Miner participation status: should show accepted proof submissions
VRAM utilization: vLLM's allocation should stay within the 0.90 cap, leaving room for the miner's proof overhead
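The VRAM metric can be turned into a simple threshold check. A sketch, assuming the `used, total` CSV layout that nvidia-smi prints with `--format=csv,noheader,nounits`; the function names and the limit you pass in are illustrative choices, not values from Pearl's documentation.

```bash
# stdin: one "used_mib, total_mib" line per GPU.
# Prints the integer percentage of VRAM in use for each GPU.
vram_percent() {
  awk -F', *' 'NF >= 2 && $2 > 0 { printf "%d\n", ($1 * 100) / $2 }'
}

# vram_over_limit <limit_percent>
# Succeeds if any GPU on stdin is using more than the given percentage.
vram_over_limit() {
  awk -F', *' -v limit="$1" '
    NF >= 2 && $2 > 0 && ($1 * 100) / $2 > limit { over = 1 }
    END { exit over ? 0 : 1 }
  '
}

# Usage:
#   nvidia-smi --query-gpu=memory.used,memory.total \
#     --format=csv,noheader,nounits | vram_over_limit 95 \
#     && echo "VRAM usage above limit; check for extra miner processes"
```

The total reported by nvidia-smi includes both vLLM and the miner process, so choose the limit with that in mind.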

On-Demand vs Spot: Cost Considerations

GPU	On-Demand	Spot	Monthly (on-demand, 100% uptime)
H100 SXM5 80GB	$3.90/hr	~$1.63/hr	~$2,847
H200 SXM5 141GB	$4.62/hr	~$1.92/hr	~$3,373

Prices as of 17 May 2026.
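The monthly figures in the table follow from the hourly rate multiplied by 730 hours, which is the month length the published numbers imply. A short sketch for recomputing them when rates change; the function names are illustrative.

```bash
# monthly_cost <hourly_usd> [hours_per_month]
# Defaults to 730 hours, i.e. 100% uptime over an average month.
monthly_cost() {
  awk -v rate="$1" -v hours="${2:-730}" 'BEGIN { printf "%.0f\n", rate * hours }'
}

# spot_discount_percent <on_demand_hourly> <spot_hourly>
spot_discount_percent() {
  awk -v od="$1" -v spot="$2" 'BEGIN { printf "%.0f\n", (1 - spot / od) * 100 }'
}

monthly_cost 3.90                 # H100 on-demand
monthly_cost 4.62                 # H200 on-demand
spot_discount_percent 3.90 1.63   # H100 spot vs on-demand
```

At the 17 May 2026 rates this gives 2847 and 3373 for on-demand, and a spot discount of 58 percent on the H100.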

On-demand: Guaranteed availability with no preemption. Better for PoUW participation that requires continuous uptime, because preemption interrupts sync state and forces a re-sync.

Spot: about 58% cheaper at the rates listed above (discounts of 30-50% are more typical across markets), but preemption risk is real. A preempted spot instance loses its sync state and must re-sync from the last checkpoint before the miner can resume submitting proofs. For PoUW participation, this means a 30-90 minute gap in proof submissions each time the instance is preempted. If your miner can tolerate gaps, spot is viable. If continuous participation is important, use on-demand.
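To judge whether those gaps are tolerable, it helps to convert them into monthly downtime. A back-of-the-envelope sketch; the number of preemptions per month is a hypothetical input you would estimate from your own spot history, and the function name is illustrative.

```bash
# preemption_impact <preemptions_per_month> <resync_minutes> [hours_per_month]
# Prints the hours of lost proof submission and the resulting uptime share.
preemption_impact() {
  awk -v n="$1" -v mins="$2" -v hours="${3:-730}" 'BEGIN {
    down = n * mins / 60
    printf "downtime_hours=%.1f uptime_percent=%.1f\n",
           down, (1 - down / hours) * 100
  }'
}

preemption_impact 10 90   # ten preemptions, worst-case 90-minute re-sync
preemption_impact 4 30    # four preemptions, best-case 30-minute re-sync
```

Ten worst-case preemptions cost 15 hours of participation in a 730-hour month, which is about 2% of the month.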

For a detailed breakdown of the billing model tradeoffs, see the serverless vs on-demand vs reserved GPU guide.

Troubleshooting

SM 9.0 Kernel Errors

Symptom: Miner fails at startup with a CUDA kernel error mentioning unsupported compute capability.

Cause: You are running on a non-Hopper GPU.

Fix: Verify nvidia-smi --query-gpu=name,compute_cap --format=csv shows 9.0. If it shows anything else, you need to switch to an H100 or H200 instance.

CUDA 12.x Mismatch

Symptom: Build fails with a CUDA version incompatibility error, or the miner crashes at startup with a CUDA library version error.

Fix: Run nvcc --version and verify the output is CUDA 12.x. Also verify the NVIDIA Container Toolkit version aligns with the driver on the host. Run nvidia-ctk --version and check the Pearl repo for supported combinations.
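The nvcc check can be automated as part of a pre-build script. A sketch, assuming the usual `release X.Y` token in the `nvcc --version` output; the helper names are illustrative.

```bash
# stdin: output of `nvcc --version`. Prints the release, e.g. "12.2".
cuda_release_from() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n1
}

# stdin: output of `nvcc --version`. Succeeds only for a CUDA 12.x toolchain.
is_cuda_12() {
  case "$(cuda_release_from)" in
    12.*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Usage:
#   nvcc --version | is_cuda_12 || echo "CUDA 12.x toolchain required"
```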

Sync Stalls

Symptom: Block height in ./prlctl status stops increasing.

Fix: First check network connectivity. If the connection is fine, try restarting pearld with the --resync flag:

bash

./pearld --network mainnet --datadir ~/.pearl --resync

Note that --resync re-downloads chain data from the last checkpoint, which can take 30-90 minutes depending on chain height.

OOM on Miner Start

Symptom: Miner process OOMs immediately after starting.

Fix on H100 (FP8): Reduce --gpu-memory-utilization to 0.85 to leave more room for the miner.

Fix on H200: BF16 leaves almost no margin after 70B weights consume ~140 GB. Switch to FP8 for the full dual-utility path, or if you need BF16, reduce --max-model-len (e.g., to 8192) to shrink KV cache allocation.

If OOM persists on H100 even with FP8 and 0.85 utilization, it means the KV cache plus the miner overhead is still exceeding available VRAM. Reduce --max-model-len to shrink the KV cache allocation.
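How much memory a given --max-model-len frees up can be estimated from the model's attention layout. The sketch below assumes the publicly listed Llama 3.3 70B configuration (80 layers, 8 KV heads, head dimension 128) and a 16-bit KV cache by default; it estimates one full-length sequence only and ignores vLLM's paging and activation overhead, so treat the result as a lower bound. The function name is illustrative.

```bash
# kv_cache_gib <tokens> [bytes_per_value]
# Estimates KV cache size in GiB for a single sequence of <tokens> tokens.
kv_cache_gib() {
  awk -v tokens="$1" -v bytes="${2:-2}" 'BEGIN {
    layers = 80; kv_heads = 8; head_dim = 128             # assumed model config
    per_token = 2 * layers * kv_heads * head_dim * bytes  # keys plus values
    printf "%.2f\n", tokens * per_token / (1024 ^ 3)
  }'
}

kv_cache_gib 8192      # reduced context used in this guide
kv_cache_gib 131072    # the model's full 128K context
```

Under these assumptions an 8192-token sequence needs about 2.5 GiB of KV cache, while a full 128K-token sequence needs about 40 GiB, which is why trimming --max-model-len is the most effective OOM fix.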

Pearl Research requires Hopper-class bare metal, and Spheron stocks H100 and H200 SXM5 instances you can spin up in minutes. The same GPU runs PoUW participation and your vLLM inference workloads side by side.

Quick Setup Guide

Provision a Spheron GPU instance
Log in at app.spheron.ai, select an H100 80GB or H200 141GB instance, choose Ubuntu 22.04 as the base OS, and attach at least 200 GB NVMe storage. For the vLLM dual-utility path, you need at least 300 GB to store the Llama 3.3 70B weights alongside the Pearl binaries.
Install Docker and NVIDIA Container Toolkit
SSH into the instance and, from a root shell, run: curl -fsSL https://get.docker.com | sh && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | tee /etc/apt/sources.list.d/nvidia-container-toolkit.list && apt-get update && apt-get install -y nvidia-container-toolkit && nvidia-ctk runtime configure --runtime=docker && systemctl restart docker. Validate with: docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
Clone the Pearl repo and install Task
Run: git clone https://github.com/pearl-research-labs/pearl && cd pearl. Install the go-task task runner with: sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin. Verify with: task --version.
Build blockchain and miner binaries
Run: task build:blockchain followed by task build:miner. The build requires Go 1.21+ and a CUDA 12.x toolchain. If the build fails with a CUDA version error, verify your CUDA installation with nvcc --version. The miner binary will be compiled with SM 9.0 targeting; builds on non-Hopper machines will fail at link time.
Launch pearld and sync with the network
Start the Pearl daemon with: ./pearld --network mainnet --datadir ~/.pearl. Monitor sync progress against explorer.pearlresearch.ai. The initial sync from genesis can take 30-90 minutes depending on chain height at the time of setup.
Generate a Taproot wallet with prlctl
Run: ./prlctl wallet new --taproot. This generates a Taproot address compatible with the PRL network. Back up the mnemonic seed phrase immediately. Optionally, configure an oyster hardware wallet path if you prefer custody separation.
Start the vLLM miner
Launch the miner with: ./pearl-miner --wallet <your-taproot-address> --model meta-llama/Llama-3.3-70B-Instruct --dtype fp8 --gpu-memory-utilization 0.90. The --dtype fp8 flag is required on H100 80GB and recommended on H200 141GB to keep VRAM headroom for the MatMul proof workload. On H200 141GB, BF16 is only viable with a significantly reduced --max-model-len to shrink KV cache allocation and make room for the miner's 2-3 GB overhead, since 70B BF16 weights alone consume roughly 140 GB of the 141 GB available.

Frequently Asked Questions

What GPU does Pearl Research require to run a node?

Pearl Research requires a Hopper-class GPU with SM compute capability 9.0. The official miner supports the H100 80GB (minimum) and H200 141GB (recommended). RTX 4090 is not supported by the official miner because it is Ada Lovelace (SM 8.9), and the MatMul kernel relies on Hopper-specific tensor core instructions unavailable on Ada Lovelace or Ampere.

What is proof of useful work (PoUW) in Pearl Research?

Pearl Research's PoUW mechanism lets operators earn PRL by contributing verified GPU MatMul compute to the network. Each contribution is verified with Plonky2 zkSNARK proofs instead of being spent on hash work, so the compute does useful mathematical work while also securing the chain.

How much VRAM does the Pearl vLLM miner consume?

The Pearl vLLM miner itself uses approximately 2-3 GB of VRAM for the MatMul proof workload. Running it alongside a Llama 3.3 70B checkpoint in FP8 on a single H100 80GB requires careful GPU memory budgeting: 70B FP8 weights occupy roughly 70 GB, leaving about 10 GB, of which the miner takes 2-3 GB and KV cache gets the remaining 7-8 GB. On an H200 141GB with FP8, the headroom is much more comfortable.

Can I run AI inference workloads alongside a Pearl Research node?

Yes. Pearl's own documentation describes a dual-utility path: the same GPU runs PoUW participation via the miner and serves real AI inference via vLLM. The Pearl-certified path uses Llama 3.3 70B. On H200 141GB, the 70B BF16 weights alone occupy roughly 140 GB, leaving almost no room for the miner's 2-3 GB overhead and KV cache. BF16 on H200 is only viable if you reduce --max-model-len significantly to shrink KV cache allocation. For the full dual-utility path on H200, FP8 is the practical choice.

What does it cost per month to run a Pearl Research node on Spheron?

Costs depend on on-demand vs spot pricing and uptime requirements. At on-demand rates from Spheron's live marketplace, H100 80GB instances start at $3.90/hr (~$2,847/mo at 100% uptime). H200 141GB starts at $4.62/hr (~$3,373/mo at 100% uptime). Spot pricing reduces the hourly rate significantly: H100 spot is ~$1.63/hr (~$1,190/mo) and H200 spot is ~$1.92/hr (~$1,402/mo) for operators who can tolerate preemption windows. Check the current pricing at spheron.network/pricing/.

Does Pearl Research work on RTX GPUs?

No. The official Pearl miner requires SM compute capability 9.0 (Hopper). RTX 4090 is SM 8.9 (Ada Lovelace) and RTX 5090 is SM 12.0 (consumer Blackwell). As of the May 2026 release, only Hopper H100 and H200 are officially supported. Blackwell support may come in a future release as the miner codebase evolves.