Pricing

You only pay for what you use on Replicate. Some models are billed by hardware and time, others by input and output.

Public models

Thousands of open-source machine learning models have been contributed by our community and more are added every day. We also host a wide variety of proprietary models.

Most models are billed by the time they take to run. The price per second varies with the hardware in use. When you run or train one of these public models, you pay only for the time it takes to process your request.

Some models are billed by input and output. We've included some examples below.

You'll find estimates for how much any model will cost you on the model's page.

anthropic/claude-3.7-sonnet

The most intelligent Claude model and the first hybrid reasoning model on the market (claude-3-7-sonnet-20250219)

$0.015 / thousand output tokens

$3.00 / million input tokens

black-forest-labs/flux-1.1-pro

Faster, better FLUX Pro. Text-to-image model with excellent image quality, prompt adherence, and output diversity.

$0.04 / output image

black-forest-labs/flux-dev

A 12 billion parameter rectified flow transformer capable of generating images from text descriptions

$0.025 / output image

black-forest-labs/flux-schnell

The fastest image generation model tailored for local development and personal use

$3.00 / thousand output images

deepseek-ai/deepseek-r1

A reasoning model trained with reinforcement learning, on par with OpenAI o1

$0.01 / thousand output tokens

$3.75 / million input tokens

google/veo-2

State of the art video generation model. Veo 2 can faithfully follow simple and complex instructions, and convincingly simulates real-world physics as well as a wide range of visual styles.

$0.50 / second of output video

ideogram-ai/ideogram-v3-quality

The highest quality Ideogram v3 model. v3 creates images with stunning realism, creative designs, and consistent styles

$0.09 / output image

recraft-ai/recraft-v3

Recraft V3 (code-named red_panda) is a text-to-image model with the ability to generate long texts, and images in a wide list of styles. As of today, it is SOTA in image generation, proven by the Text-to-Image Benchmark by Artificial Analysis

$0.04 / output image

wavespeedai/wan-2.1-i2v-480p

Accelerated inference for Wan 2.1 14B image to video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.

$0.09 / second of output video

wavespeedai/wan-2.1-i2v-720p

Accelerated inference for Wan 2.1 14B image to video with high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.

$0.25 / second of output video
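As a rough illustration of input/output billing, the sketch below multiplies usage by the example rates listed above. The `RATES` layout, the unit names, and the `estimate_cost` helper are hypothetical and are not part of any Replicate API. The rates are a snapshot of this page and may change, so treat the model's own page as the authoritative estimate.

```python
from decimal import Decimal

# Rates copied from the examples above, normalised to a price per single unit.
# The dictionary layout and unit names are illustrative assumptions.
RATES = {
    "anthropic/claude-3.7-sonnet": {
        "input_token": Decimal("3.00") / 1_000_000,   # $3.00 / million input tokens
        "output_token": Decimal("0.015") / 1_000,     # $0.015 / thousand output tokens
    },
    "black-forest-labs/flux-schnell": {
        "output_image": Decimal("3.00") / 1_000,      # $3.00 / thousand output images
    },
    "google/veo-2": {
        "output_video_second": Decimal("0.50"),       # $0.50 / second of output video
    },
}


def estimate_cost(model: str, **usage: int) -> Decimal:
    """Sum (units used * per-unit rate) over every billed dimension of a model."""
    rates = RATES[model]
    return sum((rates[unit] * count for unit, count in usage.items()), Decimal("0"))


# 2,000 input tokens and 500 output tokens on Claude 3.7 Sonnet
claude_cost = estimate_cost(
    "anthropic/claude-3.7-sonnet", input_token=2_000, output_token=500
)

# 10 images from FLUX schnell, and a 5-second Veo 2 clip
images_cost = estimate_cost("black-forest-labs/flux-schnell", output_image=10)
video_cost = estimate_cost("google/veo-2", output_video_second=5)
```

With these figures, the Claude request comes to $0.0135, the ten images to $0.03, and the video clip to $2.50. `Decimal` is used instead of floats so that very small per-unit prices are not distorted by rounding.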

Private models

You aren't limited to the public models on Replicate: you can deploy your own custom models using Cog, our open-source tool for packaging machine learning models.

Unlike public models, most private models (with the exception of fast booting fine-tunes) run on dedicated hardware, so you don't have to share a queue with anyone else. This means you pay for all the time instances of the model are online: the time they spend setting up; the time they spend idle, waiting for requests; and the time they spend active, processing your requests. As your traffic rises and falls, we automatically scale the number of instances up and down to match demand.

For fast booting fine-tunes, you're billed only for the time the model is active and processing your requests, so unlike other private models you don't pay for idle time. Fast booting fine-tunes are labeled as such in the model's version list.
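The difference between the two billing modes can be sketched as follows. This is a minimal example, not Replicate's billing logic: the `private_model_cost` helper and the timings are hypothetical, and the per-second price is the `gpu-a100-large` rate from the hardware pricing on this page, used for both cases only to make the comparison easy to read.

```python
from decimal import Decimal

# Assumed for illustration: the gpu-a100-large rate from this page.
A100_PRICE_PER_SECOND = Decimal("0.001400")


def private_model_cost(
    price_per_second: Decimal,
    setup_s: int,
    idle_s: int,
    active_s: int,
    fast_booting: bool = False,
) -> Decimal:
    """Cost of one instance of a private model over a period of time.

    Standard private models are billed for setup, idle, and active time.
    Fast booting fine-tunes are billed for active time only.
    """
    billable_seconds = active_s if fast_booting else setup_s + idle_s + active_s
    return price_per_second * billable_seconds


# Hypothetical instance: 2 minutes of setup, 10 minutes idle, 3 minutes active.
dedicated = private_model_cost(A100_PRICE_PER_SECOND, 120, 600, 180)
fast_boot = private_model_cost(A100_PRICE_PER_SECOND, 120, 600, 180, fast_booting=True)
```

Under these assumptions the dedicated instance is billed for 900 seconds ($1.26), while the fast booting fine-tune is billed for the 180 active seconds only ($0.252).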

Hardware pricing

CPU
cpu

$0.000100/sec
$0.36/hr

GPU: -

CPU: 4x

GPU RAM: -

RAM: 8GB

Nvidia A100 (80GB) GPU
gpu-a100-large

$0.001400/sec
$5.04/hr

GPU: 1x

CPU: 10x

GPU RAM: 80GB

RAM: 144GB

2x Nvidia A100 (80GB) GPU
gpu-a100-large-2x

$0.002800/sec
$10.08/hr

GPU: 2x

CPU: 20x

GPU RAM: 160GB

RAM: 288GB

4x Nvidia A100 (80GB) GPU
gpu-a100-large-4x

$0.005600/sec
$20.16/hr

GPU: 4x

CPU: 40x

GPU RAM: 320GB

RAM: 576GB

8x Nvidia A100 (80GB) GPU
gpu-a100-large-8x

$0.011200/sec
$40.32/hr

GPU: 8x

CPU: 80x

GPU RAM: 640GB

RAM: 960GB

Nvidia H100 GPU
gpu-h100

$0.001525/sec
$5.49/hr

GPU: 1x

CPU: 13x

GPU RAM: 80GB

RAM: 72GB

Nvidia L40S GPU
gpu-l40s

$0.000975/sec
$3.51/hr

GPU: 1x

CPU: 10x

GPU RAM: 48GB

RAM: 65GB

2x Nvidia L40S GPU
gpu-l40s-2x

$0.001950/sec
$7.02/hr

GPU: 2x

CPU: 20x

GPU RAM: 96GB

RAM: 144GB

Nvidia T4 GPU
gpu-t4

$0.000225/sec
$0.81/hr

GPU: 1x

CPU: 4x

GPU RAM: 16GB

RAM: 16GB

Additional hardware

2x Nvidia H100 GPU
gpu-h100-2x

$0.003050/sec
$10.98/hr

Additional H100 capacity is reserved for committed spend contracts.

4x Nvidia H100 GPU
gpu-h100-4x

$0.006100/sec
$21.96/hr

Additional H100 capacity is reserved for committed spend contracts.

8x Nvidia H100 GPU
gpu-h100-8x

$0.012200/sec
$43.92/hr

Additional H100 capacity is reserved for committed spend contracts.
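For time-billed models, the cost of a run is the per-second price of the hardware multiplied by the number of seconds billed. The sketch below uses a few of the per-second prices listed above and also shows that each hourly figure is the per-second price times 3,600. The `PRICE_PER_SECOND` table and the helper names are illustrative assumptions, not a Replicate API.

```python
from decimal import Decimal

# Per-second prices copied from the hardware list above, keyed by SKU.
PRICE_PER_SECOND = {
    "cpu": Decimal("0.000100"),
    "gpu-t4": Decimal("0.000225"),
    "gpu-l40s": Decimal("0.000975"),
    "gpu-a100-large": Decimal("0.001400"),
    "gpu-h100": Decimal("0.001525"),
}

SECONDS_PER_HOUR = 3600


def hourly_price(sku: str) -> Decimal:
    """Hourly price derived from the per-second price."""
    return PRICE_PER_SECOND[sku] * SECONDS_PER_HOUR


def run_cost(sku: str, seconds: float) -> Decimal:
    """Cost of a run billed for `seconds` of time on the given hardware."""
    # Convert via str so a float such as 2.5 becomes exactly Decimal("2.5").
    return PRICE_PER_SECOND[sku] * Decimal(str(seconds))


t4_run = run_cost("gpu-t4", 30)        # a 30-second run on a T4
h100_run = run_cost("gpu-h100", 2.5)   # a 2.5-second run on an H100
```

Here the 30-second T4 run costs $0.00675 and the 2.5-second H100 run costs $0.0038125, and `hourly_price("gpu-a100-large")` reproduces the listed $5.04/hr.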


Learn more

For a deeper dive, check out how billing works on Replicate.

Enterprise & volume discounts

If you need more support or have complex requirements, we can offer:

Dedicated account manager
Priority support
Higher GPU limits
Performance SLAs
Help with onboarding, custom models, and optimizations

We also offer volume discounts for high levels of spend. Email us at sales@replicate.com to learn more.