You only pay for what you use on Replicate. Some models are billed by hardware and time, others by input and output.
Thousands of open-source machine learning models have been contributed by our community and more are added every day. We also host a wide variety of proprietary models.
Most models are billed by the time they take to run. The price per second depends on the hardware the model runs on. When you run or train one of these public models, you pay only for the time it takes to process your request.
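As a rough illustration of time-based billing, the cost of one run is the hardware's price per second multiplied by the run time. The sketch below is not part of any Replicate library: the function name `estimate_run_cost` and the 12-second run time are assumptions for the example, and the rate is the Nvidia L40S price from the hardware table on this page.

```python
from decimal import Decimal


def estimate_run_cost(price_per_second: str, seconds: float) -> Decimal:
    """Estimate the cost in USD of one run billed by time.

    Decimal avoids the rounding noise that binary floats would add
    to very small per-second prices.
    """
    return Decimal(price_per_second) * Decimal(str(seconds))


# Hypothetical example: a 12-second prediction on an Nvidia L40S GPU,
# priced at $0.000975/sec.
cost = estimate_run_cost("0.000975", 12)
print(f"Estimated cost: ${cost:.6f}")
```

Under these assumptions, the run comes to roughly $0.0117. Actual run times vary by model and input, so treat the estimate on the model's page as the better guide.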
Some models are billed by input and output. We've included some examples below.
You'll find estimates for how much any model will cost you on the model's page.
The most intelligent Claude model and the first hybrid reasoning model on the market (claude-3-7-sonnet-20250219)
Faster, better FLUX Pro. Text-to-image model with excellent image quality, prompt adherence, and output diversity.
A 12 billion parameter rectified flow transformer capable of generating images from text descriptions
The fastest image generation model tailored for local development and personal use
A reasoning model trained with reinforcement learning, on par with OpenAI o1
State of the art video generation model. Veo 2 can faithfully follow simple and complex instructions, and convincingly simulates real-world physics as well as a wide range of visual styles.
The highest quality Ideogram v3 model. v3 creates images with stunning realism, creative designs, and consistent styles
Recraft V3 (code-named red_panda) is a text-to-image model with the ability to generate long texts, and images in a wide list of styles. As of today, it is SOTA in image generation, proven by the Text-to-Image Benchmark by Artificial Analysis
Accelerated inference for Wan 2.1 14B image to video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
Accelerated inference for Wan 2.1 14B image to video with high resolution, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation.
You aren't limited to the public models on Replicate: you can deploy your own custom models using Cog, our open-source tool for packaging machine learning models.
Unlike public models, most private models (with the exception of fast booting fine-tunes) run on dedicated hardware, so you don't have to share a queue with anyone else. This means you pay for all the time that instances of the model are online: the time they spend setting up; the time they spend idle, waiting for requests; and the time they spend active, processing your requests. We automatically scale the number of instances up and down to match your traffic.
For fast booting fine-tunes you'll only be billed for the time the model is active and processing your requests, so you won't pay for idle time like with other private models. Fast booting fine-tunes are labeled as such in the model's version list.
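To make the difference concrete, here is a minimal sketch that compares the two billing rules described above. The function name `private_model_cost` and the durations (2 minutes of setup, 10 minutes idle, 5 minutes active) are assumptions made up for the example. The rate is the Nvidia A100 (80GB) price from the hardware table on this page.

```python
from decimal import Decimal


def private_model_cost(price_per_second: str, setup_s: float, idle_s: float,
                       active_s: float, fast_booting: bool = False) -> Decimal:
    """Estimate the cost in USD of a private model over some period.

    Private models on dedicated hardware are billed for setup, idle, and
    active time. Fast booting fine-tunes are billed for active time only.
    """
    if fast_booting:
        billable_seconds = active_s
    else:
        billable_seconds = setup_s + idle_s + active_s
    return Decimal(price_per_second) * Decimal(str(billable_seconds))


# Hypothetical durations, in seconds, on an Nvidia A100 (80GB) at $0.001400/sec.
A100_RATE = "0.001400"
dedicated = private_model_cost(A100_RATE, setup_s=120, idle_s=600, active_s=300)
fine_tune = private_model_cost(A100_RATE, setup_s=120, idle_s=600, active_s=300,
                               fast_booting=True)

print(f"Dedicated private model: ${dedicated:.3f}")
print(f"Fast booting fine-tune:  ${fine_tune:.3f}")
```

With these made-up numbers, the dedicated model is billed for 1,020 seconds ($1.428) and the fast booting fine-tune for 300 seconds ($0.420).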
Hardware | Price | GPU | CPU | GPU RAM | RAM |
---|---|---|---|---|---|
CPU (`cpu`) | $0.000100/sec ($0.36/hr) | - | 4x | - | 8GB |
Nvidia A100 (80GB) GPU (`gpu-a100-large`) | $0.001400/sec ($5.04/hr) | 1x | 10x | 80GB | 144GB |
2x Nvidia A100 (80GB) GPU (`gpu-a100-large-2x`) | $0.002800/sec ($10.08/hr) | 2x | 20x | 160GB | 288GB |
4x Nvidia A100 (80GB) GPU (`gpu-a100-large-4x`) | $0.005600/sec ($20.16/hr) | 4x | 40x | 320GB | 576GB |
8x Nvidia A100 (80GB) GPU (`gpu-a100-large-8x`) | $0.011200/sec ($40.32/hr) | 8x | 80x | 640GB | 960GB |
Nvidia H100 GPU (`gpu-h100`) | $0.001525/sec ($5.49/hr) | 1x | 13x | 80GB | 72GB |
Nvidia L40S GPU (`gpu-l40s`) | $0.000975/sec ($3.51/hr) | 1x | 10x | 48GB | 65GB |
2x Nvidia L40S GPU (`gpu-l40s-2x`) | $0.001950/sec ($7.02/hr) | 2x | 20x | 96GB | 144GB |
Nvidia T4 GPU (`gpu-t4`) | $0.000225/sec ($0.81/hr) | 1x | 4x | 16GB | 16GB |

Additional hardware

Hardware | Price | Notes |
---|---|---|
2x Nvidia H100 GPU (`gpu-h100-2x`) | $0.003050/sec ($10.98/hr) | Additional H100 capacity is reserved for committed spend contracts. |
4x Nvidia H100 GPU (`gpu-h100-4x`) | $0.006100/sec ($21.96/hr) | Additional H100 capacity is reserved for committed spend contracts. |
8x Nvidia H100 GPU (`gpu-h100-8x`) | $0.012200/sec ($43.92/hr) | Additional H100 capacity is reserved for committed spend contracts. |
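The hourly figures in the table are the per-second prices multiplied by 3,600. The sketch below reproduces that arithmetic for the standard hardware rows. The dictionary and the `hourly_price` helper are illustrative names, not part of any Replicate API, and the prices are copied from the table above.

```python
from decimal import Decimal

# Per-second prices in USD, keyed by hardware identifier,
# copied from the hardware table above.
PRICE_PER_SECOND = {
    "cpu": "0.000100",
    "gpu-a100-large": "0.001400",
    "gpu-a100-large-2x": "0.002800",
    "gpu-a100-large-4x": "0.005600",
    "gpu-a100-large-8x": "0.011200",
    "gpu-h100": "0.001525",
    "gpu-l40s": "0.000975",
    "gpu-l40s-2x": "0.001950",
    "gpu-t4": "0.000225",
}

SECONDS_PER_HOUR = 3600


def hourly_price(hardware: str) -> Decimal:
    """Convert a per-second price to an hourly price, rounded to cents."""
    rate = Decimal(PRICE_PER_SECOND[hardware])
    return (rate * SECONDS_PER_HOUR).quantize(Decimal("0.01"))


for name in PRICE_PER_SECOND:
    print(f"{name}: ${hourly_price(name)}/hr")
```

The same multiplication is a quick way to budget for a private model that stays online for a known number of hours.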
For a deeper dive, check out how billing works on Replicate.
If you need more support or have complex requirements, we can offer:
We've also got volume discounts for large amounts of spend. Email us at sales@replicate.com to learn more.