Model Pricing

Prices are in USD per 1M tokens and cover only the underlying model inference cost. Use of the model router is billed separately, on top of the inference cost.

Model	Identifier	Input price (USD / 1M tokens)	Output price (USD / 1M tokens)
GPT-4o	`gpt-4o`	$5	$15
GPT-4-Turbo	`gpt-4-turbo`	$10	$30
GPT-4	`gpt-4`	$30	$60
GPT-3.5	`gpt-3.5-turbo`	$1	$2
Claude 3 Opus	`claude-3-opus`	$15	$75
Claude 3 Sonnet	`claude-3-sonnet`	$3	$15
Claude 3 Haiku	`claude-3-haiku`	$0.25	$1.25
Claude 2.1	`claude-2.1`	$8	$24
Claude 2	`claude-2`	$8	$24
Claude Instant 1	`claude-instant-1`	$0.80	$2.40
Command R+	`command-r-plus`	$3	$15
Command R	`command-r`	$0.50	$1.50
Command	`command`	$1.50	$2.00
Command Light	`command-light`	$0.30	$0.60
Mistral Large	`mistral-large`	$8	$24
Mistral Small	`mistral-small`	$2	$6
Mixtral 8x22b Instruct	`mixtral-8x22b-instruct`	$0.90	$0.90
Mixtral 8x7b Instruct	`mixtral-8x7b-instruct`	$0.60	$0.60
Mistral 7b Instruct	`mistral-7b-instruct`	$0.20	$0.20
Snowflake Arctic Instruct	`snowflake-arctic-instruct`	$2.40	$2.40
DeepSeek Chat	`deepseek-chat`	$0.14	$0.28
DeepSeek Coder	`deepseek-coder`	$0.14	$0.28
Databricks DBRX Instruct	`dbrx-instruct`	$1	$1
WizardLM 2 8x22b	`wizardlm-2-8x22b`	$1	$1
Llama 3 70b Instruct	`llama-3-70b-instruct`	$0.90	$0.90
Llama 3 8b Instruct	`llama-3-8b-instruct`	$0.20	$0.20
Llama 2 70b Chat HF	`llama-2-70b-chat-hf`	$1	$2
Llama 2 13b Chat HF	`llama-2-13b-chat-hf`	$0.25	$0.25
Llama 2 7b Chat HF	`llama-2-7b-chat-hf`	$0.20	$0.20
CodeLlama 34b Instruct	`codellama-34b-instruct-hf`	$0.90	$0.90
Nous Hermes 2 Yi 34B	`nous-hermes-2-yi-34b`	$0.80	$0.80
DeepSeek Coder 33B Instruct	`deepseek-coder-33b-instruct`	$0.80	$0.80
Phi 3 Mini 4k Instruct	`phi-3-mini-4k-instruct`	$0.30	$0.90
Phi 3 Mini 128k Instruct	`phi-3-mini-128k-instruct`	$0.30	$0.90
Phi 3 Medium 4k Instruct	`phi-3-medium-4k-instruct`	$0.45	$1.35
Phi 3 Medium 128k Instruct	`phi-3-medium-128k-instruct`	$0.50	$1.50
Google Gemma 7b Instruct	`google-gemma-7b-instruct`	$0.20	$0.20
Google Gemma 2b Instruct	`google-gemma-2b-instruct`	$0.10	$0.10
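
Because input and output tokens are priced separately, the inference cost of a request is the input token count times the input price plus the output token count times the output price, divided by 1,000,000. The following is a minimal sketch of that arithmetic, not an official client: the function name `estimate_inference_cost` and the `PRICES_PER_MILLION` dictionary are hypothetical, the prices are copied from three rows of the table above, and any model router fee is deliberately left out.

```python
from decimal import Decimal

# (input, output) prices in USD per 1M tokens, copied from three rows of
# the pricing table. This dictionary is illustrative, not an official API.
PRICES_PER_MILLION = {
    "gpt-4o": (Decimal("5"), Decimal("15")),
    "claude-3-haiku": (Decimal("0.25"), Decimal("1.25")),
    "llama-3-8b-instruct": (Decimal("0.20"), Decimal("0.20")),
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_inference_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated model inference cost in USD.

    Covers only the underlying model cost; any router fee is not included.
    """
    if model not in PRICES_PER_MILLION:
        raise KeyError(f"no price listed for model identifier: {model}")
    input_price, output_price = PRICES_PER_MILLION[model]
    # Decimal keeps small per-request amounts exact instead of
    # accumulating binary floating-point error.
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_MILLION


# Example: 12,000 input tokens and 3,000 output tokens on Claude 3 Haiku.
cost = estimate_inference_cost("claude-3-haiku", 12_000, 3_000)
print(f"${cost}")
```

Using `Decimal` rather than `float` is a design choice for this sketch: per-request costs are often fractions of a cent, and exact decimal arithmetic keeps totals stable when many requests are summed.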
