Prices are in USD per 1M tokens and cover only the underlying model inference cost. Model router pricing is charged in addition to the inference cost.
| Model | Identifier | Price / 1M Tokens (input) (USD) | Price / 1M Tokens (output) (USD) |
| --- | --- | --- | --- |
| GPT-4o | gpt-4o | $5 | $15 |
| GPT-4-Turbo | gpt-4-turbo | $10 | $30 |
| GPT-4 | gpt-4 | $30 | $60 |
| GPT-3.5 | gpt-3.5-turbo | $1 | $2 |
| Claude 3 Opus | claude-3-opus | $15 | $75 |
| Claude 3 Sonnet | claude-3-sonnet | $3 | $15 |
| Claude 3 Haiku | claude-3-haiku | $0.25 | $1.25 |
| Claude 2.1 | claude-2.1 | $8 | $24 |
| Claude 2 | claude-2 | $8 | $24 |
| Claude Instant 1 | claude-instant-1 | $0.80 | $2.40 |
| Command R+ | command-r-plus | $3 | $15 |
| Command R | command-r | $0.50 | $1.50 |
| Command | command | $1.50 | $2.00 |
| Command Light | command-light | $0.30 | $0.60 |
| Mistral Large | mistral-large | $8 | $24 |
| Mistral Small | mistral-small | $2 | $6 |
| Mixtral 8x22b Instruct | mixtral-8x22b-instruct | $0.90 | $0.90 |
| Mixtral 8x7b Instruct | mixtral-8x7b-instruct | $0.60 | $0.60 |
| Mistral 7b Instruct | mistral-7b-instruct | $0.20 | $0.20 |
| Snowflake Arctic Instruct | snowflake-arctic-instruct | $2.40 | $2.40 |
| DeepSeek Chat | deepseek-chat | $0.14 | $0.28 |
| DeepSeek Coder | deepseek-coder | $0.14 | $0.28 |
| Databricks DBRX Instruct | dbrx-instruct | $1 | $1 |
| WizardLM 2 8x22b | wizardlm-2-8x22b | $1 | $1 |
| Llama 3 70b Instruct | llama-3-70b-instruct | $0.90 | $0.90 |
| Llama 3 8b Instruct | llama-3-8b-instruct | $0.20 | $0.20 |
| Llama 2 70b Chat HF | llama-2-70b-chat-hf | $1 | $2 |
| Llama 2 13b Chat HF | llama-2-13b-chat-hf | $0.25 | $0.25 |
| Llama 2 7b Chat HF | llama-2-7b-chat-hf | $0.2 | $0.2 |
| CodeLlama 34b Instruct | codellama-34b-instruct-hf | $0.9 | $0.9 |
| Nous Hermes 2 Yi 34B | nous-hermes-2-yi-34b | $0.80 | $0.80 |
| Deepseek Coder 33B Instruct | deepseek-coder-33b-instruct | $0.80 | $0.80 |
| Phi 3 Mini 4k Instruct | phi-3-mini-4k-instruct | $0.3 | $0.9 |
| Phi 3 Mini 128k Instruct | phi-3-mini-128k-instruct | $0.3 | $0.9 |
| Phi 3 Medium 4k Instruct | phi-3-medium-4k-instruct | $0.45 | $1.35 |
| Phi 3 Medium 128k Instruct | phi-3-medium-128k-instruct | $0.5 | $1.5 |
| Google Gemma 7b Instruct | google-gemma-7b-instruct | $0.2 | $0.2 |
| Google Gemma 2b Instruct | google-gemma-2b-instruct | $0.1 | $0.1 |
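
Because input and output tokens are priced separately, estimating the cost of a request means applying each rate to its own token count. The following is a minimal sketch of that arithmetic, not an official client: the `inference_cost` helper and the `PRICES_PER_1M` dictionary are hypothetical names, and only three rows from the table are copied in for illustration. Any model router fee is not included, since its rate is not stated here.

```python
from decimal import Decimal

# (input, output) prices in USD per 1M tokens, copied from three rows of the
# pricing table. This dictionary is an illustrative assumption, not an API.
PRICES_PER_1M = {
    "gpt-4o": (Decimal("5"), Decimal("15")),
    "claude-3-haiku": (Decimal("0.25"), Decimal("1.25")),
    "llama-3-70b-instruct": (Decimal("0.90"), Decimal("0.90")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def inference_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: model inference cost in USD.

    Excludes any model router fee, which is billed on top of this amount.
    Raises KeyError if the model identifier is not in PRICES_PER_1M.
    """
    input_price, output_price = PRICES_PER_1M[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 200k input tokens and 50k output tokens on gpt-4o:
    # 5 * 0.2 + 15 * 0.05 = 1.75 USD
    print(inference_cost("gpt-4o", 200_000, 50_000))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding error, which matters when summing costs across many requests.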
I