Prices are in USD per 1M tokens and cover only the underlying model inference cost. Model router pricing is charged in addition to the model inference cost.

| Model | Identifier | Price / 1M Tokens (input) (USD) | Price / 1M Tokens (output) (USD) |
| --- | --- | --- | --- |
| GPT-4o | gpt-4o | $5 | $15 |
| GPT-4-Turbo | gpt-4-turbo | $10 | $30 |
| GPT-4 | gpt-4 | $30 | $60 |
| GPT-3.5 | gpt-3.5-turbo | $1 | $2 |
| Claude 3 Opus | claude-3-opus | $15 | $75 |
| Claude 3 Sonnet | claude-3-sonnet | $3 | $15 |
| Claude 3 Haiku | claude-3-haiku | $0.25 | $1.25 |
| Claude 2.1 | claude-2.1 | $8 | $24 |
| Claude 2 | claude-2 | $8 | $24 |
| Claude Instant 1 | claude-instant-1 | $0.80 | $2.40 |
| Command R+ | command-r-plus | $3 | $15 |
| Command R | command-r | $0.50 | $1.50 |
| Command | command | $1.50 | $2.00 |
| Command Light | command-light | $0.30 | $0.60 |
| Mistral Large | mistral-large | $8 | $24 |
| Mistral Small | mistral-small | $2 | $6 |
| Mixtral 8x22b Instruct | mixtral-8x22b-instruct | $0.90 | $0.90 |
| Mixtral 8x7b Instruct | mixtral-8x7b-instruct | $0.60 | $0.60 |
| Mistral 7b Instruct | mistral-7b-instruct | $0.20 | $0.20 |
| Snowflake Arctic Instruct | snowflake-arctic-instruct | $2.40 | $2.40 |
| DeepSeek Chat | deepseek-chat | $0.14 | $0.28 |
| DeepSeek Coder | deepseek-coder | $0.14 | $0.28 |
| Databricks DBRX Instruct | dbrx-instruct | $1 | $1 |
| WizardLM 2 8x22b | wizardlm-2-8x22b | $1 | $1 |
| Llama 3 70b Instruct | llama-3-70b-instruct | $0.90 | $0.90 |
| Llama 3 8b Instruct | llama-3-8b-instruct | $0.20 | $0.20 |
| Llama 2 70b Chat HF | llama-2-70b-chat-hf | $1 | $2 |
| Llama 2 13b Chat HF | llama-2-13b-chat-hf | $0.25 | $0.25 |
| Llama 2 7b Chat HF | llama-2-7b-chat-hf | $0.2 | $0.2 |
| CodeLlama 34b Instruct | codellama-34b-instruct-hf | $0.9 | $0.9 |
| Nous Hermes 2 Yi 34B | nous-hermes-2-yi-34b | $0.80 | $0.80 |
| Deepseek Coder 33B Instruct | deepseek-coder-33b-instruct | $0.80 | $0.80 |
| Phi 3 Mini 4k Instruct | phi-3-mini-4k-instruct | $0.3 | $0.9 |
| Phi 3 Mini 128k Instruct | phi-3-mini-128k-instruct | $0.3 | $0.9 |
| Phi 3 Medium 4k Instruct | phi-3-medium-4k-instruct | $0.45 | $1.35 |
| Phi 3 Medium 128k Instruct | phi-3-medium-128k-instruct | $0.5 | $1.5 |
| Google Gemma 7b Instruct | google-gemma-7b-instruct | $0.2 | $0.2 |
| Google Gemma 2b Instruct | google-gemma-2b-instruct | $0.1 | $0.1 |
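
As a rough illustration of how the per-1M-token prices translate into the cost of a single request, here is a minimal Python sketch. The dictionary `PRICES_PER_1M` and the function `inference_cost_usd` are hypothetical names introduced for this example and are not part of any official SDK. Only a few rows from the table are copied in, and the model router fee is deliberately left out because its amount is not stated here.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, copied from the table above.
# Hypothetical lookup table for illustration; only a few models are included.
PRICES_PER_1M = {
    "gpt-4o": (Decimal("5"), Decimal("15")),
    "claude-3-haiku": (Decimal("0.25"), Decimal("1.25")),
    "llama-3-70b-instruct": (Decimal("0.90"), Decimal("0.90")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def inference_cost_usd(identifier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the model inference cost in USD, excluding any model router fee."""
    input_price, output_price = PRICES_PER_1M[identifier]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 200k input tokens and 50k output tokens on gpt-4o.
    print(inference_cost_usd("gpt-4o", 200_000, 50_000))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error when summed across many requests.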