Optimized Inference Engines
Inference engines generate optimized LLM responses for their respective use cases. Each engine has access to a carefully curated model selection and intelligently routes every query to the best-suited LLM for that prompt, maximizing response quality while optimizing for cost and latency.
Supported Engines
- Chat Preview
- Code Preview
Chat Engine
The chat engine is designed for general-purpose chat interactions such as chatbots, support assistants, etc. It intelligently routes each query to one of the models below:
Model selection:
- GPT-4-Turbo
- Claude 3 Sonnet
- Claude 3 Haiku
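To illustrate the routing idea, here is a minimal sketch of how a query might be scored against the candidate models above. The quality and cost figures and the scoring heuristic are purely illustrative assumptions, not the engine's actual routing policy.

```python
# Hypothetical routing step: score each candidate model for a prompt
# and pick the best trade-off of quality and cost. The numbers below
# are illustrative placeholders, not real benchmarks or prices.

CHAT_MODELS = {
    # model: (assumed quality score, assumed relative cost)
    "gpt-4-turbo": (0.95, 10.0),
    "claude-3-sonnet": (0.90, 3.0),
    "claude-3-haiku": (0.80, 0.25),
}

def route(prompt: str, quality_weight: float = 0.7) -> str:
    """Pick the model with the best quality/cost trade-off.

    Toy heuristic: longer prompts are assumed to need more quality.
    """
    needs_quality = quality_weight + (0.2 if len(prompt) > 200 else 0.0)

    def score(item):
        _, (quality, cost) = item
        return needs_quality * quality - (1 - quality_weight) * (cost / 10)

    return max(CHAT_MODELS.items(), key=score)[0]
```

A real engine would learn this routing function from data rather than hand-code it, but the interface is the same: prompt in, model name out.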
Code Engine
The code engine is optimized for coding-related use cases such as code generation, coding copilots, code explanation, etc. It intelligently routes each query to one of the models below:
Model selection:
- GPT-4-Turbo
- Claude 3 Sonnet
- Claude 3 Haiku
Usage
An engine is a collection of LLMs paired with a routing function that identifies the optimal model for each given query. You can treat an engine as a kind of 'meta LLM': it exposes the same interface as a single model, while delegating each prompt to the best-suited one behind the scenes.
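The 'meta LLM' idea can be sketched as a thin wrapper: an engine exposes the same generate interface as a single model, but routes the prompt first. The class name, the stubbed backends, and the toy router below are all illustrative assumptions; a real engine would call the underlying providers' APIs.

```python
# Sketch of an engine as a 'meta LLM': same interface as one model,
# routing + delegation inside. Backends are stubs, not real API calls.
from typing import Callable, Dict

class Engine:
    def __init__(self, models: Dict[str, Callable[[str], str]],
                 router: Callable[[str], str]):
        self.models = models  # model name -> generate function
        self.router = router  # prompt -> model name

    def generate(self, prompt: str) -> str:
        model = self.router(prompt)        # pick the best-suited model
        return self.models[model](prompt)  # delegate like a single LLM

# Stub backends standing in for real provider calls.
backends = {
    "claude-3-haiku": lambda p: f"[haiku] {p}",
    "gpt-4-turbo": lambda p: f"[gpt-4-turbo] {p}",
}

# Toy router: short prompts go to the cheaper model.
engine = Engine(backends,
                lambda p: "claude-3-haiku" if len(p) < 100 else "gpt-4-turbo")
```

From the caller's perspective, `engine.generate(prompt)` behaves exactly like calling one LLM; the model choice is an internal detail.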