Inference Engines are designed to deliver optimal LLM inference for their respective use cases. Each engine has access to a carefully curated selection of models and intelligently routes every prompt to the best-suited LLM, maximizing response quality while optimizing for cost and latency.
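
To make the routing idea concrete, here is a minimal, hypothetical sketch of how a prompt router might trade off quality, cost, and latency. The model names, metrics, and the prompt-length difficulty heuristic are illustrative assumptions, not the engine's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str
    quality: float              # expected response quality, 0..1 (higher is better)
    cost_per_1k_tokens: float   # USD per 1K output tokens
    latency_ms: float           # typical time-to-first-token


# Hypothetical curated model pool; real engines maintain their own selection and metrics.
MODEL_POOL = [
    ModelProfile("large-reasoning-model", quality=0.95, cost_per_1k_tokens=0.0150, latency_ms=900),
    ModelProfile("mid-tier-generalist",   quality=0.85, cost_per_1k_tokens=0.0030, latency_ms=400),
    ModelProfile("small-fast-model",      quality=0.70, cost_per_1k_tokens=0.0005, latency_ms=150),
]


def route(prompt: str, quality_weight: float = 0.6,
          cost_weight: float = 0.25, latency_weight: float = 0.15) -> ModelProfile:
    """Pick the model with the best weighted quality/cost/latency trade-off.

    A production router would estimate per-prompt difficulty with a learned
    classifier; here prompt length stands in as a crude proxy.
    """
    difficulty = min(len(prompt) / 2000, 1.0)  # longer prompts favor stronger models
    max_cost = max(m.cost_per_1k_tokens for m in MODEL_POOL)
    max_latency = max(m.latency_ms for m in MODEL_POOL)

    def score(m: ModelProfile) -> float:
        quality_term = m.quality * (0.5 + 0.5 * difficulty)  # quality weighs more on hard prompts
        cost_term = 1.0 - m.cost_per_1k_tokens / max_cost    # cheaper is better
        latency_term = 1.0 - m.latency_ms / max_latency      # faster is better
        return (quality_weight * quality_term
                + cost_weight * cost_term
                + latency_weight * latency_term)

    return max(MODEL_POOL, key=score)


if __name__ == "__main__":
    # A short, easy prompt tends to route to a cheap, fast model...
    print(route("Summarize this paragraph in one sentence.").name)
    # ...while a long, demanding prompt tilts the score toward a stronger model.
    print(route("Prove the following theorem step by step: " + "lemma " * 500).name)
```

The weights illustrate the trade-off knob: raising `quality_weight` biases routing toward the strongest model, while raising `cost_weight` or `latency_weight` favors cheaper, faster ones.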