Tagging your queries

To create a tag, send a query request where the model is denoted by tag:name-of-your-tag. The query returns a real response, which uses gpt-4-turbo by default.

from openai import OpenAI

client = OpenAI(
    base_url="https://router.neutrinoapp.com/api/engines",
    api_key="<Neutrino-API-key>"
)

response = client.chat.completions.create(
    # Instead of a specific model, denote a tag with the tag: keyword
    model="tag:name-of-tag",  # examples: "tag:coding_agent", "tag:chatbot2"
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant. Your job is to be helpful and respond to user requests."},
        {"role": "user", "content": "What is a Neutrino?"},
    ],
)

print(f"Optimal model: {response.model}")
print(response.choices[0].message.content)
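
Running this prints the model that served the request (for a new tag, the default gpt-4-turbo) followed by the assistant's reply.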

Using the Neutrino Dashboard

You can now go to platform.neutrinoapp.com to monitor queries and run exploration to identify the best LLMs for your tag.

Change response model

Your responses can be generated by a specific model of your choosing or by the Neutrino Intelligent LLM Auto Router. By default, your queries are processed using gpt-4-turbo.
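
If you want to pin a specific model from code rather than through the dashboard, the same endpoint should accept a concrete model identifier in place of the tag: keyword. A minimal sketch, assuming the router passes through standard model names (the identifier below is illustrative):

from openai import OpenAI

client = OpenAI(
    base_url="https://router.neutrinoapp.com/api/engines",
    api_key="<Neutrino-API-key>",
)

response = client.chat.completions.create(
    # A concrete model name instead of a tag: keyword; "gpt-4-turbo" is
    # illustrative and assumes the router accepts standard model identifiers.
    model="gpt-4-turbo",
    messages=[
        {"role": "user", "content": "What is a Neutrino?"},
    ],
)

print(response.choices[0].message.content)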

Exploration

Exploration is triggered automatically once enough diverse queries have been collected, roughly 500.
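
If you want to build up that query bank programmatically, you can replay representative traffic against the tag. A minimal sketch, assuming a hypothetical list of prompts drawn from your own use case:

from openai import OpenAI

client = OpenAI(
    base_url="https://router.neutrinoapp.com/api/engines",
    api_key="<Neutrino-API-key>",
)

# Hypothetical sample prompts; replace with real, diverse queries from your
# application so that exploration reflects actual usage.
prompts = [
    "Summarize the plot of Hamlet in two sentences.",
    "Write a Python function that reverses a linked list.",
    "Explain the difference between TCP and UDP.",
]

for prompt in prompts:
    client.chat.completions.create(
        model="tag:name-of-tag",
        messages=[{"role": "user", "content": prompt}],
    )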

Selecting LLMs to explore

Before exploration is automatically triggered, you can select which LLMs you would like to explore on the Exploration configuration tab.

Customizing evaluation rubric

After responses are generated for all queries in the test bank, a custom evaluation rubric is created. You can edit this rubric to include or change metrics for the LLM-as-a-Judge system.
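
For example, a rubric for a coding tag might score responses on correctness, readability, and adherence to the prompt; these metrics are illustrative, and the generated rubric will reflect your own query mix.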

Starting LLM-as-a-Judge evaluations

You have to trigger the evaluation system manually in the Exploration tab.

Identifying the best LLMs for your use-case

Once the evaluations are done, you will receive an email; the results are available in the Exploration tab.