Selene: Frontier AI evaluation models
Get precise judgments on your AI app’s performance. Run evals with the Selene models, the most accurate LLM Judges on the market.
Run evals with our LLM-as-a-Judge
Need to build trust with customers that your generative AI app is reliable? Judge your AI responses with our evaluation models and receive scores and actionable critiques.
Selene models
Explore the right size and implementation methods for your evaluation needs.
Optimized for speed
Selene 1 Mini
The best evaluation model of its size (8B). Suitable for running evals at inference time.
Industry-leading accuracy
Selene 1
The best model for evaluation on the market. Capable of accurately judging a wide variety of eval tasks, as well as adapting to custom eval criteria. Suitable for pre-production evals.
[Chart: Selene models compared by cost and intelligence]
A new standard for AI evaluations
01
State-of-the-art models
Selene outperforms frontier models on commonly used evaluation benchmarks, making it the most accurate and reliable model for evaluation.
02
Customize to your use case
Make your evals more fine-grained, format your score as you wish, and fit eval criteria to your use case with few-shot examples in our Eval Copilot (beta).
03
Accurate scores, actionable critiques
Designed for straightforward integration into existing workflows. Use our API to generate accurate eval scores with actionable critiques.
Read the blog post
February 26, 2025
Introducing Selene 1: the world’s best LLM-as-a-Judge

Pricing plans
Free
Designed for hobbyists who want to start their project solo
Free credits per month:
1,000 free API calls (Selene)
3,333 free API calls (Selene Mini)
Receive an evaluation score and a critique for each API call
Upgrade any time
Graduate to the next tier by adding your billing details
Key features
API access
Build your own metrics on Eval Copilot
SOC 2 report available upon request
Shared Slack channel
Support SLA
Rate limits
100 requests / minute
Pro
Designed for startups with AI applications in production
After monthly free credits:
$10 / 1K API calls (Selene)
$3 / 1K API calls (Selene Mini)
Receive an evaluation score and a critique for each API call
5x higher rate limits
Monitor model outputs at production scale
Key features
API access
Build your own metrics on Eval Copilot
SOC 2 report available upon request
Shared Slack channel
Support SLA
Rate limits
500 requests / minute
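As a rough illustration of how the usage-based pricing above adds up, the Python sketch below estimates a monthly Pro bill. The prices and free-call allowances come from the plan descriptions; the function name, the assumption that each model has its own free allowance, and pro-rata billing of partial thousands are illustrative guesses, not confirmed billing rules.

```python
# Illustrative estimate only. Figures are taken from the Free and Pro plan text;
# per-model free allowances and pro-rata billing are ASSUMPTIONS.
FREE_CALLS_PER_MONTH = {"selene": 1_000, "selene_mini": 3_333}
PRICE_PER_1K_CALLS_USD = {"selene": 10.0, "selene_mini": 3.0}


def estimate_monthly_cost(model: str, calls: int) -> float:
    """Return an estimated Pro-tier monthly cost in USD for one model."""
    if model not in PRICE_PER_1K_CALLS_USD:
        raise ValueError(f"unknown model: {model!r}")
    if calls < 0:
        raise ValueError("calls must be non-negative")
    # Only calls beyond the monthly free allowance are billed.
    billable_calls = max(0, calls - FREE_CALLS_PER_MONTH[model])
    return round(billable_calls * PRICE_PER_1K_CALLS_USD[model] / 1000, 2)


if __name__ == "__main__":
    for model, calls in [("selene", 51_000), ("selene_mini", 103_333)]:
        print(model, calls, estimate_monthly_cost(model, calls))
```

Under these assumptions, 51,000 Selene calls in a month would come to about $500, since the first 1,000 calls are free.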
Enterprise
Designed for teams with more security, deployment, and support needs
Enterprise-grade security and support
Secure VPC peering, private deployments, dedicated endpoints, and 24/7 priority support
Scalable pricing
Pricing options that scale with your evaluation volume.
Custom rate limits
Key features
API access
Build your own metrics on Eval Copilot
SOC 2 report available upon request
Shared Slack channel
Support SLA
Rate limits
Custom
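To stay under the per-minute rate limits listed for the Free and Pro plans, a client can space its requests evenly. The Python sketch below shows one simple way to do that. The `Pacer` class and plan keys are hypothetical names, and how the service actually enforces its limits (fixed window, sliding window, or bursts) is not stated here, so treat this as a conservative client-side approach rather than a description of the API.

```python
import time

# Limits taken from the plan descriptions (requests per minute).
RATE_LIMITS_PER_MINUTE = {"free": 100, "pro": 500}


def min_interval_seconds(plan: str) -> float:
    """Smallest gap between requests that keeps a steady stream under the limit."""
    return 60.0 / RATE_LIMITS_PER_MINUTE[plan]


class Pacer:
    """Hypothetical helper: call wait() before each request to space calls evenly."""

    def __init__(self, plan: str, clock=time.monotonic, sleep=time.sleep):
        self.interval = min_interval_seconds(plan)
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been sent yet

    def wait(self) -> None:
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            # Too early: sleep until the next permitted send time.
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


if __name__ == "__main__":
    pacer = Pacer("pro")
    for i in range(3):
        pacer.wait()
        print("send request", i)  # replace with the real API call
```

With these figures, the Free plan works out to one request every 0.6 seconds and the Pro plan to one every 0.12 seconds.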
Boost your GenAI accuracy
Run evals with Selene 1 and Selene Mini
Custom eval metric deployment using Eval Copilot (beta)
Free credits & usage-based pricing
Docs & guides